The goal is to reduce TB prevalence in the population by proposing a prophylactic treatment of a given duration to people recently exposed to a confirmed TB case. The shorter the proposed duration, the lower the efficacy of the treatment but, also – we hypothesize –, the higher the uptake rate. The question then is whether a higher uptake rate can epidemiologically (over-)compensate at the population level a lower efficacy of the treatment at the individual level. If yes, then we are also interested in knowing what is the proposed duration of the treatment that would lead to the lowest TB prevalence in the population.
The model contains 2 sub-models. The first one is an epidemiological model of TB transmission in the population from which we derive a formula of the prevalence of TB in the population. The second one is a model of prophylactic treatment policy that accounts for
The distribution of the effective treatment duration as a function of the proposed treatment duration is itself derived from assumptions made on how treatment adherence depends on the proposed treatment duration.
The efficacy of the prophylactic treatment policy is then integrated into the formula of the TB prevalence in order to derive the relationship between the proposed duration of the prophylactic treatment and the TB prevalance in the population.
We develop a simple 5-parameters epidemiological model of TB that considers 3 clinical status (Figure 1):
In this model people can transit between \(S\) and \(I\) and between \(I\) and \(D\) in both directions, and from \(D\) to \(I\). Formulas of the prevalences of \(I\) and \(D\) at the epidemiological equilibrium are derived. The model is qualibrated by using information on
Here the sub-model aims at computing the efficacy of the prophylactic treatment policy by accounting for
Treatment uptake: in absence of detailed information of people’s behaviour related to treatment, we phenomenologically model the probability of treatment uptake as a function of the proposed duration of the proposed prophylactic treatment. For that, we use a 3-parameter Hill equation that provides a S-curve with great shape flexibility (Figure 4).
Treatment adherence and effective treatment duration: any person taking the treatment can drop it at any time. We thus model the treatment adherence as a function of the treatment duration in order to compute the effective duration of treatment that, accounting for drop-outs, will be lower than the proposed treatment duration. To do so, we consider that any given day in the treatment the probability of dropping it increases with the number of days since the start of the treatment (reflecting a “fatigue” of taking the treatment) and, at the same time, decreases with the number of days left in the treatment (reflecting the fact that somebody is not likely to give up that close to the “finish line”). (An alternative to this second probability could be to express the time left in the treatment as a proportion of the total duration of the treatment.) Each of these 2 effects is modelled by a 3-parameter Hill equation as described above and, from these two probabilities combined, we derive the probability distribution of the effective treatment duration among the people taking the treatment (Figure 5). This probability distribution depends on the proposed duration of the treatment. From here we can either computer the mean of the distribution for the rest of the analysis or, instead, propagate this distribution through the rest of the analysis.
Treatment efficacy: the last piece of the puzzle consists in modelling the treatment efficacy as a function of its effective duration and here again we make use of a 3-parameter Hill equation to do so (Figure 6).
From the above section we have the efficacy of the prophylactic treatment policy as a function of its proposed duration. Next step consists in integrating this efficacy into the formula of the TB prevalence by converting this efficacy into a rate accounting for the competing risks as defined by the epidemiological model.Once this is done, we can express the prevalence of TB directly as a function of the proposed duration of the proposed prophylactic treatment and then look whether there is a value of the duration of the duration of the prophylactic treatment that minimizes the TB prevalence (Figure 7). As mentioned above, this can be done by considering either the mean effective duration of treatment, or the whole distribution of the effective duration of treatment.
The package used in this analysis:
library(tibble)
library(deSolve)
library(purrr)
library(dplyr)
library(parallel)
Tuning and defining some utilitary functions.
lwd_val <- 2
seq2 <- function(...) seq(..., le = 512)
plot2 <- function(..., col = 4) plot(..., type = "l", lwd = lwd_val, col = col)
plot3 <- function(...) plot2(..., xlab = "time (years)")
plot4 <- function(...) plot2(..., xlab = "actual duration of treatment (days)")
lines2 <- function(..., col = 2) lines(..., lwd = lwd_val, col = col)
legend2 <- function(...) legend(..., lty = 1, lwd = lwd_val, bty = "n")
abline2 <- function(..., col = 2) abline(..., lwd = lwd_val, col = col)
mclapply2 <- function(...)
parallel::mclapply(..., mc.cores = parallel::detectCores() - 1)
A few utilitary functions:
polygon2 <- function(x, y, col = 4, alpha = .2, ...) {
polygon(c(x[1], x, tail(x, 1)), c(0, y, 0), col = adjustcolor(col, alpha),
border = NA)
}
A function that will draw from the probability distribution:
draw <- function(probs, n) {
rep(seq_along(probs), as.vector(rmultinom(1, size = n, prob = probs)))
}
get_val <- function(ind, vect) vect[ind]
Figure 1: flow chart of the epidemiological model in which people can be in 3 clinical status: non-infected and susceptible (\(S\)), infected but neither sick nor infectious (\(I\)) and diseased people (\(D\)).
\[ \frac{dS}{dt} = (1-q)\delta D + (\gamma + \pi(d)) I - \beta\frac{D}{N}S \]
\[ \frac{dI}{dt} = \beta\frac{D}{N}S + q\delta D - (\sigma + \gamma + \pi(d)) I \]
\[ \frac{dD}{dt} = \sigma I - \delta D \]
\[ N = S + I + D \]
\[ D^* = \frac{\beta - \left(1 - q + \frac{\gamma + \pi}{\sigma}\right)\delta} {\left(1 + \frac{\delta}{\sigma}\right)\beta}N \]
\[ I^* = \frac{\delta}{\sigma}D^* \]
\[ S^* = N - I^* - D^* \]
The model in R:
model <- function(S0, I0, D0, beta, sigma, gamma, delta, q, pi, times) {
N <- S0 + I0 + D0
c(S = S0, I = I0, D = D0) |>
ode(times,
function(time, state, pars) {
with(as.list(c(state, pars)), {
infections <- beta * D * S / N
dS <- (1 - q) * delta * D + (gamma + pi) * I - infections
dI <- infections + q * delta * D - (sigma + gamma + pi) * I
dD <- sigma * I - delta * D
list(c(dS, dI, dD))
})},
c(beta = beta, sigma = sigma, gamma = gamma, delta = delta, q = q, pi = pi)) |>
as.data.frame() |>
as_tibble()
}
The equilibrium values in R:
d_star <- function(gamma, sigma, delta, beta, q, pi, N) {
N * (beta - delta * (1 - q + (gamma + pi) / sigma)) /
(beta * (1 + delta / sigma))
}
i_star <- function(gamma, sigma, delta, beta, q, pi, N) {
delta * d_star(gamma, sigma, delta, beta, q, pi, N) / sigma
}
s_star <- function(gamma, sigma, delta, beta, q, pi, N) {
N - i_star(gamma, sigma, delta, beta, q, pi, N) -
d_star(gamma, sigma, delta, beta, q, pi, N)
}
Let’s compare:
S0 <- 1e6 - 10 # ind
I0 <- 0 # ind
D0 <- 10 # ind
beta <- 1 # /year/ind
sigma <- 1.15 # /year
gamma <- .1 # /year
delta <- .1 # /year
q <- .5
pi <- 0 # in absence of prophylaxis
sims <- model(S0, I0, D0, beta, sigma, gamma, delta, q, pi, seq2(0, 50))
N <- S0 + I0 + D0
opar <- par(mfrow = c(1, 3))
with(sims, plot3(time, S, ylab = "susceptibles S"))
abline2(h = s_star(gamma, sigma, delta, beta, q, pi, N))
with(sims, plot3(time, I, ylab = "infected non-diseased I"))
abline2(h = i_star(gamma, sigma, delta, beta, q, pi, N))
with(sims, plot3(time, D, ylab = "diseased D"))
abline2(h = d_star(gamma, sigma, delta, beta, q, pi, N))
par(opar)
Figure 2: numerical verification of the formulas for the equilibrium values of \(S\), \(I\) and \(D\): numerical simulation in blue, values from the formulas in red.
If \(p\%\) of those \(I\) who will become \(D\) within \(E\) years, then the \(\sigma\) rate should read
\[ \sigma = -\frac{\log(1 - p)}{E} \]
If \(x\) (typically \(3\) to \(5\%\)) is the proportion of \(I\) that will ultimately move to \(D\), then, we have
\[ \frac{\sigma}{\sigma + \gamma} = x \]
which leads to this expression for \(\gamma\):
\[ \gamma = \frac{1 - x}{x}\sigma \]
We are left with two parameters to estimate (\(\beta\) and \(\delta\)) that we can estimate from observed values of \(I^*\) (between 10 and 40%) and \(D^*\) (between 0.15 and 0.30%). First \(\delta\):
\[ \delta = \sigma\frac{I^*}{D^*} \]
And \(\beta\):
\[ \beta = \frac{(1 - q)\delta D^* + \gamma I^*}{(N - I^* - D^*)D^*}N \] In R:
parameters_values <- function(
p = .9, E = 2, # p% of people developing disease doing so within E years
x = .04, # proportion of I that will become D (3 to 5%)
I = .25, # prevalence of I (between 10 and 40%)
D = 450 / 100000, # prevalence of D (between 150 and 300 / 100,000)
q = .15) { # proportion of "recovered" D that actually go to I instead of S
sigma <- - log(1 - p) / E
gamma <- sigma * (1 - x) / x
delta <- sigma * I / D
beta <- ((1 - q) * delta * D + gamma * I) / ((1 - I - D) * D)
c(beta = beta, gamma = gamma, sigma = sigma, delta = delta)
}
parameters_values()
## beta gamma sigma delta
## 2132.023234 27.631021 1.151293 63.960697
q <- .15
D0 <- 10 # ind
S0 <- 1e6 - D0 # ind
I0 <- 0 # ind
N <- S0 + I0 + D0
parms <- parameters_values(q = q)
beta <- parms["beta"] # /year/ind
sigma <- parms["sigma"] # /year
gamma <- parms["gamma"] # /year
delta <- parms["delta"] # /year
sims <- model(S0, I0, D0, beta, sigma, gamma, delta, q, 0, seq2(0, 5))
opar <- par(mfrow = c(1, 3))
with(sims, plot3(time, S, ylab = "susceptibles S"))
abline2(h = s_star(gamma, sigma, delta, beta, q, pi, N))
with(sims, plot3(time, I, ylab = "infected non-diseased I"))
abline2(h = i_star(gamma, sigma, delta, beta, q, pi, N))
with(sims, plot3(time, D, ylab = "diseased D"))
abline2(h = d_star(gamma, sigma, delta, beta, q, pi, N))
par(opar)
Figure 3: same as Figure 2 but with realistic values of the parameters, verifying that the prevalences of \(I\) and \(D\) fit the expected values.
In what follows we will model many phenomena phenomenologically using the 3-parameter monotonically increasing Hill equation, the general equation of which is
\[ y = Y\frac{x^h}{X^h + x^h} \] where \(Y\) is the maximum value that \(y\) can take, \(X\) is the value of \(x\) at which \(y\) reaches half of its maximum value \(Y\) and \(h\) is the Hill coefficient that controls the shape of the relationship with S-shape for \(0 < h < 1\) and a simple saturating shape when \(h \ge 1\).
hill <- function(x, X, Y, h) {
x2h <- x^h
Y * x2h / (X^h + x2h)
}
Let’s illustrate the properties of this equation:
xs <- seq2(0, 40)
X <- 15
Y <- .8
plot(NA, xlim = c(0, max(xs)), ylim = c(0, Y),
xlab = "x values", ylab = "y values")
hs <- exp(seq(-3, 3))
abline2(v = X, col = "grey")
abline2(h = Y, col = "grey")
walk2(hs, rev(RColorBrewer::brewer.pal(n = length(hs), "Spectral")),
~ lines2(xs, hill(xs, X, Y, h = .x), col = .y))
We can model the probability of treatment uptake as a function of the treatment duration using a Hill equation as so:
ds <- seq2(0, 30)
plot2(ds, 1 - hill(ds, 15, 1, 4), ylim = 0:1,
xlab = "duration of treatment (days)", ylab = "uptake probability")
Figure 4: modelled probability of treatment uptake as a function of proposed treatment duration.
Let’s assume that the probability \(f\) that somebody stops his/her treatment increases with the number of days s/he’s been taking the treatment according to a Hill equation, reflecting some fatigue effect of being in the treatment. Let’s further assume that this probability is gets mitigated the more we get close to the end of the treatment, reflecting the effect that a person is not likely to drop a treatment anymore when s/he gets close to the end of the treatment. This mitigation effect \(m \in [0,1]\) is modelled by with a Hill equation as well. The probability \(p(t)\) that somebody on his/her day \(t\) of treatment drops it then reads
\[ p(t) = m(t) \times f(t) \]
From this we can express the probability that somebody stops his/her treatment at time \(t\) as
\[ P(t) = p(t)\prod_{x=0}^{{}^{-}t}(1 - p(x))^{dx} \]
The density of probability of the actual duration of treatment (aka effective durations of treatment) can then be expressed by:
\[ \varphi(t, d) = \frac{P(t)}{\int_{0}^d P(x)dx} \]
where \(d\) is the proposed duration of treatment.
Let’s look at some examples:
probabilities <- function(
pd = 40, # propsed duration of treatment
Xf = 25, Yf = .8, hf = 7, # fatigue
Xmp = .875, Ym = 1, hm = 50, # mitigation of fatigue
lwd = 2,
ylim = 0:1,
add = FALSE,
lgnd = TRUE) { # whether to add the legend to the plot or not
Xm <- Xmp * pd
ts <- seq2(0, pd)
fatigue <- hill(ts, Xf, Yf, hf)
mitigation <- 1 - hill(ts, Xm, Ym, hm)
if (! add) plot(ts, fatigue, ylim = ylim, col = 3, type = "l", lwd = lwd,
xlab = "time in treatment (days)",
ylab = "probability or mitigation factor")
lines2(ts, mitigation)
lines2(ts, mitigation * fatigue, col = 4)
if (lgnd & ! add)
legend2("topleft", col = c(3, 2, 4),
legend = c("fatigue", "mitigation of fatigue", "dropping probability"))
}
Executing the function:
probabilities()
Let’s now work out the probability distribution of the effective durations of treatment:
dropping_probability <- function(
pd = 40, # proposed duration of treatment
Xf = 25, Yf = .8, hf = 7, # fatigue
Xmp = .875, Ym = 1, hm = 50, # mitigation of fatigue
by = pd / (le - 1), le = 512) { # sampling day in the treatment
ts <- seq(0, pd, by)
tibble(x = ts,
y = hill(ts, Xf, Yf, hf) * (1 - hill(ts, Xmp * pd, Ym, hm)))
}
effective_duration <- function(
pd = 40, # proposed duration of treatment
Xf = 25, Yf = .8, hf = 7, # fatigue
Xmp = .875, Ym = 1, hm = 50, # mitigation of fatigue
by = pd / (le - 1), le = 512) { # sampling effective durations of treatment
dp <- dropping_probability(pd, Xf, Yf, hf, Xmp, Ym, hm, by, le)
y <- dp$y
dens <- c(y[1], y[-1] * cumprod(1 - y[-length(y)]))
tibble(x = dp$x,
y = dens / sum(dens))
}
Let’s try it:
with(effective_duration(), {
plot4(x, y, ylab = "probability density")
polygon2(x, y)
})
explore_dropping <- function(
Xf = 100,
Yf = .8,
hf = 7,
Xmp = .875,
Ym = 1,
hm = 50,
by = .1,
ylim = c(0, .2),
proposed_durations = c(20, 40, seq(60, 100, 10))) {
proposed_durations <- sort(proposed_durations)
eds <- map(proposed_durations,
~ effective_duration(.x, Xf, Yf, hf, Xmp, Ym, hm, by))
opar <- par(mfrow = 2:1)
probabilities(tail(proposed_durations, 1), Xf, Yf, hf, Xmp, Ym, hm, lwd = 4,
ylim = ylim, lgnd = FALSE)
walk(head(proposed_durations, -1),
~ probabilities(.x, Xf, Yf, hf, Xmp, Ym, hm, add = TRUE))
plot(NA,
xlim = c(0, max(proposed_durations)),
ylim = c(0, max(unlist(map(eds, ~ .x$y)))),
xlab = "effective duration of treatment (days)",
ylab = "probability density")
walk(eds, ~ with(.x, {
lines2(.x$x, .x$y, col = 4)
polygon2(x, y)
}))
abline2(v = proposed_durations, col = "grey")
par(opar)
}
explore_dropping()
explore_dropping(
Xf = 200,
Yf = .15,
hf = 7,
Xmp = .875,
Ym = 1,
hm = 50,
by = .1,
ylim = c(0, .0005),
proposed_durations = c(20, 40, seq(60, 100, 10)))
explore_dropping(
Xf = 150,
Yf = .05,
hf = 7,
Xmp = .875,
Ym = 1,
hm = 50,
by = .1,
ylim = c(0, .001),
proposed_durations = c(20, 40, seq(60, 100, 10)))
We can assume that the treatment efficiency as a function of the actual duration of treatment also follows a Hill equation:
ts <- seq2(0, 30)
plot4(ts, hill(ts, 10, 1, 1), ylim = 0:1, ylab = "treatment efficacy")
Figure 6: modelled treatment efficacy as a function of the effective treatment duration.
By putting everything together, we can express the density of probability of the efficacy of the prophylactic treatment policy as
\[ \chi(d) = \tau \times \upsilon(d) \times \int_0^d \varepsilon(x)\varphi(x, d) dx \]
where
From this, it comes that the rate \(\pi(d)\) can be expressed as
\[ \pi(d) = \frac{\chi(d)}{1 - \chi(d)}(\sigma + \gamma) \]
This expression of \(\pi(d)\) can be fed into the expression of \(D*\) to get the effect of the duration of the proposed prophylactic treatment on the number \(D*\) of people with TB:
\[ D^*(d) = \frac{\beta - (1 - q + (\gamma + \pi(d)) / \sigma)\delta} {(1 + \delta / \sigma)\beta}N \]
duration2incidence <- function(
ed, # effective duration
pd = 40, # proposed treatment duration
tau = .45, # contact tracing efficacy (30-60%)
Xu = 15, Yu = 1, hu = 4, # treatment uptake
Xe = 10, Ye = 1, he = 1, # treatment efficacy
gamma, sigma, delta, beta, # epidemiological parameters
q = .15, # proportion of "recovered" D that actually goes to I instead of S
N = 1e5) { # population size
chi <- tau * (1 - hill(pd, Xu, Yu, hu)) * hill(ed, Xe, Ye, he)
d_star(gamma, sigma, delta, beta, q, chi * (sigma + gamma) / (1 - chi), N)
}
tb_cases <- function(
pd = 40, # proposed treatment duration
tau = .45, # contact tracing efficacy (30 to 60%)
Xu = 15, Yu = 1, hu = 4, # treatment uptake
Xf = 25, Yf = .8, hf = 7, # fatigue effect
Xmp = .875, Ym = 1, hm = 50, # mitigation effect
Xe = 10, Ye = 1, he = 1, # treatment efficacy
p = .9, E = 2, # p% of people developing disease doing so within E years
x = .04, # proportion of I that will become D (3 to 5%)
I = .25, # prevalence of I (between 10 and 40%)
D = 450 / 200000, # prevalence of D (150 to 300 / 100,000)
q = .15, # proportion of "recovered" D that actually goes to I instead of S
N = 1e5, # population size
n = 1e5, # number of draws to sample the distribution of the effective durations
by = pd / (le - 1), le = 1e4) { # integration
parms <- parameters_values(p, E, x, I, D, q)
sigma <- parms["sigma"]
gamma <- parms["gamma"]
ts <- seq(0, pd, by)
drop_prob <- hill(ts, Xf, Yf, hf) * (1 - hill(ts, Xmp * pd, Ym, hm))
draws <- c(drop_prob[1],
drop_prob[-1] * cumprod(1 - drop_prob[-length(drop_prob)])) |>
draw(n) |>
table()
draws |>
names() |>
as.numeric() |>
get_val(ts) |>
map_dbl(~ duration2incidence(.x, pd, tau, Xu, Yu, hu, Xe, Ye, he, gamma,
sigma, parms["delta"], parms["beta"], q, N)) |>
rep(draws) |>
mean()
}
xs <- seq2(.001, 100)
tau_sensitivity <- function(tau_val) {
map_dbl(xs, ~ tb_cases(.x, tau = tau_val))
}
ys <- mclapply2(seq(.3, .6, .03), tau_sensitivity)
plot(NA, xlim = c(0, max(xs)), ylim = c(0, max(unlist(ys))),
xlab = "proposed duration of prophylactic treatment (days)",
ylab = "TB prevalence (/100,000)")
walk2(ys, rev(RColorBrewer::brewer.pal(n = length(ys), "Spectral")),
~ lines2(xs, .x, col = .y))
Figure 7: tuberculosis prevalence as a function of proposed duration of prophylactic treatment.
maxval <- max(unlist(ys))
plot2(seq(.3, .6, .03), (maxval - map_dbl(ys, min)) / maxval)
xs2 <- seq(0, 1, le = 15)
ys2 <- mclapply2(seq(0, 1, le = 15), tau_sensitivity)
maxval <- max(unlist(ys2))
plot2(xs2, (maxval - map_dbl(ys2, min)) / maxval)
plot(NA, xlim = c(0, max(xs)), ylim = c(0, max(unlist(ys2))),
xlab = "proposed duration of prophylactic treatment (days)",
ylab = "TB prevalence (/100,000)")
walk(ys2, ~ lines2(xs, .x))