1 Overview

1.1 Context and question

The goal is to reduce TB prevalence in the population by proposing a prophylactic treatment of a given duration to people recently exposed to a confirmed TB case. The shorter the proposed duration, the lower the efficacy of the treatment but, also – we hypothesize –, the higher the uptake rate. The question then is whether a higher uptake rate can epidemiologically (over-)compensate at the population level a lower efficacy of the treatment at the individual level. If yes, then we are also interested in knowing what is the proposed duration of the treatment that would lead to the lowest TB prevalence in the population.

1.2 Method

The model contains 2 sub-models. The first one is an epidemiological model of TB transmission in the population from which we derive a formula of the prevalence of TB in the population. The second one is a model of prophylactic treatment policy that accounts for

  • contact tracing efficacy;
  • treatment uptake as a function of proposed treatment duration;
  • the effective treatment duration as a function of the proposed treatment duration;
  • treatment efficacy as a function of the effective treatment duration.

The distribution of the effective treatment duration as a function of the proposed treatment duration is itself derived from assumptions made on how treatment adherence depends on the proposed treatment duration.

The efficacy of the prophylactic treatment policy is then integrated into the formula of the TB prevalence in order to derive the relationship between the proposed duration of the prophylactic treatment and the TB prevalance in the population.

1.2.1 Epidemiological sub-model

We develop a simple 5-parameters epidemiological model of TB that considers 3 clinical status (Figure 1):

  • susceptible (\(S\));
  • infected but neither sick nor infectious (\(I\));
  • sick and infectious (\(D\)).

In this model people can transit between \(S\) and \(I\) and between \(I\) and \(D\) in both directions, and from \(D\) to \(I\). Formulas of the prevalences of \(I\) and \(D\) at the epidemiological equilibrium are derived. The model is qualibrated by using information on

  • the prevalence of \(I\);
  • the prevalence of \(D\);
  • the proportion of infected that will ultimately develop the disease;
  • the proportion of people developing the disease that do so within a given time;
  • the proportion of relapse.

1.2.2 Prophylactic treatment sub-model

Here the sub-model aims at computing the efficacy of the prophylactic treatment policy by accounting for

  • contact tracing efficacy;
  • treatment uptake;
  • treatment adherence that determines the effective treatment duration upon uptake;
  • treatment efficacy that depends on the effective treatment duration.

Treatment uptake: in absence of detailed information of people’s behaviour related to treatment, we phenomenologically model the probability of treatment uptake as a function of the proposed duration of the proposed prophylactic treatment. For that, we use a 3-parameter Hill equation that provides a S-curve with great shape flexibility (Figure 4).

Treatment adherence and effective treatment duration: any person taking the treatment can drop it at any time. We thus model the treatment adherence as a function of the treatment duration in order to compute the effective duration of treatment that, accounting for drop-outs, will be lower than the proposed treatment duration. To do so, we consider that any given day in the treatment the probability of dropping it increases with the number of days since the start of the treatment (reflecting a “fatigue” of taking the treatment) and, at the same time, decreases with the number of days left in the treatment (reflecting the fact that somebody is not likely to give up that close to the “finish line”). (An alternative to this second probability could be to express the time left in the treatment as a proportion of the total duration of the treatment.) Each of these 2 effects is modelled by a 3-parameter Hill equation as described above and, from these two probabilities combined, we derive the probability distribution of the effective treatment duration among the people taking the treatment (Figure 5). This probability distribution depends on the proposed duration of the treatment. From here we can either computer the mean of the distribution for the rest of the analysis or, instead, propagate this distribution through the rest of the analysis.

Treatment efficacy: the last piece of the puzzle consists in modelling the treatment efficacy as a function of its effective duration and here again we make use of a 3-parameter Hill equation to do so (Figure 6).

1.2.3 TB prevalence as a function of proposed treatment duration

From the above section we have the efficacy of the prophylactic treatment policy as a function of its proposed duration. Next step consists in integrating this efficacy into the formula of the TB prevalence by converting this efficacy into a rate accounting for the competing risks as defined by the epidemiological model.Once this is done, we can express the prevalence of TB directly as a function of the proposed duration of the proposed prophylactic treatment and then look whether there is a value of the duration of the duration of the prophylactic treatment that minimizes the TB prevalence (Figure 7). As mentioned above, this can be done by considering either the mean effective duration of treatment, or the whole distribution of the effective duration of treatment.

2 Packages

The package used in this analysis:

library(tibble)
library(deSolve)
library(purrr)
library(dplyr)
library(parallel)

3 Functions

Tuning and defining some utilitary functions.

lwd_val <- 2

seq2 <- function(...) seq(..., le = 512)

plot2 <- function(..., col = 4) plot(..., type = "l", lwd = lwd_val, col = col)

plot3 <- function(...) plot2(..., xlab = "time (years)")

plot4 <- function(...) plot2(..., xlab = "actual duration of treatment (days)")

lines2 <- function(..., col = 2) lines(..., lwd = lwd_val, col = col)

legend2 <- function(...) legend(..., lty = 1, lwd = lwd_val, bty = "n")

abline2 <- function(..., col = 2) abline(..., lwd = lwd_val, col = col)

mclapply2 <- function(...)
  parallel::mclapply(..., mc.cores = parallel::detectCores() - 1) 

A few utilitary functions:

polygon2 <- function(x, y, col = 4, alpha = .2, ...) {
  polygon(c(x[1], x, tail(x, 1)), c(0, y, 0), col = adjustcolor(col, alpha),
          border = NA)
}

A function that will draw from the probability distribution:

draw <- function(probs, n) {
  rep(seq_along(probs), as.vector(rmultinom(1, size = n, prob = probs)))
}
get_val <- function(ind, vect) vect[ind]

4 Epidemiological model

4.1 Assumptions

  • constant population size
  • frequency-dependence transmission
  • no mortality
  • no immunity against TB

4.2 Epidemiological framework

Figure 1: flow chart of the epidemiological model in which people can be in 3 clinical status: non-infected and susceptible (\(S\)), infected but neither sick nor infectious (\(I\)) and diseased people (\(D\)).

4.2.1 Parameters

  • \(N\): population size (ind)
  • \(\beta\): per capita infectious contact rate (/year/ind)
  • \(\gamma\): per capita rate of clearance of non-diseased, either naturally or from whatever treatment policy currently in place (/year)
  • \(\delta\): per capita rate of clearance of diseased, either naturally or from whatever treatment policy currently in place (/year)
  • \(\sigma\): per capita rate of developing disease once infected (/year)
  • \(q\): proportion of treated diseased that do not clear the bacillus
  • \(\pi(d)\): per capita rate of clearance of non-diseased, due to prophylactic treatment (/year). Depends the on duration \(d\) of prophylactic treatment (see below)

4.2.2 Epidemiological dynamics:

\[ \frac{dS}{dt} = (1-q)\delta D + (\gamma + \pi(d)) I - \beta\frac{D}{N}S \]

\[ \frac{dI}{dt} = \beta\frac{D}{N}S + q\delta D - (\sigma + \gamma + \pi(d)) I \]

\[ \frac{dD}{dt} = \sigma I - \delta D \]

4.2.3 Constant population size:

\[ N = S + I + D \]

4.2.4 Equilibrium

\[ D^* = \frac{\beta - \left(1 - q + \frac{\gamma + \pi}{\sigma}\right)\delta} {\left(1 + \frac{\delta}{\sigma}\right)\beta}N \]

\[ I^* = \frac{\delta}{\sigma}D^* \]

\[ S^* = N - I^* - D^* \]

4.2.5 Numerical verification

The model in R:

model <- function(S0, I0, D0, beta, sigma, gamma, delta, q, pi, times) {
  N <- S0 + I0 + D0
  c(S = S0, I = I0, D = D0) |> 
    ode(times,
        function(time, state, pars) {
          with(as.list(c(state, pars)), {
            infections <- beta * D * S / N
            dS <- (1 - q) * delta * D + (gamma + pi) * I - infections
            dI <- infections + q * delta * D - (sigma + gamma + pi) * I
            dD <- sigma * I - delta * D
            list(c(dS, dI, dD))
          })},
        c(beta = beta, sigma = sigma, gamma = gamma, delta = delta, q = q, pi = pi)) |>
    as.data.frame() |> 
    as_tibble()
}

The equilibrium values in R:

d_star <- function(gamma, sigma, delta, beta, q, pi, N) {
  N * (beta - delta * (1 - q + (gamma + pi) / sigma)) /
    (beta * (1 + delta / sigma))
}

i_star <- function(gamma, sigma, delta, beta, q, pi, N) {
  delta * d_star(gamma, sigma, delta, beta, q, pi, N) / sigma
}

s_star <- function(gamma, sigma, delta, beta, q, pi, N) {
  N - i_star(gamma, sigma, delta, beta, q, pi, N) -
    d_star(gamma, sigma, delta, beta, q, pi, N)
}

Let’s compare:

S0 <- 1e6 - 10 # ind
I0 <- 0 # ind
D0 <- 10 # ind
beta <- 1 # /year/ind
sigma <- 1.15 # /year
gamma <- .1 # /year
delta <- .1 # /year
q <- .5
pi <- 0 # in absence of prophylaxis

sims <- model(S0, I0, D0, beta, sigma, gamma, delta, q, pi, seq2(0, 50))

N <- S0 + I0 + D0

opar <- par(mfrow = c(1, 3))
with(sims, plot3(time, S, ylab = "susceptibles S"))
abline2(h = s_star(gamma, sigma, delta, beta, q, pi, N))
with(sims, plot3(time, I, ylab = "infected non-diseased I"))
abline2(h = i_star(gamma, sigma, delta, beta, q, pi, N))
with(sims, plot3(time, D, ylab = "diseased D"))
abline2(h = d_star(gamma, sigma, delta, beta, q, pi, N))

par(opar)

Figure 2: numerical verification of the formulas for the equilibrium values of \(S\), \(I\) and \(D\): numerical simulation in blue, values from the formulas in red.

4.2.6 Estimating / assessing parameters

If \(p\%\) of those \(I\) who will become \(D\) within \(E\) years, then the \(\sigma\) rate should read

\[ \sigma = -\frac{\log(1 - p)}{E} \]

If \(x\) (typically \(3\) to \(5\%\)) is the proportion of \(I\) that will ultimately move to \(D\), then, we have

\[ \frac{\sigma}{\sigma + \gamma} = x \]

which leads to this expression for \(\gamma\):

\[ \gamma = \frac{1 - x}{x}\sigma \]

We are left with two parameters to estimate (\(\beta\) and \(\delta\)) that we can estimate from observed values of \(I^*\) (between 10 and 40%) and \(D^*\) (between 0.15 and 0.30%). First \(\delta\):

\[ \delta = \sigma\frac{I^*}{D^*} \]

And \(\beta\):

\[ \beta = \frac{(1 - q)\delta D^* + \gamma I^*}{(N - I^* - D^*)D^*}N \] In R:

parameters_values <- function(
    p = .9, E = 2, # p% of people developing disease doing so within E years
    x = .04, # proportion of I that will become D (3 to 5%)
    I = .25, # prevalence of I (between 10 and 40%)
    D = 450 / 100000, # prevalence of D (between 150 and 300 / 100,000)
    q = .15) { # proportion of "recovered" D that actually go to I instead of S
  
  sigma <- - log(1 - p) / E
  gamma <- sigma * (1 - x) / x
  delta <- sigma * I / D
  beta <- ((1 - q) * delta * D + gamma * I) / ((1 - I - D) * D)
  
  c(beta = beta, gamma = gamma, sigma = sigma, delta = delta)
}
parameters_values()
##        beta       gamma       sigma       delta 
## 2132.023234   27.631021    1.151293   63.960697
q <- .15
D0 <- 10 # ind
S0 <- 1e6 - D0 # ind
I0 <- 0 # ind

N <- S0 + I0 + D0
parms <- parameters_values(q = q)
beta  <- parms["beta"] # /year/ind
sigma <- parms["sigma"] # /year
gamma <- parms["gamma"] # /year
delta <- parms["delta"] # /year

sims <- model(S0, I0, D0, beta, sigma, gamma, delta, q, 0, seq2(0, 5))

opar <- par(mfrow = c(1, 3))
with(sims, plot3(time, S, ylab = "susceptibles S"))
abline2(h = s_star(gamma, sigma, delta, beta, q, pi, N))
with(sims, plot3(time, I, ylab = "infected non-diseased I"))
abline2(h = i_star(gamma, sigma, delta, beta, q, pi, N))
with(sims, plot3(time, D, ylab = "diseased D"))
abline2(h = d_star(gamma, sigma, delta, beta, q, pi, N))

par(opar)

Figure 3: same as Figure 2 but with realistic values of the parameters, verifying that the prevalences of \(I\) and \(D\) fit the expected values.

5 Prophylaxis model

5.1 The Hill equation

In what follows we will model many phenomena phenomenologically using the 3-parameter monotonically increasing Hill equation, the general equation of which is

\[ y = Y\frac{x^h}{X^h + x^h} \] where \(Y\) is the maximum value that \(y\) can take, \(X\) is the value of \(x\) at which \(y\) reaches half of its maximum value \(Y\) and \(h\) is the Hill coefficient that controls the shape of the relationship with S-shape for \(0 < h < 1\) and a simple saturating shape when \(h \ge 1\).

hill <- function(x, X, Y, h) {
  x2h <- x^h
  Y * x2h / (X^h + x2h)
}

Let’s illustrate the properties of this equation:

xs <- seq2(0, 40)
X <- 15
Y <- .8

plot(NA, xlim = c(0, max(xs)), ylim = c(0, Y),
     xlab = "x values", ylab = "y values")
hs <- exp(seq(-3, 3))
abline2(v = X, col = "grey")
abline2(h = Y, col = "grey")
walk2(hs, rev(RColorBrewer::brewer.pal(n = length(hs), "Spectral")),
      ~ lines2(xs, hill(xs, X, Y, h = .x), col = .y))

5.2 Treatment uptake

We can model the probability of treatment uptake as a function of the treatment duration using a Hill equation as so:

ds <- seq2(0, 30)
plot2(ds, 1 - hill(ds, 15, 1, 4), ylim = 0:1,
      xlab = "duration of treatment (days)", ylab = "uptake probability")

Figure 4: modelled probability of treatment uptake as a function of proposed treatment duration.

5.3 Treatment adherence and effective duration of treatment

Let’s assume that the probability \(f\) that somebody stops his/her treatment increases with the number of days s/he’s been taking the treatment according to a Hill equation, reflecting some fatigue effect of being in the treatment. Let’s further assume that this probability is gets mitigated the more we get close to the end of the treatment, reflecting the effect that a person is not likely to drop a treatment anymore when s/he gets close to the end of the treatment. This mitigation effect \(m \in [0,1]\) is modelled by with a Hill equation as well. The probability \(p(t)\) that somebody on his/her day \(t\) of treatment drops it then reads

\[ p(t) = m(t) \times f(t) \]

From this we can express the probability that somebody stops his/her treatment at time \(t\) as

\[ P(t) = p(t)\prod_{x=0}^{{}^{-}t}(1 - p(x))^{dx} \]

The density of probability of the actual duration of treatment (aka effective durations of treatment) can then be expressed by:

\[ \varphi(t, d) = \frac{P(t)}{\int_{0}^d P(x)dx} \]

where \(d\) is the proposed duration of treatment.

Let’s look at some examples:

probabilities <- function(
    pd = 40, # propsed duration of treatment
    Xf = 25, Yf = .8, hf = 7, # fatigue
    Xmp = .875, Ym = 1, hm = 50, # mitigation of fatigue
    lwd = 2,
    ylim = 0:1,
    add = FALSE,
    lgnd = TRUE) { # whether to add the legend to the plot or not
  
  Xm <- Xmp * pd
  ts <- seq2(0, pd)

  fatigue <- hill(ts, Xf, Yf, hf)
  mitigation <- 1 - hill(ts, Xm, Ym, hm)
  
  if (! add) plot(ts, fatigue, ylim = ylim, col = 3, type = "l", lwd = lwd,
                  xlab = "time in treatment (days)",
                  ylab = "probability or mitigation factor")
  lines2(ts, mitigation)
  lines2(ts, mitigation * fatigue, col = 4)
  
  if (lgnd & ! add)
    legend2("topleft", col = c(3, 2, 4),
            legend = c("fatigue", "mitigation of fatigue", "dropping probability"))
}

Executing the function:

probabilities()

Let’s now work out the probability distribution of the effective durations of treatment:

dropping_probability <- function(
    pd = 40, # proposed duration of treatment
    Xf = 25, Yf = .8, hf = 7, # fatigue
    Xmp = .875, Ym = 1, hm = 50, # mitigation of fatigue
    by = pd / (le - 1), le = 512) { # sampling day in the treatment

  ts <- seq(0, pd, by)
  tibble(x = ts,
         y = hill(ts, Xf, Yf, hf) * (1 - hill(ts, Xmp * pd, Ym, hm)))
}
effective_duration <- function(
    pd = 40, # proposed duration of treatment
    Xf = 25, Yf = .8, hf = 7, # fatigue
    Xmp = .875, Ym = 1, hm = 50, # mitigation of fatigue
    by = pd / (le - 1), le = 512) { # sampling effective durations of treatment
  
  dp <- dropping_probability(pd, Xf, Yf, hf, Xmp, Ym, hm, by, le)
  y <- dp$y
  dens <- c(y[1], y[-1] * cumprod(1 - y[-length(y)]))
  tibble(x = dp$x,
         y = dens / sum(dens))
}

Let’s try it:

with(effective_duration(), {
  plot4(x, y, ylab = "probability density")
  polygon2(x, y)
})

explore_dropping <- function(
    Xf = 100,
    Yf = .8,
    hf = 7,
    Xmp = .875,
    Ym = 1,
    hm = 50,
    by = .1,
    ylim = c(0, .2),
    proposed_durations = c(20, 40, seq(60, 100, 10))) {

  proposed_durations <- sort(proposed_durations)

  eds <- map(proposed_durations,
             ~ effective_duration(.x, Xf, Yf, hf, Xmp, Ym, hm, by))
  
  opar <- par(mfrow = 2:1)
  
  probabilities(tail(proposed_durations, 1), Xf, Yf, hf, Xmp, Ym, hm, lwd = 4,
                ylim = ylim, lgnd = FALSE)
  walk(head(proposed_durations, -1),
       ~ probabilities(.x, Xf, Yf, hf, Xmp, Ym, hm, add = TRUE))
  
  plot(NA,
       xlim = c(0, max(proposed_durations)),
       ylim = c(0, max(unlist(map(eds, ~ .x$y)))),
       xlab = "effective duration of treatment (days)",
       ylab = "probability density")
  walk(eds, ~ with(.x, {
    lines2(.x$x, .x$y, col = 4)
    polygon2(x, y)
  }))
  abline2(v = proposed_durations, col = "grey")
  
  par(opar)
}
explore_dropping()

explore_dropping(
  Xf = 200,
  Yf = .15,
  hf = 7,
  Xmp = .875,
  Ym = 1,
  hm = 50,
  by = .1,
  ylim = c(0, .0005),
  proposed_durations = c(20, 40, seq(60, 100, 10)))

explore_dropping(
  Xf = 150,
  Yf = .05,
  hf = 7,
  Xmp = .875,
  Ym = 1,
  hm = 50,
  by = .1,
  ylim = c(0, .001),
  proposed_durations = c(20, 40, seq(60, 100, 10)))

5.4 Treatment efficacy

We can assume that the treatment efficiency as a function of the actual duration of treatment also follows a Hill equation:

ts <- seq2(0, 30)
plot4(ts, hill(ts, 10, 1, 1), ylim = 0:1, ylab = "treatment efficacy")

Figure 6: modelled treatment efficacy as a function of the effective treatment duration.

5.5 Putting everything together

By putting everything together, we can express the density of probability of the efficacy of the prophylactic treatment policy as

\[ \chi(d) = \tau \times \upsilon(d) \times \int_0^d \varepsilon(x)\varphi(x, d) dx \]

where

  • \(\tau\) is the proportion of infected that can be identified through contact tracing;
  • \(\upsilon(d)\) is the treatment uptake as a function of the proposed treatment duration \(d\);
  • \(\varepsilon(x)\) is the treatment efficacy as a function of an effective treatment duration \(x\);
  • \(\varphi(x, d)\) is the density of probability of an effective treatment duration \(x\), given a proposed duration \(d\).

From this, it comes that the rate \(\pi(d)\) can be expressed as

\[ \pi(d) = \frac{\chi(d)}{1 - \chi(d)}(\sigma + \gamma) \]

This expression of \(\pi(d)\) can be fed into the expression of \(D*\) to get the effect of the duration of the proposed prophylactic treatment on the number \(D*\) of people with TB:

\[ D^*(d) = \frac{\beta - (1 - q + (\gamma + \pi(d)) / \sigma)\delta} {(1 + \delta / \sigma)\beta}N \]

duration2incidence <- function(
    ed, # effective duration
    pd = 40, # proposed treatment duration
    tau = .45, # contact tracing efficacy (30-60%)
    Xu = 15, Yu = 1, hu = 4, # treatment uptake
    Xe = 10, Ye = 1, he = 1, # treatment efficacy
    gamma, sigma, delta, beta, # epidemiological parameters
    q = .15, # proportion of "recovered" D that actually goes to I instead of S
    N = 1e5) {  # population size
  
  chi <- tau * (1 - hill(pd, Xu, Yu, hu)) * hill(ed, Xe, Ye, he)
  d_star(gamma, sigma, delta, beta, q, chi * (sigma + gamma) / (1 - chi), N)
}
tb_cases <- function(
    pd = 40, # proposed treatment duration
    tau = .45, # contact tracing efficacy (30 to 60%)
    Xu = 15, Yu = 1, hu = 4, # treatment uptake
    Xf = 25, Yf = .8, hf = 7, # fatigue effect
    Xmp = .875, Ym = 1, hm = 50, # mitigation effect
    Xe = 10, Ye = 1, he = 1, # treatment efficacy
    p = .9, E = 2, # p% of people developing disease doing so within E years
    x = .04, # proportion of I that will become D (3 to 5%)
    I = .25, # prevalence of I (between 10 and 40%)
    D = 450 / 200000, # prevalence of D (150 to 300 / 100,000)
    q = .15, # proportion of "recovered" D that actually goes to I instead of S
    N = 1e5, # population size
    n = 1e5, # number of draws to sample the distribution of the effective durations
    by = pd / (le - 1), le = 1e4) { # integration
  
  parms <- parameters_values(p, E, x, I, D, q)
  sigma <- parms["sigma"]
  gamma <- parms["gamma"]
  
  ts <- seq(0, pd, by)
  drop_prob <- hill(ts, Xf, Yf, hf) * (1 - hill(ts, Xmp * pd, Ym, hm))

  draws <- c(drop_prob[1],
             drop_prob[-1] * cumprod(1 - drop_prob[-length(drop_prob)])) |>
    draw(n) |> 
    table()
  
  draws |> 
    names() |> 
    as.numeric() |> 
    get_val(ts) |> 
    map_dbl(~ duration2incidence(.x, pd, tau, Xu, Yu, hu, Xe, Ye, he, gamma,
                                 sigma, parms["delta"], parms["beta"], q, N)) |> 
    rep(draws) |> 
    mean()
}
xs <- seq2(.001, 100)

tau_sensitivity <- function(tau_val) {
  map_dbl(xs, ~ tb_cases(.x, tau = tau_val))
}
ys <- mclapply2(seq(.3, .6, .03), tau_sensitivity)
plot(NA, xlim = c(0, max(xs)), ylim = c(0, max(unlist(ys))),
     xlab = "proposed duration of prophylactic treatment (days)",
     ylab = "TB prevalence (/100,000)")

walk2(ys, rev(RColorBrewer::brewer.pal(n = length(ys), "Spectral")),
      ~ lines2(xs, .x, col = .y))

Figure 7: tuberculosis prevalence as a function of proposed duration of prophylactic treatment.

maxval <- max(unlist(ys))
plot2(seq(.3, .6, .03), (maxval - map_dbl(ys, min)) / maxval)

xs2 <- seq(0, 1, le = 15)
ys2 <- mclapply2(seq(0, 1, le = 15), tau_sensitivity)
maxval <- max(unlist(ys2))
plot2(xs2, (maxval - map_dbl(ys2, min)) / maxval)

plot(NA, xlim = c(0, max(xs)), ylim = c(0, max(unlist(ys2))),
     xlab = "proposed duration of prophylactic treatment (days)",
     ylab = "TB prevalence (/100,000)")

walk(ys2, ~ lines2(xs, .x))

Bottom