Background

In Lecture 4 we were concerned with models for disease transmission which incorporate households. Although we looked at two distinct cases with regards to what sort of data are available, in this lab session we will focus on the case where there is only final outcome data available. In other words, the available data consist only of the final number of cases in each household.

The main objective of this lab session is to draw Bayesian inference for the model parameters by implementing a Markov Chain Monte Carlo algorithm. Below, there is a brief description of the model (see the lecture notes for more details).

The model, the data and the likelihood

We assume that a population of \(N\) individuals is partitioned into households. Each individual in the population has, independently, a constant ``risk’’ per unit time of becoming infected from the community. In addition, within a household, the disease spreads according to the mechanism of an SIR model.

Suppose that the data consist of the set of numbers \({\bf n} = \{n(j,k)\}\) where $n(j,k) = $ number of households in which \(j\) out of \(k\) initial susceptibles become infected. Therefore, the likelihood takes the form \[\pi({\bf n}|\alpha, p) = \prod_{j}\prod_{k}q(j,k)^{n(j,k)}\] where \(q(j,k)\) is the probability that in a household of size \(k\) the number of individuals who ever become infected is \(j\).

In the lecture we have derived an expression for the probability \(q(j,k)\) and showed that it depends on two parameters:

  • \(p\) which is the probability that an individual avoids infection from outside the household and
  • \(\alpha\) which is the person-to-person infection rate between individuals within the same household.

Preliminaries

Download the file households.R from the module’s website

The file contains two functions:

# this is a function to compute the final size P(T=k|Y=y) = probability
# that k-y susceptibles become infected with $y$ initial infetctives
# and n-y susceptibles

# m initial susceptibles
# a initial infectives
# alpha infection rate
# the length of the fixed infectious period
compute.conditional.prob <- function(m, j, a, alpha, c){
  if (j==0) {
    res <- (exp(-alpha*m*c))^a
  }
  else {

    part.one <- exp(-alpha*(m-j)*c)^(j+a)
    part.two <- choose(m, j)
    
    sum <- 0;
    for (k in 0:(j-1)) {      
      sum <- sum + choose(m-k, j-k)*compute.conditional.prob(m,k,a,alpha,c)/(exp(-alpha*(m-j)*c)^(k+a))
    }
    res <- part.one*(part.two - sum)
  }
  return(res)
}
# We wish to compute P(T = k), for k = 0, ..., n
# "n" is the size of the household
# "alpha" is the person to person infection rate
# "c" is the length of the (fixed) infectious period
compute.marginal.prob <- function(k, n, c, alpha, p) {  
  prob <- 0;  
  for (y in 0:n) {
    if (k >= y) {
      cond.prob <- compute.conditional.prob(n - y, k - y, y, alpha, c)
    }
    else {
      cond.prob <- 0
    }
    prob <- prob + cond.prob*dbinom(y, n, 1 - p)
  }
  prob
}

In addition, Table 1 presents some Asian influenza epidemic household data taken from Longini & Koopman (1982). There is a community which consists of 42 households each of them of size 3.

cases <- c(0,1,2,3)
households <- c(29,9,2,2)
mat <- matrix(NA, nrow=4, 2)
mat[,1] <- cases
mat[,2] <- households
mat <- as.data.frame(mat)
colnames(mat) <- c("No.of.Cases", "No.of.Households")
mat
##   No.of.Cases No.of.Households
## 1           0               29
## 2           1                9
## 3           2                2
## 4           3                2

Exercises

Exercise 1

Go through each of the functions in the file households.R and make sure that you understand what is going on.

Exercise 2

Write a function in R which computes the log-likelihood of the data given the parameters \(p\) and \(\alpha\) assuming that the infectious period is constant and of length one unit.

# this is a function to compute the loglikelihood for any pair of alpha, p for the given data in the hand out;
loglik <- function(alpha, p) {
  
  29*log(compute.marginal.prob(0,3,1,alpha, p)) +  
    9*log(compute.marginal.prob(1,3,1,alpha, p)) +  
    2*log(compute.marginal.prob(2,3,1,alpha, p)) +  
    2*log(compute.marginal.prob(3,3,1,alpha, p)) 

}

Exercise 3

Suppose that we are interested in drawing Bayesian inference for the model parameters and therefore, we could employ an MCMC algorithm to draw samples from the posterior distribution: \[\pi(p, \alpha|{\bf n}) \propto \pi({\bf n}|\alpha, p) \times \pi(p) \times \pi (\alpha).\] Assume that a priori \[p \sim \mbox{Beta}(\mu_p, \nu_p)\] and \[\alpha \sim \mbox{Gamma}(\mu_{\alpha}, {\nu_{\alpha}}).\]

What values should we choose for the hyper-parameters \(\mu_p, \nu_p, \mu_{\alpha}\) and \(\nu_{\alpha}\) such that we end up with fairly uninformative priors for the model parameters \(p\) and \(\alpha\)?

For \(p\) a Uniform(0,1) prior will reflect our ignorance about this parameter a-priori. A U(0,1) is equivalent to a Beta(1,1) prior and hence we choose \(\mu_p = \nu_p = 1\).

The parameter \(\alpha\) represents the person to person infection rate and since it take positive values we may choose an Exponential distribution as prior with a fairly low rate. Therefore, we may choose something like \(\mu_{\alpha} = 1\) and \(\nu_{\alpha} = 0.001\).

Exercise 4

Write a function in R which draws samples from \(\pi(p, \alpha|{\bf n})\) using MCMC and by adopting uninformative priors for the parameters of interest \(p\) and \(\alpha\).

Hint: In the lecture we discussed two different ways to update the parameter \(p\). Either using an independence sampler or a random walk Metropolis. For the purposes of this exercise, update \(p\) by an independence sampler.

# this is an MCMC algorithm for Asian-Data household model
# iter: number of MCMC iteration
# sigma: the standard deviation for the random walk Metropolis algorithm

mcmc.hh <- function(iter, sigma) {


  # values for the prior hyper-parameters
  mu.p <- 1.0
  nu.p <- 1.0

  mu.alpha <- 1.0  
  nu.alpha <- 10^(-3)


  # initial values for the Markov Chain
  alpha.cur <- 0.5;
  p.cur <- 0.8;
  
  # create a matrix to store the output.
  res <- matrix(NA, nrow = iter, ncol=2);
  res[1,] <- c(alpha.cur, p.cur);


  for (i in 2:iter) {

    
    ##############
    # update alpha
    ##############

    # propose new value
    alpha.can <- rnorm(1, alpha.cur, sigma)

    # alpha is strictly positive, therefore any proposed value which is negative is automatically rejected.
    if (alpha.can > 0.0) {

      # otherwise, if it is positive the accept it with some probability:
    
      # compute the log-densities
      log.pi.cur <- loglik(alpha.cur, p.cur) + log(dgamma(alpha.cur, mu.alpha, nu.alpha))
      log.pi.can <- loglik(alpha.can, p.cur) + log(dgamma(alpha.can, mu.alpha, nu.alpha))

      # log-qratio is zero is we are doing a random Walk Metropolis
      log.q.ratio <- 0;

      u <- runif(1)
      if (log(u) < log.pi.can - log.pi.cur) {
        alpha.cur <- alpha.can
      }
    }

    ############
    # propose p
    ############
    p.can <- runif(1, 0, 1)

    
    # compute the log-densities
    log.pi.cur <- loglik(alpha.cur, p.cur) + log(dbeta(p.cur, mu.p, nu.p))
    log.pi.can <- loglik(alpha.cur, p.can) + log(dbeta(p.can, mu.p, nu.p))
    
    # log-qratio is zero, why? Because the density of U(0,1) is 1!
    log.q.ratio <- log(dunif(p.cur, 0, 1)) - log(dunif(p.can, 0, 1))
    
    u <- runif(1)
    if (log(u) < log.pi.can - log.pi.cur) {
      p.cur <- p.can
    }

    # store the output
    res[i,] <- c(alpha.cur, p.cur)

  }
    
  res
                                                                                          
}

We now run our MCMC code for this dataset and look at the trace plots:

out <- mcmc.hh(10000, 0.1)
plot(out[,1], type='l', main = "alpha")

plot(out[,2], type='l', main = "p")

Exercise 5

Investigate how correlated the posterior samples of \(\alpha\) and \(p\) are.

plot(out[,1], out[,2])

They don’t appear to be that correlated.

Exercise 6

Draw samples from the posterior distribution of the probability that a susceptible individual avoids infection from one infected household member.

We saw in the notes that this probability is equal to \(exp(-\alpha)\) and therefore we simply take posteriors samples of \(\alpha\) and transform them:

plot(exp(-out[,1]),type='l')

hist(exp(-out[,1]), main = "exp(-alpha)")

Exercise 7

In the current version of your MCMC algorithm the parameter \(p\) is updated via an independence sampler. Modify your code such that \(p\) is now updated using a random walk Metropolis. Does the mixing of your algorithm improve?

# this is an MCMC algorithm for Asian-Data household model
# iter: number of MCMC iteration
# sigma: the standard deviation for the random walk Metropolis algorithm

mcmc.hh.rw <- function(iter, sigma, sigma.p) {


  # values for the prior hyper-parameters
  mu.p <- 1.0
  nu.p <- 1.0

  mu.alpha <- 1.0  
  nu.alpha <- 10^(-3)


  # initial values for the Markov Chain
  alpha.cur <- 0.5;
  p.cur <- 0.8;
  
  # create a matrix to store the output.
  res <- matrix(NA, nrow = iter, ncol=2);
  res[1,] <- c(alpha.cur, p.cur);


  for (i in 2:iter) {

    
    ##############
    # update alpha
    ##############

    # propose new value
    alpha.can <- rnorm(1, alpha.cur, sigma)

    # alpha is strictly positive, therefore any proposed value which is negative is automatically rejected.
    if (alpha.can > 0.0) {

      # otherwise, if it is positive the accept it with some probability:
    
      # compute the log-densities
      log.pi.cur <- loglik(alpha.cur, p.cur) + log(dgamma(alpha.cur, mu.alpha, nu.alpha))
      log.pi.can <- loglik(alpha.can, p.cur) + log(dgamma(alpha.can, mu.alpha, nu.alpha))

      # log-qratio is zero is we are doing a random Walk Metropolis
      log.q.ratio <- 0;

      u <- runif(1)
      if (log(u) < log.pi.can - log.pi.cur) {
        alpha.cur <- alpha.can
      }
    }

    ############
    # propose p
    ############
    p.can <- rnorm(1, p.cur, sigma.p)
    
    if (p.can > 0 && p.can < 1 ) {

    
      # compute the log-densities
      log.pi.cur <- loglik(alpha.cur, p.cur) + log(dbeta(p.cur, mu.p, nu.p))
      log.pi.can <- loglik(alpha.cur, p.can) + log(dbeta(p.can, mu.p, nu.p))
    
      # log-qratio is zero because of the random-walk Metropolis step
      log.q.ratio <- 0
    
      u <- runif(1)
      if (log(u) < log.pi.can - log.pi.cur) {
        p.cur <- p.can
      }
    }
  
    # store the output
    res[i,] <- c(alpha.cur, p.cur)
  }
    
  res
                                                                                          
}
out.rw <- mcmc.hh.rw(10000, 0.1, 0.1)
plot(out.rw[,1], type='l', main = "alpha")

plot(out.rw[,2], type='l', main = "p")

Of course the mixing will depend on the choice of the standard deviations from the proposal distributions, but for the choices above we can compare the mixing of the two algorithms by looking at the ACFs:

par(mfrow=c(2,2))
acf(out[,1], lag.max = 30, main = "alpha")
acf(out[,2], lag.max = 30, main = "p (independence)")
acf(out.rw[,1], lag.max = 30, main = "alpha")
acf(out.rw[,2], lag.max = 30, main = "p (random walk)")

It appears that the mixing of \(p\) is better than the independence sampler (for the choices above at least).