More Models of Infection: It's Epidemic

Previously, we studied a model of an infection's spread through a hospital ward. The ward was small enough that we could track each patient individually, but when population size grows, this kind of model becomes impractical; accordingly, we turn our attention in this paper to models that study the population as a whole. As before, we divide the population into three groups: at day t, I(t) is the infected proportion of the population, whereas S(t) is the proportion that has never been infected. These quantities satisfy 0 /spl les/ I(t) /spl les/ 1 and 0 /spl les/ S(t) /spl les/ 1 for t /spl ges/ 0. We derive the third part, R(t) the proportion of the population that was once infected but has now recovered - from the first two: R(t) = I - I(t) - S(t).


Models without Spatial Variation
In the models we studied before, an individual's probability of becoming infected depended on the status of his or her neighbors. In our next model, we ignore that dependence, which is equivalent to assuming a "well mixed" model: all members of the population have mutual contact. How might we model the three groups in this population? If the infection (or at least its contagious phase) lasts k days, we might assume that the recovery rate is equal to the number infected divided by k. Thus, on average, 1/k of the infected individuals will recover each day.
Let τ be the proportion of encounters between an infected individual and a susceptible one that transmit the infection. The rate of new infections should increase as any of the parameters I, S, or τ increases, so we can model this rate as τI(t)S(t).
Next, we take the limit as the "time step" ∆t goes to zero, obtaining a system of ordinary differential equations (ODEs). This gives us a simple, but interesting, Model 1: (1) We start the model by assuming some proportion of infected individuals-for example, I(0) = 0.005, S(0) = 1 -I(0), and R(0) = 0. Problem 1. Run Model 1 for k = 4 and τ = 0.8 until either I(t) or S(t) drops below 10 -5 . Plot I(t), S(t), and R(t) on a single graph. At the end of the computation, report the proportion of the population that became infected and the maximum difference between I(t) + S(t) + R(t) and 1.
Instead of using the equation dR/dt = I/k, we could have used the conservation principle The model has many limitations, but one of them is that the recovery rate is proportional to the current number of infections. This means that we aren't very faithful to the hypothesis that each individual is infected (and infectious) for k days. One way to model this more closely is to use a delay differential equation (DDE). We modify Model 1 by specifying that the recovery rate at time t is equal to the rate of new infections at time tk. This gives us Model 3: (3) One disadvantage of Model 3 is that we must specify initial conditions not just at t = 0, but also for -k ≤ t ≤ 0; thus we need a lot more information. A second disadvantage is that functions I, S, and R probably will have discontinuous derivatives (for example, at t = 0 and t = k, when we switch between dependence on the initial conditions and dependence only on the integration history). This causes solvers to do extra work at these points of discontinuity.

Models that Include Spatial Variation
Epidemics vary in space as well as time. They usually start in a single location and then spread, based on the infected individuals' interactions with their neighbors. Models 1, 2, and 3 lose this characteristic, so now we let S, I, and R depend on a spatial coordinate (x, y) as well as t and see what such a model predicts.
Because people move in space, we introduce a diffusion term that lets infected individuals affect susceptible individuals close to them in space. Diffusion adds a term δ((d 2 I)/(dx 2 ) + (d 2 I)/(dy 2 ))S to dI/dt and subtracts the same term from dS/dt. This produces differential equations analogous to Model 1: We assume that the initial values I(0, x, y) and S(0, x, y) are given, that we study the problem for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, and t ≥ 0, and that there is no diffusion across the boundaries x = 0, x = 1, y = 0, and y = 1.
To solve this problem, we discretize and approximate the solution at the points of an n × n grid. Let h = 1/(n -1), let x i = ih, i = 0, …, n -1, and let y j = jh, j = 0, …, n -1. Our variables will be our approximations I(t) ij ≈ I(t, x i , y j ) and similarly for S(t) ij and R(t) ij .

Problem 4.
a. Use Taylor series expansions to show that we can approximate We can derive a similar expression for d 2 I(t, x i , y j )/dy 2 . b. Form a vector Î (t) from the approximate values of I(t) by ordering the unknowns as I 00 , I 01 , …, I 0,n-1 , I 10 , I 11 , …, I 1,n-1 , …, I n-1,0 , I n-1,1 , …, I n-1,n-1 . In the same way, form the vectors Ŝ (t) and R (t), and then derive the matrix A so that our discretized equations become Model 4: (4) where the notationÎ(t).* Ŝ (t) means that we form the vector from the product of each component of Î (t) with the corresponding component of Ŝ (t). To form the approximation near the boundary, assume that the (Neumann) boundary conditions imply I(t, -h, y) = I(t, h, y), I(t, 1 + h, y) = I(t, 1h, y) for 0 ≤ y ≤ 1, and similarly for S and R. Make the same type of assumption at the two other boundaries.
We can use this model in two ways. First, suppose we fix the time step ∆t and use Euler's method to approximate the solution. This means we approximate the solution at t + ∆t by the solution at t, plus ∆t times the derivative at t, which gives us an iteration: This model is very much in the spirit of the models we considered in the last issue-except that it's deterministic instead of stochastic.
Alternatively, we could apply a more accurate ODE solver to this model, as we do in the next problem. b. Let's vaccinate the susceptible population at a rate of νS(t, x, y)I(t, x, y)/(I(t, x, y) + S(t, x, y)). This rate is the derivative of the vaccinated population V(t, x, y) with respect to time; we subtract this term from ∂S(t, x, y)/∂t. So now we model four segments of the population: susceptible S(t), infected I(t), recovered R(t), and vaccinated V(t). Your program can track three of these and derive the fourth from the conservation principle S(t) + I(t) + R(t) + V(t) = 1. Run this model with ν = 0.7, and compare the results with those of Model 4.
If you want to experiment further with Model 4, incorporate the delay recovery term in place of .
I n the models we used in the last issue, we incorporated some randomness to account for any factors not explicitly modeled. We also could put randomness into our differential equation models, resulting in stochastic differential equations. (See the "Tools") sidebar for references on this subject.) Acknowledgments I'm grateful to David Gilsinn for explaining delay differential equation models to me.
We have mn patients in a hospital ward, and one of them becomes infected. We track I(t), the proportion of the infected population; S(t), the proportion of the population that never has been infected, and R(t), the remaining proportion. We let τ be the probability of being infected by a sick neighbor.  Answer: Figure 1 shows the simulation results for each of these three models. (The Matlab program that generated the results is at www.computer.org/cise/homework.) Generally, mobility increases the infection rate and vaccination dramatically decreases it. In our sample runs, the infection peaks Tools T he Matlab function ode23s provides a good solver for Problem 1's ordinary differential equations (ODEs). Most ODE software provides a mechanism for stopping the integration when some quantity goes to zero; in ode23s, using the Events property in an option vector accomplishes this. Charles van Loan's book 1 provides a good introduction to the numerical solution of ODEs; more specialized texts cover the reasons for preferring a stiff solver like ode23s for certain types of ODEs. 2 For Problem 2, we can use ODE software, including ode23s, to solve certain differential algebraic equations (DAEs); in Matlab, using the Mass property in the option vector accomplishes this. Model 2 is a very simple DAE; Kathryn Brenan, Steven Campbell, and Linda Petzold's book provides more information on the theory and solution of such problems. 3 Delay differential equations (DDEs) such as those in Problem 3 arise in many applications, including circuit analysis. To learn more, consult a text such as Richard Bellman and Kenneth Cooke's book 4 or Jack Hale and Sjoerd Lunel's book. 5 In Matlab (Release 13), we can solve certain DDEs by using dde23.
Stochastic differential equations are an active research area. Desmond Higham 6 gives a good introduction to computational aspects and supplies references for further investigation.
Model 1 is Kermack and McKendrick's SIR model, first introduced in 1927. Nicholas Britton discusses it in more detail. 7 James Callahan presents the differential equations leading to Model 4,8

Problem 5.
Develop a vaccination strategy that will, on average, limit the epidemic to 20 percent of the population. Do this by using a nonlinear equation solver to solve the problem R(ν) -0.2 = 0, where R(ν) is the mean number of recovered individuals when we use a vaccination rate of ν. For each value of ν the solver presents, you will need to get a reliable estimate of R by running the model multiple times. Use Problem 4's variance estimates to determine how many runs to use, and then justify your choice.
Answer: From Problem 4, we know that a very low vaccination rate (somewhat less than nu = 0.1) is sufficient to dramatically reduce the infection rate. But using a nonlinear equation solver on a noisy function is quite dangerous; it is easily fooled by outliers, and changing the starting guess, you can make it produce almost any value.

Problem 6.
a. Construct the transition matrix A corresponding to this Markov chain: element a i,j is the probability of transitioning to state i from state j.
b. Let e 1 be the column vector with 1 in position 1 and zeroes elsewhere. If we begin in day one in the first state, then vector Ae 1 tells us the probabilities of being in each of the states on day two. Prove this.
c. Similarly, A 2 e 1 gives the probabilities for day three. For efficiency, this should be computed as A(Ae 1 ) rather than as (A 2 )e 1 . Explain why, by doing the operations counts. d. If we compute z = A j e 1 for a large enough j, we will have the (exact) probabilities of being in each state after the epidemic passes. Use this fact to compute the probabilities of having one, two, or three infected individuals, and compare these probabilities with the results of a Monte Carlo experiment as performed in Problem 4 but using three individuals. How many Monte Carlo simulations does it take to get two digits of accuracy in the probabilities? e. In this simple problem, you can determine the three probabilities directly from Figure 3, by determining the probability of a transition from state A to states P, Q, R, and S. Show how to derive these probabilities, giving the same answer as you obtained via the Markov chain computation earlier.

Answer:
a. Figure 3 gives the transition probabilities; the matrix is given in the Matlab code on the Web site.
b. Ae 1 is equal to column 1 of A, which contains the probabilities of transitioning from state 1 to any other state. Generally, if p is a vector of the probabilities of initially being in each of the states, then Ap is the vector of probabilities of being in them at time 1.
c. Computing A(Ae 1 ) costs 2s 2 multiplications, where s is the number of states. Computing (A 2 )e 1 costs s 3 + s 2 multiplications, which grows quite a bit larger when s is large. We should also take advantage of the zeros in A and avoid multiplying by them. If we do this for our matrix, then A has 21 nonzero elements whereas A 2 has 23, so again it takes more multiplications to form (A 2 )e 1 than to form A(Ae 1 ). We also should note that the product Ae 1 is just the first column of A, so we could compute it without multiplications. d. In this experiment, it took 280 Monte Carlo simulations to get two digits of accuracy. Asking for three digits raises the number of trials into the 10,000s because the variance is high relative to threshold.
e. There is only one path to state Q (corresponding to a single infection), and the product of the probabilities of transitions along this path are (1τ) 4 . There are two paths to state S; summing the product of the probabilities along the paths gives (τ(1 τ) 2 + τ(1τ) 3 ). The probability of reaching state P is the same, so the probability of two infections is twice this number. Similarly, the probability of reaching state R, corresponding to three infections, is τ 2 + 2τ 2 (1τ) + (1τ) 2 τ 2 .