Computational model of a vector-mediated epidemic

We discuss a lattice model of vector-mediated transmission of a disease to illustrate how simulations can be applied in epidemiology. The population consists of two species, human hosts and vectors, which contract the disease from one another. Hosts are sedentary, while vectors (mosquitoes) diffuse in space. Examples of such diseases are malaria, dengue fever, and Pierce's disease in vineyards. The model exhibits a phase transition between an absorbing (infection free) phase and an active one as parameters such as infection rates and vector density are varied.


I. INTRODUCTION
The spread of epidemics is an urgent problem in medicine and public health. The threat of an Ebola outbreak, the increasing number of people with AIDS, and the observation that diseases such as malaria and influenza still kill many people worldwide, justifies its importance. In epidemiology-the study of the occurrence, transmission, and control of disease-mathematical models are an important tool for quantifying spreading patterns and understanding the transmission process. In this paper, we discuss an epidemic model in which transmission is mediated by a vector, illustrating the application of ideas from statistical mechanics beyond the context of thermodynamics.
In many regions, epidemics of malaria, dengue fever or yellow fever are recurrent, costing many lives and resources in efforts to treat and possibly eradicate the disease. The spread of these diseases depends on a vector that transmits a parasite to humans, and in some cases, to animals. The vector is an infected female mosquito. The dynamics of the human-vector interaction can be summarized as follows. An infected vector, whose saliva contains the parasite, transmits the latter to a human host through a bite. Once in the host, the parasite passes through several stages, until it migrates to the red blood cells. A vector can become infected upon biting an infected host. Within the vector, the parasite also undergoes several phases and finally migrates to the salivary glands, repeating the cycle.
In some cases, the predictions of mathematical models can guide immunization programs, or influence the choice of techniques to eliminate the disease. 1 The epidemiology of any disease is too complex to be described by a single model, which, according to Ref. 2, "should be used to identify and answer specific questions." The first model of transmission of a vector-borne disease was formulated by Ross in 1911. 3,4 Ross considered the dynamics of two populations, human hosts and vectors, ruled by rates of recovery, birth, death and biting, and described by a pair of differential equations for the densities of susceptible and infected individuals. Ross identified the existence of a threshold vector density, below which the population would be free of the disease in the long-time limit. He concluded that programs to control the disease should concentrate on keeping the vector population below a limiting value.
In this paper, we introduce a lattice model for vectormediated transmission of a disease in a population consisting of two species, human hosts and vectors (mosquitoes), which contract the disease from one another. Hosts are sedentary, while vectors diffuse in space. Our model is based on Ross' work but includes spatial structure and the diffusion of vectors. It is an agent-based stochastic process, in contrast to Ross' model which treats average densities in a deterministic fashion. Thus, our model includes fluctuations, which are ignored in Ross' mean-field approach. We investigate the dynamic behavior of the model using Monte Carlo simulations.
In the following, we discuss the construction of the model, basic critical behavior concepts and absorbing-state phase transitions, special features of the algorithm, and some of our results. the host and the vector. Hosts can be infected only through the bite of infectious vectors, and vectors can become infected only by biting infectious hosts. Hence, a model of a vector-borne disease must include host and vector populations. From now on, we refer to the human host simply as the host and to mosquitoes as vectors.
In simpler models, the sizes of both populations are fixed, comprising N h hosts and N v vectors. Because hosts typically live much longer than vectors, we interpret the invariability of the populations differently. For hosts, it means that birth and death are not important on the time scale of the epidemic. The fixed vector population size follows from the simplifying hypothesis that each vector that dies is immediately replaced by a new, uninfected one.
The arrival of disease-free vectors is represented by the vector replacement rate. We note that even if the average vector population were constant in the region of study, we would expect it to fluctuate about its mean value. In the model, fluctuations of the total number of vectors are ignored; locally, there are fluctuations as vectors hop between lattice sites.
The model is defined on a lattice of N sites. It is a stochastic model, more specifically, a Markov process, defined by its state space and a set of transition rates between different states. At each lattice site there is a host, which can be in one of two states, infected or healthy. (Thus N h ¼ N.) Each site can either be free of vectors or occupied by infected and/or healthy vectors. Thus, the state or configuration of the model is specified by the following set of random variables. At each site j, we define h j ¼ 0 or 1 representing, respectively, a healthy or infected host at that site, and nonnegative integers v 0;j and v 1;j representing the number of healthy and infected vectors, respectively. Because the total number of vectors is fixed, we have P N j¼1 ðv 0;j þ v 1;j Þ ¼ N v . The dynamics of the model is given by the following rules: • Vectors hop to a neighboring site at a rate D. • A healthy vector becomes infected at rate I v if it shares a site with an infected host. • An infected vector is replaced by a healthy one at rate R v . • A healthy host at site j becomes infected at rate v 1;j I h . • An infected host recovers at rate R h .
These rules are readily translated into transition rates. Suppose, for example, that C is a configuration with vector numbers v 0;j and v 1;j at a site j, and C 0 is the configuration with v 0 0;j ¼ v 0;j À 1 and v 0 1;j ¼ v 1;j þ 1; all other variables are identical to those of C. The second rule implies the transition rate WðC ! C 0 Þ ¼ I v h j v 0;j . The other dynamic rules can also be expressed in terms of transition rates. Note that the rate for a vector to hop from site j to a neighbor site k is D=z j , with z j the number of sites that are neighbors of site j. Although we restrict our attention to regular lattices in the present work, the model can be implemented on any lattice or network structure; the network is specified by the list of pairs of neighbors.
Vector diffusion is the only mechanism for spreading the infection in the host population. Similarly, the parasite passes from one vector to another only via an intermediary host. We assume that any infected individual is also infectious, that is, an individual becomes infectious the moment it is infected. This simplification implies that we are not able to include information about incubation periods.
Our model is closely related to one defined by Macnadbay et al. 5 These authors use computer simulations to study the critical behavior of an epidemic in which the vector population is allowed to diffuse on the lattice, infecting a static population upon contact. Thus, in their model, individuals become infected instantaneously. In our model, the infection rate is finite, so that a healthy host may share a site with infected vectors, and vice-versa.

III. BASIC CONCEPTS
In this section, we introduce some of the concepts of nonequilibrium phase transitions. For this purpose, we introduce the contact process, the simplest spatial model exhibiting a phase transition between an active and an absorbing state.
In the contact process, 6-8 a Markov process that can be interpreted as a model for the spread of an epidemic, each site of the lattice represents an individual that may be infected or healthy. The infection spreads by direct contact between infected and healthy individuals, that is, between sites that are neighbors on the lattice. Infected individuals recover at unit rate and are then susceptible to reinfection. A healthy individual at site i becomes infected at rate n i k=z i , where n i is the number of infected neighbors. Because an individual must have at least one infected neighbor to become infected, the state in which all the individuals are healthy is absorbing; that is, the system cannot escape from this configuration.
Persistence of the epidemic is controlled by the infection parameter k. If k is small, extinction at long times is certain; large values of k assure that the infection spreads indefinitely. The boundary between persistence and extinction is marked by a critical point, denoted as k c . The critical parameter k c separates the two possible steady states the system can reach asymptotically, namely a disease-free or absorbing state and a surviving epidemic or active state, in which the stationary density of infected individuals, q, is nonzero. It turns out that k c marks a continuous phase transition. Because q is zero in one phase and is greater than zero in the other, it can be identified as the order parameter of the transition, just as the magnetization is the order parameter for a ferromagnetic-paramagnetic phase transition. At a continuous phase transition the order parameter increases continuously from zero as the infection parameter is increased beyond its critical value. As k approaches its critical value from above, the order parameter approaches zero as a power law, q $ ðk À k c Þ b . The asymptotic scaling of q is characterized by the critical exponent, b. Near the critical point, the system exhibits strong fluctuations, correlated over large times and distances. 9,10 The correlation length n diverges at criticality as n $ jk À k c j À ? ; (1) where ? is the correlation-length exponent. The relaxation time, s, the time it takes for a system to reach the steady state, diverges as where k is the relaxation-time exponent. [11][12][13] Another fundamental aspect of absorbing-state phase transitions is the propagation of activity in space and time, starting from a configuration having a single active site at the origin. 14 (The spread of an epidemic starting from a single infected individual is of great interest in epidemiology.) Here, we focus on P(t), the probability that the system has not entered the absorbing state at time t, and n(t), the mean number of infected individuals. In the subcritical regime, k < k c , P(t) and n(t) decay exponentially. In the supercritical regime, there is a nonzero probability that the activity persists indefinitely, so that PðtÞ ! P 1 > 0 as t ! 1, and nðtÞ / t d in a d-dimensional system. At the critical point, the process dies out with probability one, but the mean lifetime diverges. In the absence of a characteristic time scale, the asymptotic evolution follows power laws: (3) where d and g are additional critical exponents. The powerlaw dependence of P and n on time provides an effective criterion for estimating k c in numerical studies. The independence of the critical exponents on most details of the system reflects universality in critical phenomena. Models with the same set of critical exponents form a universality class. In general, a universality class is determined by global features such as dimensionality, symmetry group of the order parameter, and the range (long or short) of the interactions. Models possessing a continuous transition to an absorbing state generally belong to the same universality class as directed percolation. [15][16][17][18] The presence of a conserved quantity can alter critical behavior. 19,20 Although there is no proof, we expect the vector-borne epidemic model to show qualitative behavior analogous to that of the contact process, because it also exhibits a phase transition between an active and an absorbing state. Because the total vector population is conserved, we might expect the model to exhibit non-directed percolation scaling. It remains an open question to which universality class the vector-borne epidemic model belongs.

IV. SIMULATION ALGORITHMS
We now turn to the elaboration of algorithms for simulating the contact process and the vector-borne epidemic. Both models are Markov processes defined in continuous time (that is, in terms of transition rates). In continuous-time processes elementary events, such as infection and recovery in the contact process, occur one at a time; two such events never occur simultaneously. 21 To construct the algorithm, we need to answer three questions. Given the current configuration of the system: (1) what kind of event will occur next?, (2) where will it occur?, and (3) when will it occur?

A. Contact process
In the contact process each infected site has a rate of unity to recover and a rate of k to attempt infecting a nearestneighbor site. Thus if there are N 1 infected sites, the total transition rate is r ¼ N 1 ð1 þ kÞ, which means that the time s to the next event is an exponentially distributed random variable, pðsÞ ¼ re Àrs . In many cases, we simply replace the random variable s by its mean, 1=r, and advance the time counter by this amount at each step. 22 To decide where the event will take place, we choose one of the N 1 currently infected sites at random, say j, by randomly generating an integer uniformly distributed between 1 and N 1 . To choose the type of event, we note that a given event is recovery with probability p d ¼ 1=ð1 þ kÞ and is an infection attempt with the complementary probability, 1 À p d . Thus, we generate a uniform random number y in the interval [0,1) and recover the chosen site j if y < p d . If y ! p d , we choose one of the z j nearest neighbors of site j and infect this site if it is currently uninfected. (If the chosen neighbor is already infected, no changes are made to the configuration at this step.) Whenever the configuration changes, the list of infected sites must be updated. Despite a small overhead associated with maintaining the list, this algorithm provides a highly efficient method for simulating the contact process.
Problem 1. Explain why it would be incorrect to use the same time increment, independent of N 1 , at each step, in the algorithm we have described.
Problem 2. Explain why it would be incorrect to choose another nearest neighbor of site j if the first choice yields an already infected site.
Problem 3. Write a set of commands to remove an uninfected site from the list of infected sites; the number of operations should be independent of the list size.

B. Vector-borne disease: Continuous-time algorithm
The algorithm used to simulate a vector-borne epidemic model is considerably more complicated than that used for the contact process, because we have two classes of individuals in the process. To choose the next event, we note that the total transition rate R is the sum of the transition rates for five possible events where the terms represent host recovery, vector replacement, vector hopping, vector infection, and host infection, respectively, V 1 is the number of infected vectors and H 1 the number of infected hosts. We refer to these as events of type 1, …, 5, respectively. The probability that the next event is of type k is given by the ratio of its transition rate to the total rate, so that, for example, the probability that the next event is recovery of a host is To choose the next event type, we generate a uniform random number y in the interval ½0; 1Þ, as in the contact process algorithm. If 0 y < p 1 , the next event is of type 1; if p 1 < y < p 1 þ p 2 , it is of type 2, and so on.
To implement this scheme, we need lists of infected hosts, infected vectors, pairs of infected hosts and healthy vectors occupying the same site, and pairs of healthy hosts and infected vectors occupying the same site. The lists need to be updated following each event. Moreover, for diffusion, we require an array storing the current position of each vector.
The algorithm involves the following steps: (1) Initialize the system, defining the states (infected or not) of each host and vector, and distribute the vectors over the lattice, randomly.
(2) Determine which kind of event will occur next.
(3) Choose the entity (for example, infected host) involved in the event from the appropriate list. (4) Following the event, update the lists and (in case of diffusion) vector positions, and increment the time, t ! t þ 1=R. (5) Go to step 2.

C. Vector-borne disease: Discrete-time algorithm
The algorithm we have described for a vector-borne disease involves considerable expenditure for choosing events and maintaining lists. We turn now to a simpler algorithm involving discrete time: in this case the entire system is updated simultaneously in a pass of small but finite duration, Dt. The algorithm employs the variables v 0;j and v 1;j , and a logical variable k j which is true if site j harbors an infected host and is false if not. There is no need to record individual vector positions or maintain lists of the kind used in the continuous-time approach.
Once the initial states of the individuals have been defined and the vectors distributed over the lattice, the evolution proceeds in a series of substeps: Host recovery/infection and vector infection. At each site j, if the host is infected, then the host recovers with probability r h 1 À expðÀR h DtÞ. If there are uninfected vectors at site j (that is, v 0;j > 0), n of them become infected, where n is a binomial random number with for n ¼ 0; 1; 2; …; v 0;j . If the host at site j is not infected, and the site harbors infected vectors, the host becomes infected with probability 1 À expðÀv 1;j I h DtÞ.
Vector replacement. At each site j, if there are infected vectors, n of them are replaced with uninfected ones, where n is a binomial random number with for n ¼ 0; 1; 2; …; v 1;j . Vector hopping. At each site j, if there are infected vectors, n of them hop, where n is a binomial random number with PðnÞ ¼ v 1;j n ½1 À expðÀDDtÞ n exp½Àðv 1;j À nÞDDt for n ¼ 0; 1; 2; …; v 1;j . Choose the new site for each hopping vector from the set of neighbors of site j. Apply the same procedure to the uninfected vectors. The binomial probability distributions associated with vector infection, replacement and hopping are stored in lookup tables. To avoid multiple hopping of the same vector in a single step, all hopping events are generated before any vectors are actually transferred. For each site, let D 1;j be the change in the number of infected vectors at site j due to hopping; at the beginning of the hopping substep, all the D 1;j are set to zero. Suppose, for example, that two infected vectors are to be transferred from site j to j 0 . Then we let D 1;j ! D 1;j À 2 and D 1;j 0 ! D 1;j 0 þ 2. Once all sites have been visited in the hopping substep, we let v 1;j ! v 1;j þ D 1;j for each site j. Naturally, the same procedure is applied to the uninfected vectors.
In this discrete-time simulation, many events can occur in a single sweep of the lattice. We nevertheless want to keep the time increment Dt sufficiently small such that the probability of recovery and subsequent reinfection of the same individual during the same step is small. Thus if R max is the largest of the rates (of infection, recovery/ replacement, or hopping), we need to have R max Dt ( 1. For instance, in the studies discussed in the following R max ¼ I v ¼ 2 and Dt ¼ 0:1. The results of the discrete-and continuous-time simulations agree in the limit Dt ! 0, but letting Dt be very near zero is impractical: enormous cpu time would be spent while almost nothing happens. Because we must use a finite Dt, the value of a critical parameter such as k c will be somewhat different in the discrete-and continuous-time simulations. However, the results for universal properties such as critical exponents should be the same for both methods.
Because the discrete-time scheme is somewhat complicated, we tested it first on the one-dimensional contact process. For Dt ¼ 0:1, we obtained k c ¼ 2:81, compared with the known value of k c ¼ 3:297848 for the continuous-time process. 8 Despite this difference, the discrete-time simulations yield critical exponents that agree with the known values for the contact process.
The methods we have described produce a single realization of the process. A given realization ends when either the process has reached an absorbing configuration, or some maximum time t max , defined by the user, is attained. In spreading simulations, the process must be repeated a large number of times, N rep , to obtain good statistics for P(t) and n(t). A typical strategy is to declare vectors P(j) and n(j) for each integer time j from zero up to t max . In a given realization, these variables are updated whenever the simulation time variable hits (or just surpasses) the next integer value j, so PðjÞ ! PðjÞ þ 1, and nðjÞ ! nðjÞ þ n 1 , where n 1 is the number of infected individuals. (In the vector-borne epidemic model, we require two counters, n h ðjÞ and n v ðjÞ, for hosts and vectors, respectively.) Once all realizations are completed, normalizing these vectors by N rep yields the survival probability and mean infected population size. 23

V. RESULTS
In the following, we provide some illustrative results for the vector-borne epidemic model in one dimension; more definitive conclusions on critical behavior will be given elsewhere. 24 (Although it might seem more realistic to use a twodimensional lattice to map out a city or region in which the epidemic takes place, we can think of the one-dimensional case as representing a population living along a river.) A fundamental piece of information about the process is the survival threshold; that is, the surface in parameter space separating the regime in which the epidemic persists from that in which it dies out. As discussed, this critical surface marks a continuous phase transition. Given the large number of parameters, we do not propose to delineate this surface completely. We shall instead fix most of the parameters, and select one as the control parameter and seek its critical value. By repeating this analysis using different parameter values, we can trace out the critical surface.
For both continuous and discrete-time simulations, we study spreading from a single infected individual. Thus, the initial configuration is characterized by one infected host. We adopt periodic boundary conditions for the lattice.
In the continuous-time simulations, we choose the vector density q v as the control parameter. With the other parameters fixed at R h ¼ 1:0; R v ¼ 0:5; I h ¼ 1:0; I v ¼ 2:0; and D ¼ 0.5, we vary q v to find its critical value. One way of locating the critical point is by the analysis of the curvature of the survival probability, P(t), a quantity furnished by spreading simulations. As an illustration of the method, we show a log-log plot of P(t) as a function of t in Fig. 1 for q v varying from 2.14 to 2.19. These data were obtained using a system size of L ¼ 5000, and a maximum time of t max ¼ 8000; the results represent an average over N rep ¼ 50 000 realizations.
As discussed in Sec. III, the critical point is marked by an asymptotic power-law decay of the survival probability. Thus our best estimate for the critical vector density q v;c is the value that yields a log-log plot of P(t) showing the least curvature. In Fig. 1, the curves for q v ¼ 2:19 and 2.18 clearly veer upward, while those for q v ¼ 2:14 À 2:16 veer downward, giving q v;c ¼ 2:17 as our best estimate. The critical exponent d is (minus) the asymptotic slope of the corresponding curve.
Due to finite-time corrections to scaling, the slope, even for the exact value of q v;c , exhibits some variation at short times. To estimate the asymptotic slope as t ! 1, we compute the slope over blocks of data representing a fixed interval of lnt, and plot the slope versus 1=t. Figure 2 shows that such a local-slope plot is highly sensitive to deviations from criticality, making it easier to decide which curve is nearest criticality. At this level of precision, the critical vector density lies between 2.16 and 2.18, providing q v;c ¼ 2:170 ð5Þ, where the uncertainty estimate reflects the fact that the values 2.16 and 2.18 are excluded. To improve this estimate, we would have to run the simulation for larger values of L, t max , and N rep .
The local-slopes plot allows us to estimate the value of d by extrapolation of the slope to infinite time, yielding d ¼ 0:126 ð5Þ.
We next consider the critical vector density dependence on the diffusion rate, leaving the other parameters fixed. The results are shown in Fig. 3 from which we can conclude that, as expected, diffusion facilitates the spread of the epidemic, so that the larger the value of D, the smaller the threshold density q v;c . The phase boundary data are well fit by a power law of the form q v;c ¼ 1:69D À0:32 ; which relates the critical vector density and the diffusion rate. We performed complementary discrete-time simulations using the host infection rate I h as the control parameter. The other parameter values are fixed at R v ¼ R h ¼ 0:5, I v ¼ 2:0, D ¼ 0.1, and q v ¼ 3. Spreading simulations for a maximum time of 50 000 on lattices of L ¼ 4000 sites yield the critical value I h;c ¼ 0:853ð3Þ.

VI. CONCLUDING REMARKS
We have discussed methods used to simulate the spread of a vector-borne epidemic in a static population. The algorithms are readily adapted to study other nonequilibrium processes. In particular, we have studied a model of malaria transmission based on Ross' model, but including fluctuations, spatial structure, and diffusion of vectors. 3 The model is of interest in the context of statistical mechanics because it exhibits a phase transition.
A simple model can be very useful for understanding the principal features of real-world complex systems or processes. We note that the vast majority of models in the epidemiology literature are deterministic and lack spatial structure. Including stochasticity and spatial dependence may lead to improved predictions of epidemic thresholds.
We discussed continuous-time and discrete-time algorithms, providing different approaches to implementing the model. Although the continuous-time algorithm is more faithful to the original Markov process, the discrete time implementation is more efficient computationally. The results of the two approaches agree qualitatively but differ somewhat in details. Because there is usually significant uncertainty regarding the true values of the parameters (transition rates) used in the model, the differences between predictions furnished by the two simulation strategies may not be important. Therefore qualitative results on, say, how to reduce the likelihood of a large epidemic may be of more utility than precise numerical predictions.
Universal properties such as critical exponents have attracted the interest of physicists who study phase transitions but are of limited interest in epidemiology. The simulation methods described here may nevertheless be useful in this broader, more applied context.
We hope that the methods discussed in this paper inspire interested students to consider studying statistical physics or at least appreciate some of the problems currently studied in this area.

VII. SUGGESTED PROJECTS
(1) Given the set of parameters R h ¼ 1:0; R v ¼ 0:5; I h ¼ 1:0; I v ¼ 2:0, D ¼ 0.5, and q v ¼ 2:17, run the simulation for the one-dimensional vector-borne epidemic model for a lattice size L ¼ 5000, a maximum time of t max ¼ 5000 and N rep ¼ 50 000 realizations. Compare the values obtained for the survival probability P(t) at different times with the reference values shown in Table I. (2) Suppose that due to changes in topography or settlement density, the distance between human host habitations varies in space. We will represent each habitation by a lattice site and represent this situation by a positiondependent diffusion rate, with larger separations corresponding to smaller rates. Adapt the algorithms we have described to the case of a site-dependent diffusion rate D i . To model heavily and sparsely populated regions, for example, D i could take two values, generating clumps of sites with the same value. Determine in which regions the epidemic is more likely to persist, and where the density of infected hosts and vectors is greatest. (3) The algorithms we have described require that a new realization be started each time the absorbing state is reached. A useful alternative, especially in the vicinity of the critical point, is quasistationary simulation, in which the absorbing state is effectively removed from the state space. The method is described in detail in Ref. 25. Here, we simply provide a recipe, which runs as follows. We maintain a collection of N s saved configurations, starting from the initial one (which might have all hosts and vectors infected). In the initial phase of the simulation, we save the current configuration to the list at each time step.
Once N s configurations are saved, we update the list, substituting a configuration on the list with the current configuration, with probability p r at each unit time interval. (The replacement probability p r is determined by the condition that the residence time, N s =p r of a configuration on the list be long compared to the mean lifetime of the process, yet short compared to the duration of the simulation.) The evolution of the process proceeds as before, except when an absorbing configuration is reached. In this case, the latter is exchanged for one of the saved configurations, which, by construction, is active. Following an initial transient, this modified process attains the quasistationary distribution; that is, the probability distribution conditioned on not having visited the absorbing state. Averages over this distribution correspond to asymptotic long-time properties of the active process and may be used to infer the critical properties. Use quasistationary simulations to determine the infected host and vector densities as a function of one of the parameters. Convenient values for the quasistationary scheme are N s ¼ 1000 and p r ¼ 0:001.
ACKNOWLEDGMENT R.D. is grateful to CNPq, Brazil for financial support. a)