Evaluation of vaccination strategies for SIR epidemics on random networks incorporating household structure

This paper is concerned with the analysis of vaccination strategies in a stochastic susceptible \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\rightarrow $$\end{document}→ infected \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\rightarrow $$\end{document}→ removed model for the spread of an epidemic amongst a population of individuals with a random network of social contacts that is also partitioned into households. Under various vaccine action models, we consider both household-based vaccination schemes, in which the way in which individuals are chosen for vaccination depends on the size of the households in which they reside, and acquaintance vaccination, which targets individuals of high degree in the social network. For both types of vaccination scheme, assuming a large population with few initial infectives, we derive a threshold parameter which determines whether or not a large outbreak can occur and also the probability of a large outbreak and the fraction of the population infected by a large outbreak. The performance of these schemes is studied numerically, focusing on the influence of the household size distribution and the degree distribution of the social network. We find that acquaintance vaccination can significantly outperform the best household-based scheme if the degree distribution of the social network is heavy-tailed. For household-based schemes, when the vaccine coverage is insufficient to prevent a major outbreak and the vaccine is imperfect, we find situations in which both the probability and size of a major outbreak under the scheme which minimises the threshold parameter are larger than in the scheme which maximises the threshold parameter.


Introduction and description of results
Mathematical models for the spread of infectious disease have much to offer in terms of understanding past outbreaks, predicting likely behaviours of future outbreaks and predicting the effect of interventions or mitigating strategies. In the last decade or two there has been considerable interest and work on network epidemic models. These involve supplanting the traditional assumption of homogeneous mixing of homogeneous individuals with some random graph structure, with specific interest in being able to control the degree distribution, reflecting the varying numbers of people with which different individuals tend to interact. Other structures, for example including households and stratification of populations, have been studied for longer (Bartoszyński 1972;Ball et al. 1997;Watson 1972;Scalia-Tomba 1986); but structures with a 'social network' type of interpretation start around the turn of the millenium with the works of Andersson (1997Andersson ( , 1998, Diekmann et al. (1998) and Newman (2002). In this and most other papers in the field we typically have in mind an infection spreading through a human population. However, much the same ideas apply to mathematical models of a variety of other motivating applications, such as the spread of rumours or information through human populations, infection or information spread through a population of other animals and virus spread through a network of computers.
In this paper we build on the model of Ball et al. (2009Ball et al. ( , 2010 which includes household and network structure to include vaccination, with some emphasis on socalled acquaintance vaccination (Cohen et al. 2003;Britton et al. 2007) as elucidated in Ball and Sirl (2013) in a model without household structure. In the model of Ball et al. (2010), a population of fixed size is given social network structure via the configuration model random graph (see e.g. Bollobás 1980;Newman et al. 2001;Newman 2002) and the population is also partitioned into households (see e.g. Ball et al. 1997). A stochastic Susceptible → Infective → Removed (SIR) epidemic model is then defined on this population structure. A first quantity of interest in this model is the final size, which is the (random) number of initial susceptibles that are infected at some point during the epidemic. In line with much of modern stochastic epidemic theory, one can use branching process approximations to prove a threshold theorem (valid in the large population limit) which determines whether the infection will necessarily die out relatively quickly, resulting in a small final size, or whether it is possible for the epidemic to take off and infect a substantial fraction of the population. These methods also yield approximations for the probability that a supercritical epidemic will take off and using closely related methods one can also study final size properties of such a large outbreak.
This paper provides tools for studying the effect of introducing vaccination into this model. Households-based vaccination schemes are those that can be described in terms of the distribution of the number of vaccinated individuals in households of size n, for every household size n in the population. This includes as special cases the situation when we vaccinate individuals who are chosen uniformly at random from the population (the distributions are binomial) and vaccinating households at random (the distributions are concentrated at 0 and n). Optimal schemes in this context often resemble the equalising strategy (Ball et al. 1997, Sect. 5.2) where one vaccinates preferentially in larger households, there being more of a herd immunity effect available to exploit in those larger households on top of the direct protection of vaccinated individuals. Acquaintance vaccination schemes exploit the heterogeneity of individuals' connectivities in a network to preferentially target better-connected individuals for vaccination. Instead of vaccinating individuals chosen from the population in some way, one samples individuals and then vaccinates their friends-their acquaintances in the network. This exploits the so-called friendship paradox: the observation that, for most people, their friends on average have more friends than they do (Feld 1991).
Our main theoretical results are the calculation of asymptotic final size quantities, via appropriate branching process approximations, when we include vaccination in Ball et al.'s (2010) household-network model. This extends Becker and Starczak's (1997) and Lyne's (2002, 2006) results on the standard households model to have network-based (rather than homogeneous mixing) casual contacts. It also extends Ball and Sirl's (2013) results on acquaintance vaccination in a population with network (but not household) structure. We also explore the model numerically and find that there can be substantial differences in the performance of the different vaccine allocation strategies.
Results like those in this paper can inform about which mechanism of spreading (household or network) can be most beneficially targeted for reducing the impact of outbreaks of disease. This seems likely to be relevant when considering, for example, how best to reduce the impact of any future Ebola outbreak in West Africa (Adams 2016) using vaccines that are currently under development. Other possible applications include pandemic influenza (Halloran et al. 2008) and smallpox (Halloran et al. 2002). For all of these diseases a salient feature is that they have markedly enhanced transmission within small social groups such as households. In each of these possible applications some further refinements of the underlying model may be valuable.
The remainder of the paper is structured as follows. In Sect. 2 we specify our models for the population structure and evolution of the epidemic, then outline the analysis of the final outcome of the epidemic and lastly introduce the model we use for the action of a vaccine on individuals who receive it. In Sect. 3 we consider the effect of householdsbased vaccination, including optimal households-based strategies, and in Sect. 4 we consider the effect of acquaintance vaccination. The analysis in these sections is of the same final outcome measures as in the basic model in Sect. 2. Some exploration of the behaviour of the model (mainly numerical) is presented in Sect. 5. Lastly we offer some concluding remarks in Sect. 6. Details of many of the calculations relating to Sect. 4 are given in Appendix A, while in Appendix B some known exact results for the final outcome and susceptibility sets of multitype stochastic SIR epidemics (households in our setting) are stated and applied to the asymptotic analyses of Sects. 2, 3 and 4.

Model, threshold behaviour and vaccination 2.1 Model
The model we consider is one for the spread of an SIR epidemic on a finite random network incorporating household structure (Ball et al. 2010). We assume that the population consists of N individuals and is partitioned into m households, of which m n are of size n (n = 1, 2, . . .). Thus m = ∞ n=1 m n and N = ∞ n=1 nm n . The network of possible global (i.e. between-household) contacts is constructed using the configuration model (with random, rather than specified, degree sequence). Thus each individual is assigned a number of 'half-edges' independently, according to an arbitrary but specified discrete random variable D having mass function P(D = k) = p k (k = 0, 1, . . .). Then all such half-edges are paired up uniformly at random to form the edges in the graph describing the global network. If the total number of half-edges is odd, we ignore the single leftover half-edge.
Our analysis is asymptotic as the number of households m → ∞. We require that, as m → ∞, m n /m → ρ n (n = 1, 2, . . .), where (ρ 1 , ρ 2 , . . .) is a proper probability distribution (i.e. ∞ n=1 ρ n = 1) having finite mean μ H = ∞ n=1 nρ n . Thus μ H is the mean household size in the limiting population. We also require that μ D = E [D] is finite. These assumptions are sufficient for our analysis. If we make the stronger assumptions that σ 2 D = var(D) and ∞ n=1 n 2 ρ n are both finite, then parallel edges and self-loops, between either individuals or households, become sparse in the global network as n → ∞ (Ball et al. 2010, Sect. 1.2).
The epidemic is initiated by a single individual, chosen uniformly at random from the population, becoming infected, with the other individuals in the population all assumed to be susceptible. The infectious periods of different infectives are each distributed according to a random variable I , having an arbitrary but specified distribution, which is most conveniently specified in terms of its moment generating function φ I (θ ) = E[e −θ I ] (θ ≥ 0). Throughout its infectious period, a given infective makes infectious contact with any given member of its household at the points of a Poisson process having rate λ L and with any given global neighbour at the points of a Poisson process with rate λ G . (Note that λ L and λ G are both individual to individual contact rates.) If an individual so contacted is susceptible then it becomes infected, otherwise the contact has no effect. Contacted susceptibles are immediately able to infect other individuals. An infective individual becomes removed at the end of its infectious period and plays no further role in the epidemic. All infectious periods, global degrees and Poisson processes are assumed to be mutually independent. The epidemic ceases as soon as there is no infective in the population.
For ease of exposition we assume throughout that there is no latent period and that the epidemic is started by a single infective chosen uniformly at random from the population. As explained by Ball et al. (2010), these assumptions may be relaxed without compromising mathematical tractability. In particular, our results are related to the final outcome of the epidemic, the distribution of which is invariant to very general assumptions concerning a latent period (see e.g. Pellis et al. 2008).

Early stages of epidemic
Recall that the global network is formed by pairing up the half-edges uniformly at random. It follows that in the early stages of an epidemic the probability that a global contact is with an individual residing in a previously infected household is small, indeed it is zero in the limit as m → ∞. Thus, in the early stages of an epidemic, the process of infected households can be approximated by a branching process. The individuals in this branching process correspond to infected households in the epidemic process, and the offspring of a given individual in the branching process are all households that are contacted globally by members of the local (within-household) epidemic in the parent household.
The offspring distribution for this branching process is usually different in the initial generation than in all subsequent generations. The number of global neighbours that the initial infective in the population may infect is distributed according to D, as that individual is chosen uniformly at random from the population. The number of global neighbours that the initial infective in any subsequent infected household (in the branching process approximation) may infect is distributed according toD − 1, whereD is the degree of a typical neighbour of a typical individual in the network. The −1 arises because such an infective has been infected through the global network, so one of its neighbours (i.e. its infector) is not available for further infection. Note that a given half-edge is k times as likely to be paired with a half-edge emanating from an individual with degree k than with one emanating from an individual with degree 1, so P(D = k) = μ −1 D kp k (k = 1, 2, . . .). The distributions of D andD − 1 are equal if and only if D has a Poisson distribution. For any non-negative integer valued random variable X , we denote its probability generating function (PGF) by f X , so f X (s) = E[s X ] (0 ≤ s ≤ 1). We note for future reference that fD −1 (s) = f D (s)/μ D .
The approximation of the early stages of the epidemic process by the above branching process is made mathematically fully rigorous by Ball et al. (2009) and Ball and Sirl (2012). The latter shows that a sequence of epidemic processes, indexed by m, and the approximating branching process can be constructed on the same probability space so that, as m → ∞, the number of households ultimately infected in the epidemic process converges almost surely to the total progeny of the branching process. Thus, provided m is large, whether or not the epidemic can become established and lead to a major outbreak is determined by whether or not the branching process is supercritical.
Let C andC be random variables describing the number of offspring of the initial and a typical subsequent individual, respectively, in the branching process. Then standard branching process theory (e.g. Haccou et al. 2005, Theorem 5.2), gives that the extinction probability of the branching process is strictly less than one if and only if R * = E[C] > 1. Thus R * serves as a threshold parameter for the epidemic model. We now outline the calculation of R * . Further details are given by Ball et al. (2010).
First note that, since the degree and household size of an individual are independent, the probability that a typical globally infected individual resides in a household of size n is given byρ n = μ −1 H nρ n (n = 1, 2, . . .). (An individual chosen uniformly at random from the population is n times as likely to reside in a given household of size n than in a given household of size 1.) Thus, whereC (n) is the number of global infections emanating from a typical size-n singlehousehold epidemic initiated by a single infective who is infected through the global network. Consider such a size-n single-household epidemic. Label the household members 0, 1, . . . , n − 1, where 0 is the initial infective, and writẽ where χ i = 1 if individual i is infected by the single-household epidemic, otherwise χ i = 0, and C i is the number of global infections made by individual i (assuming it is infected). (Throughout the paper we adopt the usual convention that empty sums are equal to 0, so, for example, χ i be the final size of the single-household epidemic, not including the initial infective, and μ (n) Whether or not a given individual, i say, is infected by the single-household epidemic is independent of its infectious period, so χ i and C i are independent. Thus, taking expectations of (2.2) and exploiting symmetries yields (2.3) Since the probability that a Poisson process of rate λ G has no points in a time of length I is e −λ G I we have that the probability an infected individual infects a given global neighbour is We also have that the number of uninfected global neighbours of individual 0 is distributed according toD − 1.
since the number of uninfected global neighbours of individual 1 is distributed according to D. Substituting these results into (2.3) and then into (2.1) yields (2.4) Let p maj be the probability that a major outbreak occurs. Then standard branching process theory [e.g. Haccou et al. (2005, Theorem 5.2)], shows that p maj = 1− f C (σ ), where σ is the smallest solution of fC (s) = s in [0, 1]. Note that, analagous to (2.1), f C (s) = ∞ i=1ρ n f C (n) (s) and fC (s) = ∞ n=1ρ n fC (n) (s). The PGFs f C (n) and fC (n) can be calculated using the methods described in Appendices B.1 and B.2. As noted by Ball et al. (2010, Sect. 3.2), these calculations are much simpler in the case where the infectious period is constant, since in this case an infected individual infects its global neighbours independently.

Final outcome of major outbreak
We now consider the fraction of the population that are ultimately infected by a major outbreak. The key tool we use is the susceptibility set (Ball 2000;Ball and Lyne 2001;Ball and Neal 2002), which we now describe. Let N = {1, 2, . . . , N } denote the entire population of N individuals. For each i ∈ N , by sampling from the infectious period distribution and then the relevant Poisson processes for local and global contacts, draw up a (random) list of individuals i would make infectious contact with if it was to become infected. Then construct a random directed graph on N , in which for any pair, (i, j) say, of individuals there is an arc from i to j if and only if j is in i's list. The susceptibility set of a given individual, i say, consists of those individuals from which there is a chain of arcs to i in the graph (including i itself). Note that any given individual is ultimately infected by the epidemic if and only if the initial infective belongs to its susceptibility set.
Similarly to the early stages of an epidemic, we can approximate the susceptibility set of an individual, i say, chosen uniformly at random from the population, by a households-based branching process. We first consider i's local susceptibility set, i.e. the susceptibility set obtained when only local (within-household) contacts are considered. Suppose that this local susceptibility set has size M (n) +1, where n denotes the size of i's household; so M (n) is the size of the local susceptibility set excluding i. (The probability mass function of M (n) is given by Eq. (B.23) in Appendix B.5.) Let B be the number of individuals, who are not in i's household, that in the random directed graph have an edge leading directly to one of the M (n) + 1 individuals in i's local susceptibility set. Each individual in i's local susceptibility set has global degree distributed independently according to D, and, as m → ∞, each global neighbour of i's local susceptibility set enters i's susceptibility set independently with probability p G . Thus, taking expectations with respect to i's household size, (2.6) The above B individuals form the first generation of our approximating branching process. We next consider each of these B individuals in turn, construct the local susceptibility sets in their respective households (which are distinct with probability tending to one as m → ∞) and then examine the global neighbours of these local susceptibility sets to obtain the second generation of the approximating branching process, and so on. Note that the initial individual in each of these B local susceptibility sets has degree distributed according toD and one of its global neighbours is already in i's susceptibility set. Thus, as above, the offspring distribution for the branching process is different for the initial individual than for all subsequent individuals. If we letB denote the offspring random variable of a typical non-initial individual, then arguing as in the derivation of (2.5), (2.7) The probability that the approximating branching process survives (i.e. does not go extinct) is given by , so z > 0 if and only if R * > 1. Moreover, if R * > 1 then z is the expected proportion of the population that is ultimately infected by a major outbreak in the limit as m → ∞; see Ball et al. (2009) for a formal proof when all the households have the same size. Furthermore, the proof of Ball et al. (2014, Theorem 3.4) can be adapted to show that, as m → ∞, the proportion of the population that is ultimately infected by a major outbreak converges in probability to z. Thus we refer to z as the relative final size of a major outbreak.

Vaccination
In modelling vaccination there are two distinct aspects that must be modelled: who gets vaccinated and what happens to those who are vaccinated. Vaccine allocation models (addressing the former issue) are our focus in this paper. We now outline the vaccine action models (addressing the latter aspect) that we allow for in our analysis.
We use a model for vaccine action, proposed by Becker and Starczak (1998), in which the vaccine response of an individual who is vaccinated is described by a random vector (A, B) which takes values in [0, ∞) 2 . Here A denotes the relative susceptibility (compared to an unvaccinated individual) and B the relative infectivity if the vaccinated individual becomes infected. Thus all Poisson processes concerning potential infection of the individual have their rates multiplied by A and the Poisson processes governing the contacts the individual makes, if infected, have their rates multiplied by B. The vaccine responses of distinct vaccinees are assumed to be mutually independent. Within this framework we consider two special cases, the all-or-nothing and the non-random vaccine responses, where (A, B) is supported on 2 and 1 points respectively. Our methods extend to vaccine action models where (A, B) is supported on finitely many points; the extension is straightforward analytically but quickly becomes numerically cumbersome if there are more than a few possible outcomes of (A, B).
The all-or-nothing model (see e.g. Halloran et al. 1992) is obtained by setting P((A, B) = (0, 0)) = 1 − P((A, B) = (1, 1)) = ε, so vaccinated individuals are rendered completely immune with probability ε, otherwise the vaccine has no effect. The non-random model assumes that P((A, B) = (a, b)) = 1, for some (a, b), so all vaccinated individuals respond identically (see e.g. Ball and Lyne 2006). An important special case is the leaky model (see e.g. Halloran et al. 1992), when b = 1, so vaccination does not affect an individual's ability to transmit the disease if they become infected. Setting ε = 1 in the all-or-nothing model or a = b = 0 in the nonrandom model yields a perfect vaccine response; that is one in which all vaccinated individuals are rendered completely immune.

Introduction
In this section we consider vaccine allocation strategies based on household size and determine their impact on R * , p maj and z under the all-or-nothing and non-random vaccine action models. For n = 1, 2, . . . and v = 0, 1, . . . , n, let x nv denote the proportion of households of size n that have v members vaccinated. Let p V denote the proportion of the population that is vaccinated, i.e. the vaccination coverage. Then p V is also the probability that an individual chosen uniformly at random from the population is vaccinated. Conditioning on the size of such an individual's household yields We derive results for an arbitrary but specified vaccine allocation however in the numerical studies we consider four allocation schemes: uniformly chosen households, uniformly chosen individuals, 'best' and 'worst'. In the uniformly chosen households scheme, households are chosen uniformly at random and all of their members are vaccinated. Thus, if the vaccination coverage is p V , In the uniformly chosen individuals scheme, individuals are chosen uniformly at random and vaccinated, so 2, . . .; v = 0, 1, . . . , n). The best and worst schemes are the allocations that make R * respectively as small as possible and as large as possible, for a given vaccination coverage.

All-or-nothing vaccine action
To analyse the consequences of a vaccination scheme using an all-or-nothing vaccine action model it is convenient to use the concept of a potential infectious global contact. Consider a given infected individual, i say, and a given global neighbour j of i. Then j is a potential infectious global contact of i if it is in i's list of individuals it makes infectious contact with (see the start of Sect. 2.2.2). The potential infectious global contact becomes an actual infectious global contact if j is unvaccinated or is vaccinated but the vaccination fails.
The early stages of an epidemic with vaccination are approximated by a branching process of (potentially) infected households, and the offspring of a given individual in the branching process are all households with which members of the single-household epidemic in the parent household make a potential global infectious contact. As in the case without vaccination, the offspring distribution of this branching process is usually different in the initial generation from all subsequent generations. LetC denote the offspring random variable for a non-initial individual. Conditioning first on the size of the corresponding household and then on the number of people vaccinated in that household yields, in an obvious notation, (3.2) To determine fC n,v (s), consider a household in state (n, v), i.e. of size n having v members vaccinated. For k = 0, 1, . . . , v, the probability that k vaccinations are successful is v k ε k (1 − ε) v−k , and given that k vaccinations are successful, the probability that the initial potentially contacted individual in that household is susceptible and thus triggers a local epidemic is n−k n . Moreover, if such a local epidemic is triggered, the number of potential infectious global contacts emanating from the local epidemic is distributed asC (n−k) , whereC (n) is as in Eq. (2.1). Thus, The distribution of the offspring random variable, C say, for the initial generation depends on how the initial infective is chosen. For n = 1, 2, . . . and v = 0, 1, . . . , n, let p V n,v be the probability that a vaccinated individual chosen uniformly at random resides in a household in state (n, v) and let p U n,v be the corresponding probability for an unvaccinated individual. Then Thus, if the epidemic is started by an individual chosen uniformly at random from all unvaccinated individuals being infected, then C is distributed as C U , where Alternatively, if the epidemic is started by choosing a vaccinated individual uniformly at random, who triggers an outbreak only if its vaccination fails, then C is distributed as The probability of a major outbreak may now be calculated as at the end of Sect. 2.2.1. Again, the formulae simplify appreciably if the infectious period is constant.
A post-vaccination threshold parameter is given by . Differentiating (3.2) and (3.3), or a direct calculation, yields As with the case of no vaccination, we can determine the relative final size of a major outbreak by considering a households-based branching process that approximates the susceptibility set of a typical individual. As with the forward process, it is convenient to consider potential global neighbours when constructing this branching process. Thus we start with an individual, i * say, chosen uniformly at random from the population, construct its local susceptibility set, taking the vaccine status of individuals in the household into account, then determine which global neighbours of individuals in this local susceptibility set would enter the susceptibility set of i * if they were susceptible (i.e. unvaccinated or unsuccessfully vaccinated). These individuals correspond to the first generation of the approximating backward branching process. Suppose that there are B such individuals. Next we take each of these B individuals in turn, first determine whether they really do enter the susceptibility set of i * (this happens with probability 1 if the individual is unvaccinated and with probability 1 − ε if it is vaccinated, independently for distinct individuals) and if they do enter the susceptibility set of i * , determine the number of potential global neighbours of its local susceptibility set to obtain its offspring in the branching process, and so on.
LetB be the offspring random variable for any non-initial individual in this backward branching process. Conditioning first on the state (n, v) of that individual's household and then on whether it joins the susceptibility set of i * , we obtain and fB (n) is as in Eq. (2.7). The distribution of B depends on how the initial individual i * is chosen. If i * is chosen uniformly at random from all unvaccinated individuals, then B is distributed as B U , say, and conditioning on the state (n, v) of i * 's household yields Then the proportion of unvaccinated individuals that are ultimately infected by a major outbreak is

Non-random vaccine action
Analysing the consequences of a vaccination scheme using a non-random vaccine action model is more difficult than with an all-or-nothing vaccine action model since disease spread is now genuinely two type. Thus we now consider two types of individual: type-U , unvaccinated, and type-V , vaccinated. The early stages of the epidemic can again be approximated by a branching process of infected households. This is now a two-type branching process, with the type of an infected household being given by the type of the initial case in that household. The offspring of a given individual in the branching process correspond to the households that are contacted globally by members of that individual's corresponding single-household epidemic in the epidemic process.
Let C U = (C UU , C U V ) denote the offspring random variable for the initial individual in the above branching process, given that individual is of type U , and let be the corresponding offspring random variable when the initial individual has type V . Thus, for example, C UU and C U V are respectively the number of unvaccinated and vaccinated individuals globally infected by the initially infected household, given that the first infective in that household is unvaccinated. Definẽ ] and define f C V (s), fCU (s) and fCV (s) similarly. (Here and henceforth, for any discrete random vector X we denote its joint PGF by f X .) Recall that when the network of global contacts is formed half-edges are paired uniformly at random. It follows that, whereC A n,v is a random vector giving the numbers of unvaccinated and vaccinated global infections that emanate from a non-initial infected household of size n, having v members vaccinated, whose primary case is of type A. The distributions of C U and C V depend on how the initial infective is chosen. If it is chosen uniformly at random from all individuals of the appropriate type in the population, then, for A ∈ {U, V }, The post-vaccination threshold parameter R v is given by the dominant eigenvalue (a real, positive eigenvalue of maximum modulus) ofM. It is well known (e.g. Haccou et al. 2005, p. 123) that, providedm U Vm V U = 0, the two-type branching process has non-zero probability of surviving if and only if R v > 1. Moreover, ifm U Vm V U = 0 and R v > 1, then the survival probability (and hence the probability of a major outbreak) can be determined as follows. Let Then, if the epidemic is started by an unvaccinated individual chosen uniformly at random from the population becoming infected, the probability of a major outbreak is p U maj = 1 − f C U (σ ). The corresponding probability when the initial infective is a vaccinated individual is , which is rather involved unless the infectious period is constant, requires the methods described in Appendices B.1 and B.3. Calculation ofM, which is sufficient for determining optimal vaccination strategies, is simpler and we now outline it.
Consider first a local (single-household) epidemic in a household in state (n, v), initiated by one of the household members becoming infected. Let T denote the number of unvaccinated and vaccinated individuals ultimately infected by this local epidemic, not including the initial case.
is the probability that an unvaccinated infective infects a given vaccinated global neighbour. Then recalling that φ I is the moment generating function of the generic infectious period random variable I . Conditioning on the state (n, v) of the infected household shows that, for A A is defined analogously tom A A but for a household in state (n, v). Further, arguing as in the derivations of (2.3) and (2.4), yields then (3.8) and (3.9) yieldM (3.10) Turning to the relative final size of a major outbreak, we consider a householdsbased branching process that approximates the susceptibility set of a given individual. This is now a two-type branching process, with type (U or V ) corresponding to the type of the primary member of the corresponding local susceptibility set. Define the offspring random variables for this branching process in the obvious fashion (cf. the forward process offspring random variables C U , C V ,C U andC V and the notation used in Sect. 3.2). We determine first the PGFs fBU (s) and fBV (s). First note that, for A ∈ {U, V }, conditioning on the state (n, v) of a household yields, in obvious notation, .
AU and M (n,v) AV are respectively the numbers of unvaccinated and vaccinated individuals, not including i * itself, in the local susceptibility set of a typical type-A individual, i * say, that resides in a household in state (n, v).

the contributions toB
A n,v from the primary individual i * , the ith secondary unvaccinated member and the jth secondary vaccinated member of i * 's local susceptibility set, respectively. Let D i * denote the number of neighbours i * has in the global network, so D i * ∼D, and recall that one of these global neighbours is used when i * joins the susceptibility set process. Thus, where χ A k = (1, 0) if the kth global neighbour of i * is unvaccinated and joins the susceptibility set process, χ A k = (0, 1) if this neighbour is vaccinated and joins the susceptibility set process, and χ A k = (0, 0) otherwise. Note that, independently, each such global neighbour is vaccinated with probability p V , and it joins the susceptibility set process with probability p NR A similar argument shows that fBAU Exploiting the mutual independence of all the random quantities in (3.11) except the components of M Calculation of the mass function of M (n,v) A is described in Appendix B.5. The distributions of B U and B V depend on how the initial individual for the susceptibility set process is chosen. For A ∈ {U, V }, if this initial individual is chosen uniformly at random from all type-A individuals in the population, then Then the proportions of unvaccinated and vaccinated individuals that are infected by a major epidemic are given by respectively. Thus the overall proportion of individuals infected by a major epidemic

Optimal vaccination strategies
A main aim of a vaccination scheme is to reduce the threshold parameter R * to below one, i.e. to make R v ≤ 1, and thus prevent a major outbreak occurring. The vaccine response may be such that R v > 1 even if the entire population is vaccinated, in which case vaccination by itself is insufficient to be sure of preventing a major outbreak. However, if R * > 1 and it is possible to make R v ≤ 1 then it is of interest to determine the allocation of vaccines that reduces R v to 1 with the minimum vaccination coverage p V . Suppose that the population has a maximum household size n max < ∞. Then p V is a linear function of x nv (n = 1, 2, . . ., n max ; v = 0, 1, . . . , n) (recall (3.1)), as is R v when the vaccine action is all-or-nothing (recall (3.5)). Thus in this case determining the allocation of vaccines that (i) minimises p V subject to R v ≤ 1 or (ii) minimises R v subject to an upper bound on p V are both linear programming problems. Moreover, as we outline below, the method of Lyne (2002, 2006) can be used to construct explicitly the solutions of these linear programming problems. The situation is in general more complicated if the vaccine action is non-random, since then R v is the dominant eigenvalue of the matrixM, and the corresponding optimisation problems are non-linear. However, the problem is linear if rank(M) = 1, a sufficient condition for which is rank(P NR . Note that rank(P NR G ) = 1 if either a = 1 or b = 1, so a leaky vaccine response satisfies this constraint.
Consider the non-random vaccine response and suppose that rank(M) = 1. Then R v = trace(M) and, recalling (3.10), it follows using (3.4), (3.8) and (3.9) that Observe that, when rank(M) = 1, R v takes the same form as for the all-or-nothing vaccine response; compare (3.12) and (3.5).
To characterise the optimal vaccination schemes in these cases, it is convenient to consider a finite population of m households, with maximum household size n max . Let m n = mρ n be the number of households of size n and let h nv = m n x nv be the number of households in state (n, v). Thenρ n = nm n /N , where N is the total population size, and, writing μ n,v for μ AoN n,v or μ NR n,v , as appropriate, (3.5) or (3.12) implies that where M n,v = nμ n,v /N , and (3.1) yields Observe that (3.13) implies that R v is obtained by summing M n,v over all households in the population. For n = 1, 2, . . . , n max and v = 0, 1, . . . , n − 1, let G n,v = M n,v − M n,v+1 be the reduction in R v obtained by vaccinating one further individual in a household in state (n, v). If G n,v is decreasing in v for each fixed n (so successive vaccinations in the same household have diminishing returns), then it is straightforward to determine optimal vaccination schemes Lyne 2002, 2006). One simply orders the states (n, v) according to decreasing G n,v , and then uses this ordering to determine the order in which individuals in the population are vaccinated, stopping the process when either the vaccination coverage reaches the desired level or when R v ≤ 1, depending on the optimisation problem under consideration. (The 'worst' scheme is obtained by vaccinating whole households in increasing order of M n,0 − M n,n .) If for some n, say n = n , G n ,v is not decreasing in v, then only those states, (n , v ) say, on the lower edge of the convex hull of . . , n − 1} can be part of an optimal vaccination scheme. It is still possible to give explicit solutions of associated optimisation problems (cf. Ball et al. 2004), and of 'worst' schemes, but the details are more involved.

Introduction
The acquaintance vaccine allocation model proposed by Cohen et al. (2003) and further analysed by Britton et al. (2007), both in the setting of a standard network model (i.e. a population modelled by a configuration model, without household structure), is as follows. Each individual in the population is sampled independently a Poisson distributed number of times, with mean κ > 0, and each time an individual is sampled it chooses one of its neighbours uniformly at random, with replacement, and that neighbour is vaccinated. If a sampled individual has no neighbours then that sampling is ignored. Individuals are vaccinated at most once, even if they are chosen to be vaccinated more than once.
If the vaccine is perfect then the early stages and final outcome of an epidemic with this acquaintance vaccination model can be analysed relatively easily using branching process approximations. Essentially this is because the epidemic involves only unvaccinated individuals, and the degrees of the neighbours of an unvaccinated individual are mutually independent. However, if the vaccine is imperfect, then the epidemic may also involve vaccinated individuals and the degrees of the neighbours of a vaccinated individual are dependent. (A low-degree neighbour of a given individual, A say, is more likely to nominate A for vaccination than a high-degree neighbour but, if A is vaccinated, at least one of A's neighbours nominated A for vaccination.) It follows that the independence property required for a branching process approximation breaks down. This dependence can be overcome if individuals are also typed by their degree but, unless the support of D is small, the calculations become computationally prohibitively expensive. Indeed, infinite-type branching processes are required if the support of D is countably infinite. For these reasons, Ball and Sirl (2013) introduced an alternative acquaintance vaccine allocation model and analysed it, in the setting of a standard network model (without households). We now extend this analysis to the present network and households model.

Acquaintance vaccination model
We assume that each individual is sampled independently with probability p S and then each sampled individual nominates each of its global neighbours independently with probability p N . All individuals that are nominated at least once are then vaccinated. Thus individuals are sampled only once and it is easily seen that the degrees of the neighbours of both vaccinated and unvaccinated individuals are mutually independent, thus facilitating branching process approximations which do not involve typing by degree. Note that the probability that an individual is not named by a given neighbour is 1 − p S p N , so the probability it is vaccinated is which, of course, also gives the vaccination coverage. We now describe the methods used to find threshold parameter, the major outbreak probability and the relative final size of a major outbreak. All details of the calculations are deferred to Appendix A.

Early stages of the epidemic
We approximate the early stages of the epidemic, with vaccination, by a multitype (forward) branching process of infected households, in which households are typed by the type of their primary (globally contacted) case. Such primary cases are typed by (i) whether they are named (N ), vaccinated (V ) or unvaccinated (U ) and (ii) whether or not they are sampled and thus might name their neighbours for vaccination (S and S c ). Here N means that a primary case was named by its global infector and therefore is vaccinated, V means that it is not named by its global infector but it is nevertheless vaccinated (i.e. it is named by another neighbour) and U means that it is unvaccinated (i.e. not named by any of its neighbours). Thus there are 6 types of infected households, which for notational convenience we give numerical indices as shown in Table 1. Secondary infected cases in a household, and also the primary case in the initially infected household, need only to be typed V or U , according to whether or not they are vaccinated. This forward branching process is in the same spirit as that described in Sect. 3.3 for household-based vaccination with a non-random vaccine action model, except to maintain the independence required for a branching process there are now 6 rather than 2 types (and an additional 2 types that appear only in the initial generation). The offspring random variables for this forward branching process are as follows. For i = 1, 2, . . . , 6, letC i = (C i1 ,C i2 , . . . ,C i6 ) denote the offspring random variable for a type-i non-initial individual in the forward branching process. ThusC i j is the number of typej primary household cases emanating from a typical single household epidemic that is initiated by a single type-i primary case. Similarly, for A ∈ {U, V }, let C A = (C A1 , C A2 , . . . , C A6 ) denote the offspring random variable for a type-A individual in the initial generation of the forward branching process. In contrast to similar analyses of previous models in this paper, when analysing the all-or-nothing vaccine response, we consider actual, rather than potential, global infections. This facilitates simultaneous analysis of the two vaccine action models to a much larger extent than is otherwise possible, but has the consequence that the formulae for p maj and z below are slightly different for the different models of vaccine action.
A threshold parameter is then given by R v , the dominant eigenvalue of the matrix M = [m i j ], wherem i j = E[C i j ]. We assume thatM is positively regular, i.e. that for some n > 0 the matrixM n has all of its entries strictly positive (e.g. Mode 1971, Sect. 1.6), which will be satisfied in all practical situations. If R v > 1 then a major outbreak occurs with strictly positive probability; otherwise the probability of a major outbreak is zero.
The probability of a major outbreak can be written in terms of the joint PGFs fC i and f C A of the random variablesC i (i = 1, 2, . . . , 6) and C A (A ∈ {U, V }). Let p U maj be the probability of a major outbreak if the initial infective is unvaccinated and p V maj be the corresponding probability when the initial infective is vaccinated. With an all-or-nothing vaccine this latter probability also assumes that the vaccination of the initial case is unsuccessful, since the branching process considers actual global infections. Conditioning on a single initial infective of type A ∈ {U, V }, we have 1, 2, . . . , 6) in [0, 1) 6 (cf. (3.6)). The probability of a major outbreak with an initial infective chosen uniformly from the whole population is therefore all-or-nothing vaccine.
(With an all-or-nothing vaccine, if the initial infective is vaccinated then a major outbreak can occur only if that vaccination is unsuccessful, which occurs with probability 1 − ε.)

Final outcome of a major outbreak
We approximate the susceptibility set of a given individual, i * say, by a householdsbased (backward) multitype branching process, where, as in Sect. 3.3, the type of a household is given by the type of the primary member of the corresponding local susceptibility set. As previously, the fraction of individuals that are ultimately infected by a major outbreak is given by the survival probability of this branching process. In the initial generation there are two types, V and U , depending on whether or not i * is vaccinated. In subsequent generations, there are 6 types, numbered 1-6 as above, where now, for example, N means that the primary case was named by the individual that it contacts globally to join i * 's susceptibility set. Similar to before, secondary members of a local susceptibility set need only to be typed V or U . Let B U = (B U 1 , B U 2 , . . . , B U 6 ) and B V = (B V 1 , B V 2 , . . . , B V 6 ) denote the offspring random variables for the initial generation of this branching process, and letB i = (B i1 ,B i2 , . . . ,B i6 ) (i = 1, 2, . . . , 6) be the offspring random variables for all subsequent generations. Suppose that R v > 1, so a major outbreak is possible. The asymptotic proportions of unvaccinated and vaccinated individuals that are infected by a major outbreak are given by 1, 2, . . . , 6). The overall proportion of the population infected by a major outbreak is therefore all-or-nothing vaccine.

Model exploration
In this section we explore the behaviour of the model and its dependence on some key parameters. The main focus is on numerical comparison of the various vaccine allocation regimes for a variety of household size and network degree distributions. First, however, we comment briefly on the relationship between outcomes in simulated finite populations and our asymptotic analytical results, and also give some discussion of the choice of the parameters p S and p N in acquaintance vaccination.

Outcomes in finite populations
One can compare the analytical, asymptotic, quantities of interest ( p maj and z) to simulation-based estimates of the corresponding quantities on finite populations. We find [consistent with Ball et al. (2009, Sect. 5)] that the agreement becomes reasonable with the number of households m in the low hundreds and very close indeed for m over about 1000, though the convergence is a little slower for heavier tailed degree distributions.

Acquaintance vaccination dependence on p S and p N
The vaccine coverage c where the function η depends on p S and p N only through their product p = p S p N . It is immediate that R v is increasing in p N for fixed p . Thus we have that the best effect is achieved by taking ( p S , p N ) = (1, p ) and the worst with ( p S , p N ) = ( p , 1). The corresponding values of Precisely as in the network-only model without households, we note that this difference is generally quite small and is at its largest when p is large, so there is high coverage and the epidemic is likely to be subcritical in any event.
When the vaccine action is not perfect we cannot make analytical progress in this direction, but extensive numerical calculations suggest that R v , p maj and z are usually monotonic in p N when p S p N is fixed and that in any case the difference in outcomes between the best and worst choices are very small. We do not explicitly demonstrate this here, but we note that (as might be expected) this is the same as Ball and Sirl (2013) found for the model without households. It can be seen to some extent in all of the plots in Sect. 5.3, where the two acquaintance vaccination plots are barely distinguishable in most cases.
In the numerical studies below we use the terms 'best' and 'worst' to refer to the acquaintance vaccination schemes with ( p S , p N ) = (1, p ) and ( p S , p N ) = ( p , 1) even though the names are not necessarily correct. As noted above, the precise choice of p S and p N seems to have only a very weak influence on the final outcome of the epidemic model and these two cases seem to give bounds for the quantities of interest.

Effect of different vaccine allocation regimes
We now explore the relative effectiveness of the different vaccine allocation regimes discussed in this paper, across a small but representative variety of household size and degree distributions. In the following figures we plot an outcome (R v or z) against the vaccine coverage c for several vaccine allocation strategies; with all other parameters held fixed. Using different plots within the same figure we vary some of these other parameters. The vaccine allocation methods we consider are (i) uniformly chosen individuals (Ind UAR), (ii) uniformly chosen households (HH UAR), (iii & iv) best and worst household-based allocation (HH Best and HH Worst), (v & vi) 'best' and 'worst' acquaintance vaccination (Acq Best and Acq Worst); see the end of Sect. 3.1 for descriptions of the household-based vaccination schemes.
In this numerical section we use household size distributions ρ UK = (31, 32, 16, 14, 5, 2)/100 and ρ Pak = (9, 45,77,118,141,146,123,104,237)/1000, which have respective means 2.4 and 6.8, as realistic household size distributions from the UK (Office for National Statistics 2011, Table 3.1) and Pakistan (National Institute of Population Studies 2013, Table 2.9), respectively. The degree distributions we use are the standard Poisson distribution and a power law distribution with exponential cutoff (see, for example, Amaral et al. 2000). For this latter distribution we use the notation D ∼ PowC(a, κ) to denote that p k ∝ k −a e −k/κ (k = 1, 2, . . . ), i.e. a power law with index a and exponential cutoff at about κ. In particular, we use the degree distribution D ∼ PowC(2, 120), which has mean μ D ≈ 3.001 and σ 2 D ≈ 66. We also use the notation Gam(k, r ) to denote a Gamma distribution with shape parameter k and scale parameter r (and thus mean kr and variance kr 2 ).
First we look at the relative effectiveness of the various allocation schemes, and how this changes with the household size distribution and the network degree distribution. Figure 1 shows plots of the post-vaccination threshold parameter R v as a function of the vaccine coverage c for the 6 vaccine allocation schemes with a perfect vaccine. The different plots use different combinations of household size and network degree distribution, with all other parameters kept fixed (full details are in the figure caption). Firstly we note that the patterns of the household-based allocation schemes are consistent with those of Ball and Lyne (2006, Sect. 4) for the standard households model. We see that when the network degree distribution is not very variable (Poisson) good household-based schemes perform similarly to acquaintance vaccination. On the other hand, when the network degree distribution is much more variable (cut-off power law) we see that acquaintance vaccination significantly outperforms the household-based schemes, though to a slightly lesser extent when the household size distribution is also more variable. Figures 2 and 3 are similar to Fig. 1, but we consider imperfect vaccine action models (with the same efficacy 1−E[AB] = 0.7) and we plot both the post-vaccination threshold parameter R v and expected final size of a large outbreak z. In Fig. 2 we use an all-or-nothing vaccine action model with success probability ε = 0.7 and consider two network degree distributions, with fixed (less variable) household size distribution. In Fig. 3 we use a non-random vaccine action model with relative susceptibility and infectivity a = 0.5 and b = 0.6, respectively, and vary the household size distibution, with a fixed (more variable) network degree distribution. Note that for this choice of a  Fig. 1 Plots of R v versus coverage (c) for various kinds of allocation of a perfect vaccine. We use two different household size distributions: left plots ρ = ρ UK , right plots ρ = ρ Pak ; and two different degree distributions: upper plots D ∼ Poi(5), lower plots D ∼ PowC(2, 120). Other parameters are I ∼ Gam(5, 1/5), λ L = 1, λ G = 0.3 and b, determining the 'best' and 'worst' household-based vaccine allocation schemes are not linear programming problems (see Sect. 3.4). In our numerical routine we use the MATLAB constrained optimisation solver fmincon.
We see in the upper plots (of R v ) in Figs. 2 and 3 broadly similar patterns to those in Fig. 1. The lower plots in Fig. 2 show the ordering of z for the different allocation regimes being the same as the ordering of R v . These two lower plots look qualitatively quite different, but it is important to note that in one the vaccine can and in the other the vaccine cannot bring the epidemic below threshold. These plots suggest that, as might be expected, when the network degree distribution is not so variable (e.g. the Poisson case) then vaccine allocation should be focussed on households-based methods, whilst when the network degree distribution is more variable (e.g. the cutoff power law) then targeting vaccination effort based on the network might give better results. Precisely which method is preferable will of course depend heavily on the other parameters of the model, but we have demonstrated that either allocation method, household-based or acquaintance-based, can be preferable to the other.  The lower plots of Fig. 3 follow similar patterns in that we are considering the case of a quite variable network degree distribution so acquaintance vaccination outperforms the household-based methods. There are however some unexpected patterns in the lower-right plot of Fig. 3, in that the ordering of the various household-based allocation regimes are not the same as in the corresponding plot of R v immediately above it. In particular, for lower coverages the 'worst' households based allocation outperforms, in terms of z, the 'best'! This demonstrates that when the epidemic is well above threshold, optimising vaccine allocation based on R * does not necessarily result in an expected final size that is as low as possible; cf. Keeling and Ross (2015), who observe a similar phenomenon in the standard households model with a perfect vaccine. The threshold parameter R v measures household-to-household transmission, but does not directly take into account the size of the within-household outbreaks. This could perhaps be resolved by optimising an individual-based threshold parameter instead (see Pellis et al. 2012;Ball et al. 2016), but the optimisation problem would be more difficult than the one we have considered. We also note that this phenomenon appears to arise only when the epidemic is well above criticality.  Lastly we note that the behaviour of p maj is broadly similar to that of z. We do not present any plots, but they have shapes and patterns similar to those in the z plots that are shown, including the unexpected ordering observed in the lower-right plot of Fig. 3.

Concluding remarks
In this paper we have analysed vaccine allocation strategies in stochastic SIR epidemic models upon populations with household and random network (of configuration model type) structure. By exploiting branching process approximations we derive asymptotic results describing the threshold behaviour of this epidemic model when there are few initial infectives and the final outcome in the event of a major outbreak. Particular attention has been paid to the analysis of acquaintance vaccination, which aims to target vaccination at individuals who are highly connected in the network. We find that acquaintance vaccination potentially offers substantial benefits over other (households-based) vaccine allocation regimes.
Whilst we have shown that acquaintance vaccination is potentially useful, it is clearly not feasible to implement in practice in human populations, so investigation of a more ethically acceptable allocation regime that preserves the targeting of wellconnected individuals is a clear direction for future work. It is also likely that the effectiveness of acquaintance vaccination will depend on the amount of clustering in the network in an interesting way; we have touched on this through the use of different household size distributions, but clearly there is scope for considerably more work in this direction. Further issues that warrant more investigation include (i) comparison to the optimal network-based vaccine allocation, assuming knowledge of the degree of every individual in the network (Ball and Sirl 2012, Sect. 5 and Appendix B), (ii) the determination of optimal or near-optimal vaccination strategies based on household and network information and (iii) extending acquaintance vaccination to include the possiblity of naming individuals who are in the same household.
A particularly striking feature of the numerical study is the fact that, when vaccine coverage is insufficient to prevent a major outbreak, the ordering of the performance of the household-based allocation strategies can be different depending on whether R v (the post-vaccination R * ) or z is used as the measure of performance. We emphasise that this does not imply that minimising R v is not worthwhile. When using householdbased vaccine allocation with a limited quantity of vaccine (i.e. a maximum vaccination coverage, c say), a natural approach is first to minimise R v among all strategies having coverage c. If the minimised value of R v is no greater than 1 then major outbreaks can be prevented and R v can be used to determine the vaccine allocation that prevents a major outbreak with minimum vaccination coverage. If the minimised value of R v is not less than 1 then households-based vaccination cannot prevent major outbreaks with the available quantity of vaccine. It then makes sense to minimise the expected cost of an outbreak; measured for example by p maj , z or some combination of these such as their product, which gives the probability that a typical individual is infected by the epidemic. The numerical examples in Sect. 5.3 suggest that in these circumstances minimising R v (which is a much simpler optimisation problem than, for example, minimising z) still produces a near-optimal vaccine allocation unless, post-vaccination, the epidemic is still well above threshold. However, this topic is not straightforward and clearly warrants further research. It will be investigated in more detail in a subsequent paper.
We also note in closing that there are some numerical challenges involved in implementing the methods we have presented. These are particularly relevant for the calculations relating to the forward process (i.e. calculation of p maj ), but do also apply to the backward process (calculation of z). The main issue that arises is that of slowly converging infinite sums and the resulting possibilities for numerical overflow and underflow. Writing doubly or triply-infinite sums as one infinite and one or two finite sums (for example ∞ i=0 ∞ j=0 a i, j = ∞ k=0 k l=0 a k−l,l ) helps in some regards (e.g. faster computing since there is only one truncation to have to control the error of) but hinders in others (e.g. slower computing since methods to avoid underflow and overflow errors become more involved).
Simons Foundation. The authors thank the Isaac Newton Institute for Mathematical Sciences, Cambridge, for support and hospitality during the programme Theoretical Foundations for Statistical Network Analysis, where work on this paper was undertaken. The authors also thank the Associate Editor and reviewers for their constructive comments which have significantly improved the presentation of the paper.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix A: Acquaintance vaccination branching process offspring distributions
In this appendix we consider the detail of calculating the offspring PGFs and means of the approximating branching processes for the acquaintance vaccination model described in Sect. 4.
In our analysis of the approximating branching processes (both forward and backward) it is convenient to think of the process of an individual in a household producing offspring (i.e. infecting network neighbours and thus contributing to the offspring of its household) in two phases. In the forward process the first phase involves consideration of the individual's network neighbours, excluding its infector if it is a primary case. We consider how many such neighbours there are and what type they (and therefore their household) would be if they were to be infected. We call these individuals next-generation neighbours. In the second phase we consider which of these nextgeneration neighbours are actually infected and thus become primary cases in the offspring households. The above description of the first phase applies mutatis mutandis to the backward process; here the second phase is different in that the possible infectious contacts originate from outside rather than within the household under consideration. Since the first phase is the same in both the forward and backward branching processes we first consider it separately.
In Sect. A.1 we study the next-generation neighbour distributions. In Sect. A.2 we determine the mean offspring matrix for the forward process, and hence the postvaccination threshold parameter R v ; in Sect. A.3 we consider the offspring distribution PGFs for the forward process, and hence the probability of a major outbreak; and in Sect. A.4 we determine the offspring PGFs for the backward process, and hence the relative final size of a major outbreak. The offspring distribution PGFs for the forward process are generally more complicated and some further detail of their calculation is described in Appendix B.4.

A.1 Degree and next-generation neighbour distributions
As outlined above, the first phase of individuals producing offspring (which this subsection describes) applies equally to the forward and backward processes. To simplify the exposition we use only the language of the forward process of epidemic spread. We also use 'household' and 'individual' in the population sense (as opposed to the branching process sense where an individual corresponds to a household in the epidemic model).
Recall the typing described in Sect. 4.2, in which infectious households are characterised by the type (1, 2, . . . , 6) of their primary case (see Table 1). Secondary cases in a household and the primary case in the initially infected household are typed (V or U ) according to whether they are vaccinated or unvaccinated. Recall also that nextgeneration neighbours of an infected individual are the individuals (in neighbouring households) that it may infect. In this subsection we first calculate the degree distribution of the various secondary and primary cases conditional on their type. Then we consider the next-generation neighbour distributions: the number of network neighbours, excluding its infector, of each secondary type that a typical type-i individual has (for i ∈ {1, 2, . . . , 6, U, V }).

A.1.1 Degree distributions
Let D V and D U respectively denote the degree of a typical vaccinated and unvaccinated secondary case. Then, recalling from Eq. (4.1) that (A.1) and, similarly, These follow because an unvaccinated individual has avoided being named by any of its neighbours and a vaccinated individual has not done so. Now consider an individual of one of the primary types 1, 2, . . . , 6. Note that whether or not it is sampled is independent of its degree. Thus primary cases of types 1 and 4 have the same degree distribution, as do primary cases of types 2 and 5, and primary cases of types 3 and 6. LetD N ,D V andD U denote generic random variables having these respective distributions. First note thatD N D =D, where D = denotes equality in distribution. Second, consider a typical unvaccinated primary case. It has unconditional degree distributionD but, in addition to not being named by its infector, we also know that it is not named by any of its other global neighbours. Thus is the probability that a typical unnamed neighbour of an infector is vaccinated. Similarly,

A.1.2 Next-generation neighbour distributions
For i = 1, 2, . . . , 6 we writeX i = (X i1 ,X i2 , . . . ,X i6 ) for the number of nextgeneration neighbours of a type-i primary case; i.e.X i j is the number of neighbours of a type-i individual who will become typej individuals if they are infected by the type-i primary case. For A = U, V we similarly define X A = (X A1 , X A2 , . . . , X A6 ) to be the corresponding quantities for a secondary case or the primary case in the initial household. To calculate the threshold parameter we need expressions for the meanŝ μ i j andμ A j of the elements of these random vectors and to calculate the offspring distribution PGFs we require the joint PGFs fX i and f X A .

Means
The mean E[X i j ] =μ i j can be written asμ i j =μ G ip i j , whereμ G i is the mean number of network neighbours a typical type-i primary case has in addition to its infector andp i j is the probability that a given such global neighbour is of type j (or, more accurately, the probability it will be of type j if it is infected). Similarly we write E[X A j ] =μ A j = μ G Ap A j for the secondary cases and primary case in the initial household.
Consider a type-1 (i.e. (N , S)) primary case. Its global degree is distributed asD N D = D, soμ G 1 = μD −1 . A given neighbour is named with probability p N , and, if that neighbour is unnamed, it is vaccinated with probabilityp V and otherwise unvaccinated. Further, independently, that neighbour is sampled with probability p S . So,p 11 = p N p S , The situation is similar for a type-4 (i.e. (N , S c )) individual, except a type-4 individual is not sampled and hence cannot name its global neighbours. Henceμ G 4 = μD −1 ,p 41 =p 44 = 0, Next consider a type-2 (i.e. (V, S)) primary case, i * say. Its global degree is distributed asD V , soμ G 2 = μD V −1 and, using (A.4), LetÑ S be the number of next-generation neighbours of i * that are sampled. Theñ N S ≥ 1, since i * is vaccinated but not named by its global infector, and, for k = 1, 2, . . . , d − 1, whence, using (A.4), It follows that the probability that a given next-generation neighbour of i * is sampled is pṼ S = μÑ S /μ G 2 and that, for j = 1, 2, . . . , 6,p 2 j is given byp 1 j with p S replaced by pṼ S . Similar arguments for a type-5 (i.e. (U, S c )) primary case establish thatμ G 5 = μD V −1 and, for j = 1, 2, . . . , 6,p 5 j is given byp 4 j with p S replaced by pṼ S . Now consider a type-3 (i.e. (U, S)) primary case, j * say. Its global degree is distributed asD U , soμ G 3 = μD U −1 and, using (A.2), Since j * is not vaccinated, none of its next-generation neighbours name j * , so the probability that a given such neighbour, k * say, is sampled is given by It follows that, for j = 1, 2, . . . , 6,p 3 j is given byp 1 j with p S replaced by p U S . Similarly,μ G 6 = μD U −1 and, for j = 1, 2, . . . , 6,p 6 j is given byp 4 j with p S replaced by p U S . Now consider an individual, i * say, of type U (i.e. chosen uniformly from the population and unvaccinated). Its global degree is distributed as D U so, using (A.1), its expected number of next-generation neighbours is Further, i * is sampled with probability p S , and the type of a given neighbour of i * is distributed according top 3 j if i * is sampled and according top 6 j if i * is not sampled.
Finally, consider an individual, j * say, of type V (i.e. chosen uniformly from the population and vaccinated). Let N S and N S c denote the number of next-generation neighbours of j * that are sampled and not sampled, respectively. Then arguing as in the derivation of (A.6) shows that Further, j * is sampled with probability p S . If j * is sampled then a given neighbour is named, vaccinated or unvaccinated with respective probabilities p N , (1 − p N )p V and (1 − p N )(1 −p V ); whilst if j * is not sampled these probabilities are 0,p V and 1 −p V . We therefore havep V j = p Sp1 j + (1 − p S )p 4 j .

Joint PGFs
We determine next the PGFs fX i (i = 1, 2, . . . , 6). For i = 1, 2, . . . , 6, letD(i) denote the degree of a typical type-i primary individual, who thus hasD(i)−1 next-generation neighbours. We therefore haveD(i) − 1 D = 6 j=1X i j . For i = 1, 3, 4 and 6, the types of theseD(i)−1 next-generation neighbours are chosen independently, according to (p i j , j = 1, 2, . . . , 6), so Now consider a typical type-2 primary individual. The situation is more complex here (and for type-5 individuals) because we know that at least one of its neighbours named it for vaccination and thus must be sampled. LetÑ S andÑ S c be the number of itsD(2)−1 next-generation neighbours that are sampled and unsampled, respectively, soÑ S +Ñ S c =D(2) − 1. Each of theseD(2) − 1 neighbours is named independently with probability p N and any neighbour not so named is vaccinated with probabilitỹ p V . Thus A typical type-U secondary individual, j * U say, has degree distributed according to D U [recall (A.1)], so its number of next-generation neighbours is also distributed as D U . Further, j * U is sampled with probability p S , in which case, apart from its degree, j * U behaves similarly to a type-3 primary individual, otherwise j * U is not sampled and behaves similarly to a type-6 primary individual. Thus, A typical type-V secondary individual, j * V say, is also sampled with probability p S . A similar argument to the derivation of (A.9) and (A.10), using appropriate modifications as j * V has degree distributed according to D V , yields

A.2 Threshold parameter
As far as is possible, we treat the all-or-nothing and non-random models simultaneously. As explained in the second paragraph of Sect. 4.2.2, in contrast to Sect. 3.2, when analysing the all-or-nothing vaccine response we consider actual, rather than potential, global infections. The forward branching process for both models is then similar to that used for the non-random model in Sect. 3.3, except now it is a 6-type (rather than 2-type) branching process. As always, the offspring distributions are different in the initial generation from subsequent generations, but only those for the latter are required to determine the threshold parameter. Recall that, for i = 1, 2, . . . , 6, C i = (C i1 ,C i2 , . . . ,C i6 ) denotes the offspring random variable for a type-i noninitial individual in the forward branching process. ThusC i j is the number of typej primary household cases emanating from a typical single-household epidemic that is initiated by a single type-i primary case. Since the post-vaccination threshold parameter R v is the dominant eigenvalue ofM = [m i j ], our goal is to determine these matrix To calculatem i j it is convenient to decompose it intom i j =m P i j +m S i j , wherẽ m P i j is the mean number of global infections of typej individuals emanating from the primary type-i case andm S i j is the mean total number of global infections of typej individuals emanating from all secondary cases in the single-household epidemic.

A.2.1 Infections emamating from primary cases
Define the matrix P G of marginal global transmission probabilities, so P G = P NR G [recall (3.7)] if the vaccine action is non-random and P G = P AoN G if the vaccine action is all-or-nothing, where The marginal global transmission probabilities, for the types i, j = 1, 2, . . . , 6, are then given by p G i j = p G (α(i), α( j)), where α(k) = U if k = 3, 6 and α(k) = V if k = 1, 2, 4, 5. Since each typej next-generation neighbour of a type-i infective is infected with probability p G i j it is immediate thatm P i j =μ i j p G i j , whereμ i j is the mean number of next-generation neighbours calculated in Sect. A.1.2.

A.2.2 Infections emamating from secondary cases
We first note that, for j = 1, 2, . . . , 6,m S 1 j =m S 2 j =m S 4 j =m S 5 j (=m S V j say) and m S 3 j =m S 6 j (=m S U j say). Further, for A ∈ {U, V } and j = 1, 2, . . . , 6, A ) is the mean number of type-A secondary cases in a typical single-household epidemic initiated by a primary case of type A and, for A ∈ {U, V },m A j is the mean number of typej primary cases emanating from a typical type-A secondary case. Now observe thatm A j =μ A j p G (A, α( j)), where α( j) is as in Sect. A.2.1. Conditioning on the size of a typical globally contacted household, we obtain in an obvious notation that Note that in the limit as the number of households m → ∞, secondary individuals in a typical household have no common global neighbour with probability one, so they are vaccinated independently, each with probability p V . Thus, again in an obvious notation, If the vaccine action is non-random then, recalling the notation in Sect. 3.3, μ and 0 otherwise. If the vaccine action is all-or-nothing, we condition on the number of successfully vaccinated secondary individuals in the household to obtain, for A ∈ {U, V }, (Note that μ (n−l) (λ L ) = 0 when l = n − 1.)

A.2.3 Perfect vaccine
In the situation where the vaccine is perfect, we have only households of types 3 and 6 (i.e. (U, S) and (U, S c )); and since these two types are sampled independently with the same probability, the forward process reduces to single-type. The mean number of next-generation neighbours of an infected individual is μD U −1 + μ S (U, U )μ D U . Each infected individual is sampled with probability p U S , so fails to name each nextgeneration neighbour with probability p U S (1− p N )+1− p U S . Then each such unnamed individual avoids vaccination (by other neighbours in the network) with probability 1 −p V . Thus Using (A.3), (A.7) and (A.8) this can be written as where p = p S p N and η depends on p S and p N only through p .

A.3 Probability of a major outbreak
To determine the probability of a major outbreak we need to derive the PGFs fC i (s) and f C A (s), where s = (s 1 , s 2 , . . . , s 6 ) ∈ [0, 1] 6 , for i = 1, 2, . . . , 6 and A = U, V ; i.e. the offspring PGFs for the forward process.
Recall that secondary individuals in a household are vaccinated independently, each with probability p V . Thus, conditioning first on household size and then on the number of secondary members that are vaccinated yields, for i = 1, 2, . . . , 6, is the offspring random variable for a type-i non-initial individual, given that the corresponding household is of size n and has v s secondary members vaccinated. Similarly, and in an obvious notation, for A ∈ {U, V }, With an all-or-nothing vaccine action, we may condition also on the number of vaccinations of secondary members that are unsuccessful to obtain, for i = 1, 2, . . . , 6, is the offspring random variable for a type-i non-initial individual, given that the corresponding household contains n − 1 other susceptibles, of which v are unsuccessfully vaccinated. (Note that, unlike with household-based vaccination, the vaccine status of these susceptibles is important as it affects their degree distributions.) Similarly, and in an obvious notation, for A ∈ {U, V }, are the respective contributions toB i from the primary individual, the kth unvaccinated secondary individual and the lth vaccinated secondary individual in the local susceptibility set. All summands on the right hand side of (A.11) are mutually independent and also independent of M i , so, in an obvious notation, A (s) for the two models of vaccine action is addressed in Appendix B.5. We determine next the PGFs fBP i (i = 1, 2, . . . , 6). We do so by conditioning oñ X i , the numbers of next-generation neighbours of different types of a typical type-i primary case, i * say, and then considering how many of theX i individuals of the various types actually join the susceptibility set. Each of i * 'sD(i) − 1 = 6 j=1X i j next-generation neighbours in the construction of the backward process enters the susceptibility set independently, with probability depending on the types of the two individuals involved. The probability that a typej neighbour of a type-i individual joins the susceptibility set is (The different formulae arise from the fact that in the all-or-nothing case, in addition to requiring a contact from the neighbour to the type-i individual of interest, a vaccinated neighbour is able to join the susceptibility set only if its vaccination fails.) Hence, for i = 1, 2, . . . , 6, 2, . . . , 6). The PGFs fBS U and fBS V follow using exactly the same arguments. In an obvious notation, Finally, we determine the offspring PGFs for the initial generation, f B U and f B V . Observe that, for A ∈ {U, V }, in the initial generation, a primary individual of type A behaves according to the same probability law as a typical secondary type-A individual (in any generation), so (A.12) becomes and f B A (s) can be evaluated using results given above.

Appendix B: Properties of single-household epidemics
In this appendix we first outline the theory of final state random variables as studied by Ball and O'Neill (1999). We then apply this (in Sects. B.2,B.3,B.4) to calculating the PGFs of offspring distributions of the various forward branching process approximations in the paper. In Sects. B.5 and B.6 we give results pertaining to the size of local susceptibility sets and the expected size of local epidemics. We use the following notation. The symbols Z + and R + denote the nonnegative integers and reals, respectively. For vectors x = (x 1 , x 2 , . . . , x m ) and y = (y 1 , y 2 1, 2, . . . , m) and x < y if, in addition, x i < y i for at least one i. For n, k ∈ Z + , n [k] denotes the falling factorial n(n − 1) . . . (n − k + 1), with n [0] = 1. For n, k ∈ Z m + , we define n [k] We take 0 and 1 to be vectors with all entries 0 and 1 respectively, with the dimensions being clear from the context. Finally, we let f (r ) denote the r -th order derivative of the function f .

B.1 Multitype single-household epidemic and key result
Consider a single-household SIR epidemic model with m types of individuals, labelled 1, 2, . . . , m. Suppose that, for i = 1, 2, . . . , m, there are initially a i infectives and n i susceptibles of type i, and let a = (a 1 , a 2 , . . . , a m ) and n = (n 1 , n 2 , . . . , n m ). For i = 1, 2, . . . , m, the infectious periods of type-i infectives are each distributed according to a random variable I (i) . For i, j = 1, 2, . . . , m, the individual-to-individual infection rate from a given type-i infective to a given typej susceptible is λ i j . As in Sect. 2.1, such infections are governed by Poisson processes, and all Poisson processes and infectious periods are mutually independent.
To each type-i infective we attach a vector of p non-negative integer-valued random attributes distributed according to a vector random variable . In our applications A (i) will describe the numbers of global infections of different types of individuals made by an infective in the single-household epidemic. The realisations of the random variables (I (i) , A (i) ) are independent for distinct infectives and identically distributed for infectives of the same type. Note that I (i) and A (i) may be dependent. For i = 1, 2, . . . , m, let T i be the number of initial susceptibles of type i that are ultimately infected by the epidemic and let A (i) (T i ) be the sum of the attribute vectors over all a i + T i infectives of type i. Further, let We give below an expression for φ(x, s) in terms of multivariate Gontcharoff polynomials, first studied by Lefèvre and Picard (1990), which we now define. Let U = (u j ∈ R m : j ∈ Z m + ) be a collection of real numbers. The Gontcharoff polynomials associated with U, denoted (G k (x|U), k ∈ Z m + , x ∈ R m ), are defined recursively by Note that G 0 (x|U) ≡ 1 and that G k (x|U) is a polynomial of degree k 1 , k 2 , . . . , k m in x 1 , x 2 , . . . , x m , respectively, depending only on (u j : j < k). The following result is derived easily from Ball and O'Neill (1999, Theorem 5.1), so its proof is omitted.
We now describe how Theorem 1 can be used to determine offspring PGFs for the forward branching processes in the earlier parts of the paper. Recall that primary and secondary infectives in a household typically have distinct global degree distributions. Hence, in addition to being typed according to their vaccination status, individuals also need to be typed as primary or secondary. Thus we may write m = m P + m S , where types 1, 2, . . . , m P correspond to primary individuals and types m P +1, m P +2, . . . , m to secondary individuals. Write a = (a P , a S ), n = (n P , n S ), x = (x P , x S ), T = (T P , T S ) and ψ(s, j ) = (ψ P (s, j ), ψ S (s, j )) in the obvious fashion. Note that all susceptibles are secondary individuals, so n P = 0, and all initial infectives are primary individuals, so a S = 0. It follows that T P = 0 and the index j of the summation in (B.1) takes the form (0, j S ). Let Then the following corollary follows easily from Theorem 1. For i = 1, 2, . . . , m, let λ (i) S = (λ i,m P +1 , λ i,m P +2 , . . . , λ i,m ) , where denotes transpose.
Corollary 2 For x S ∈ R m S and s ∈ [0, 1] p , whereψ P (s, j S ) = ψ 1 (s, j S ),ψ 2 (s, j S ), . . . ,ψ m P (s, j S ) andψ S (s, j S ) = ψ m P +1 (s, j S ),ψ m P +2 (s, j S ), . . . ,ψ m (s, j S ) , with We are primarily concerned with determining the PGF of A(T ), which of course is obtained by setting x S = 1 in (B.2). Thus, all that remains is to determineψ i (s, j S ), which is application dependent. Note from (B.3) that it is sufficient to determine, for θ ∈ R + and s ∈ [0, 1] p , which we now do for the various models in the paper.

B.2 No vaccination
This model is studied by Ball et al. (2010), so we just state the result. Note that there are two types of individual, primary (P) and secondary (S), with their number of uninfected global neighbours distributed according to D P ∼D − 1 and D S ∼ D, respectively. Further p = 1, so s is a scalar, s say. For a degree distribution D and real numbers c 1 , c 2 , θ, define the function F(D, c 1 , c 2 , θ) by

B.3 Households based vaccination
Recall from Sect. 3 that with an all-or-nothing vaccine action all required PGFs and means can be expressed in terms of f C (n) , fC (n) and μ n (λ L ), so here we need consider only the non-random vaccine action model. This model has four types of individuals: primary-unvaccinated ( Consider firstψ SU (θ, s). Conditioning on the degree D SU and infectious period I of a typical type-SU infective, i * say, yieldŝ Let N V denote the number of i * 's uninfected global neighbours that are vaccinated. Then A (B.7) (As usual, Bin(n, p) denotes a binomial distribution with parameters n ∈ Z + and p ∈ [0, 1].) Letâ(s) Then, using (B.5) and (B.7), ψ SU (θ, s) = E D SU ,I e −θ I â(s) +b(s)e −λ G a I +ĉ(s)e −λ G I D SU For a degree distribution D and real numbers c 1 , c 2 , c 3 , θ, a, b, define the function G (D, c 1 , c 2 , c 3 , θ, a, b) by G (D, c 1 , c 2 , c 3 , θ, a,  and thatψ PU (θ, s) is given by (B.10), with D SU replaced by D PU , andψ PV (θ, s) is given by (B.11), with D SV replaced by D PV . If a closed-form expression for f D is available then G (D, c 1 , c 2 , c 3 , θ, a, b) may be computed by using a finite truncation of (B.9). If a closed-form expression for f D is unavailable then G (D, c 1 , c 2 , c 3 , θ, a, b) may be computed by using a finite truncation of a corresponding triple sum (cf. (B.8)).

B.4 Acquaintance vaccination
This model has eight types of individuals, six primary and two secondary, which, as in Sect. 4.2, we label 1, 2, . . . , 6 and U, V , respectively. For each type i, the random attribute of interest j is the number of global infections of a type j individual made by an infective type i individual. Thus we determine, in an obvious notation,ψ i (θ, s) (i = 1, 2, . . . , 6),ψ U (θ, s) andψ V (θ, s); for θ ∈ R + and s = (s 1 , s 2 , . . . , s 6 ) ∈ [0, 1] 6 . It is convenient to treat the all-ornothing and non-random vaccine action models separately.

B.4.1 All-or-nothing vaccine action
For i = 1, 2, . . . , 6, recall thatD(i) is the global degree of a typical type-i primary individual, i * say, and also thatX i = (X i1 ,X i2 , . . . ,X i6 ), whereX i j is the number of i * 's next-generation neighbours that have type j. Recall that types 3 and 6 have not been vaccinated and types 1, 2, 4 and 5 have been vaccinated. Thus Hence, A similar argument to the derivation of (B.8), using the binomial theorem rather than the multinomial theorem, then yieldŝ ψ i (θ, s) = F(D(i) − 1, 1 −p i (s, ε),p i (s, ε), θ).
In the sequel we use, without comment, a similar notation for differences of other functions evaluated at 0 and p N .
Turning now to secondary individuals, note that a type-U individual has global degree and therefore number of next-generation neighbours distributed as D U . Such an individual is sampled with probability p S , in which case the type of each nextgeneration neighbour is distributed according top 3 j , otherwise it is unsampled and the type of each next-generation neighbour is distributed according top 6 j . Thuŝ ψ U (θ, s) = p S F D U , 1 −p 3 (s, ε),p 3 (s, ε), θ + (1 − p S )F D U , 1 −p 6 (s, ε),p 6 (s, ε), θ .  p(s, x),q(s, x), θ ). In the sequel, we use a similar notation without comment.

B.5 Local susceptibility set size
In this section we give formulae for the probability mass functions of the size of local susceptibility sets. The corresponding PGFs are then simple to calculate numerically. Consider first the model without vaccination and recall the definition of M (n) in Sect. 2.2.2. Then it follows directly from Ball (2000, Lemma 3.1) (see also Ball and Neal (2002, Lemma 3.1), which gives the same result but not in terms of Gontcharoff polynomials) that  A can be derived using the same argument as in the proof of Lemma 3.1 in Ball (2000). For brevity, we omit the details and just state the result.
For j = ( j U , j V ) ≥ 0, let q j = (q U j , q V j ), where q U j = φ I (λ L ( j U + bj V )) and q V j = φ I (aλ L ( j U + bj V )). Let 1 U = (1, 0) and 1 V = (0, 1). Then, for n ≥ 0 and A ∈ {U, V }, In the model with an all-or-nothing vaccine action we can calculate the mass function of M (n,v) A by conditioning on the number, v S say, of the v vaccinations that are successful. The mass function of M (n−v S ,v−v S ) A can then be calculated as above, but with q U j = q V j = φ I (λ L ( j U + j V )).