Heterogeneous network epidemics: real-time growth, variance and extinction of infection

Recent years have seen a large amount of interest in epidemics on networks as a way of representing the complex structure of contacts capable of spreading infections through the modern human population. The configuration model is a popular choice in theoretical studies since it combines the ability to specify the distribution of the number of contacts (degree) with analytical tractability. Here we consider the early real-time behaviour of the Markovian SIR epidemic model on a configuration model network using a multitype branching process. We find closed-form analytic expressions for the mean and variance of the number of infectious individuals as a function of time and the degree of the initially infected individual(s), and write down a system of differential equations for the probability of extinction by time t that are numerically fast compared to Monte Carlo simulation. We show that these quantities are all sensitive to the degree distribution—in particular we confirm that the mean prevalence of infection depends on the first two moments of the degree distribution and the variance in prevalence depends on the first three moments of the degree distribution. In contrast to most existing analytic approaches, the accuracy of these results does not depend on having a large number of infectious individuals, meaning that in the large population limit they would be asymptotically exact even for one initial infectious individual.


Background
Models of infectious disease transmission have, from relatively modest beginnings (e.g. Bailey 1957), developed a rich domain of applicability covering the whole spectrum of human, animal and plant pathogens, and informing the study of questions from viral evolution, through epidemiology of infectious diseases, to public health policy (see Heesterbeek et al. 2015). Increasingly, networks have been seen as a way of modelling the complex, heterogeneous patterns of contacts between individuals (Danon et al. 2011).
In theoretical studies, the configuration model has been a popular choice due to the ability to specify the number of contacts each individual has that are capable of spreading disease, while allowing for analytic results to be obtained (e.g. Molloy and Reed 1995;Newman 2002). Ball and Neal (2008) used an effective degree approach (which we describe in Sect. 2.2 below-cf. Lindquist et al. 2010) to derive a system of ordinary differential equations that describes the deterministic limit of the epidemic model as the population size N → ∞. A much simpler (equivalent) system of only 4 ordinary differential equations was obtained by Volz (2008) and subsequently shown by Miller (2011), Miller et al. (2012 to be essentially one-dimensional (the 4 ODEs were also shown by House and Keeling (2010) to be a special case of the much higher dimensional pair approximation model of Eames and Keeling (2002), in which the degree structure is explicit). Fully rigorous proofs of convergence in probability of the scaled stochastic model to the deterministic limit are given by Decreusefond et al. (2012), Bohman and Picollelli (2012), Barbour and Reinert (2013) and Janson et al. (2014). These works are primarily concerned with the temporal behaviour of proportions of the population in different epidemiological compartments (susceptible, infectious and removed) over the main body of a large epidemic. Here, we are also concerned with temporal behaviour, but focus on numbers infected early in the epidemic, including the possibility of early stochastic extinction.
In a recent paper, Graham and House (2014) use a pairwise approximation in conjunction with the central limit theorem for density dependent population processes (Ethier and Kurtz 1986, Chap. 11) to obtain a closed-form approximation to the mean and variance of prevalence in the linearised model which approximates the early asymptotic exponential growth phase of a Markovian SIR epidemic on a configuration network. In particular, they find that, under these approximations, the variance in disease prevalence is determined by the first three moments of the network degree distribution. In this paper, we use the effective degree approach of Ball and Neal (2008) to approximate the early stages of the epidemic by a continuous-time, multitype Markovian branching process, which is then analysed in detail. For t ≥ 0, let Z (t) denote the total number of individuals alive in this branching process at time t, so Z (t) approximates disease prevalence in the epidemic model during its early asymptotic exponential growth phase. Explicit closed-form expressions are derived for the mean and variance of Z (t), the covariance of Z (t) and Z (s) to give the behaviour over time, and also for the probability of extinction π(t) = P(Z (t) = 0). As in Graham and House (2014), the mean and variance in disease prevalence depends on the degree distribution only through its first two and three moments, respectively.
The results in Graham and House (2014) assume implicitly that the initial number of infectives is sufficiently large for the density dependent population process central limit theorem to yield a good approximation. In contrast, our results assume any arbitrary, but specified, initial number of infectives. The asymptotic distribution of types in the branching process, when it does not go extinct, is also available in closed-form and enables us to obtain a Gaussian process approximation, with explicit mean and covariance function, for the prevalence in the early asymptotic exponential growth phase of an SIR epidemic, with few initial infectives, which takes off and becomes established. We show that this approximation can be applied together with the methods of Ross et al. (2006) to estimate epidemiological parameters from early prevalence data of a simulated epidemic provided the first three moments of the degree distribution are known.

Outline of the paper
The paper is organised as follows. The configuration network model and a Markov SIR epidemic on that network are described in Sect. 2.1. The effective degree construction of this epidemic is outlined in Sect. 2.2. Approximation of the early stages of this epidemic by a branching process is outlined in Sect. 2.3, where conditions are given for the mean, variance and covariance functions of the number of infectives in the epidemic process to converge to the corresponding quantities of the approximating branching process as the population size tend to infinity. The representation of the approximating branching process as a continuous-time, multitype Markov branching process is outlined in Sect. 2.3 and described more explicitly in Sect. 2.4. The mean, variance and covariance functions of the total number of individuals alive in the branching process are considered in Sects. 3, 4 and 5, respectively. Explicit closed-form expressions are obtained for each of these quantities and for their limits as time t → ∞. The arguments in Sects. 3, 4 and 5 assume that underlying degree distribution has a maximum degree. In Sect. 6, we show that these expressions continue to hold in the unbounded degree setting, subject to the degree distribution satisfying suitable moment conditions. The probability that the branching process is extinct at time t is studied in Sect. 7. Closedform expressions for this probability, given the initial state of the branching process, are not available so asymptotic results as t → 0 and t → ∞ are considered.
The mean, variance and covariance functions derived in Sects. 3, 4 and 5 are unconditional, so they include realisations of the branching process which result in extinction. However, in the epidemic setting, we are often interested in analysing the behaviour of epidemics that take off and become established, which correspond to non-extinction of the branching process. In Sect. 8, we first derive the mean and variance of the total number of individuals alive in the branching process at time t, conditional upon the process having survived to time t; fully closed-form results are not available owing to the absence of a closed-form expression for the survival probability. We then consider realisations of the branching process which reach some specified size, K say, with time being set to zero the first time the total number of individuals alive is K . The results in Sect. 3 yield an explicit expression for the asymptotic distribution of types, given that the branching process does not go extinct, which, provided K is sufficiently large, enables the above branching process starting from K individuals to be approximated by a Gaussian process whose mean and covariance functions are determined explicitly. The theory is illustrated by numerical examples of both forward simulation and inference in Sect. 9 and some concluding comments are given in Sect. 10.
In general, we define notation as it is introduced; we also collect notation that is used in multiple sections in Table 1. 2 Model and approximating branching process

Model
We consider the spread of an SIR epidemic on a network of N individuals, labelled 1, 2, . . . , N , constructed using the configuration model as follows (see e.g. Newman 2002). Let D be a random variable which describes the degree of a typical individual and let p k = P(D = k) (k = 0, 1, . . .). Let D 1 , D 2 , . . . , D N be independent realisations of D and, for i = 1, 2, . . . , N , attach D i stubs (half-edges) to individual i. Pair up these stubs uniformly at random to form the edges in the network. If D 1 +D 2 +· · ·+D N is odd, there will be a left-over stub, which is ignored; the resulting network may have other 'defects' such as self-loops and multiple edges between pairs of individuals but, provided that D has finite variance, such imperfections become sparse in the network as N → ∞ (see e.g. Durrett 2007, Theorem 3.1.2). An alternative to the degrees D 1 , D 2 , . . . , D N being random is, for each N = 1, 2, · · · , to replace D = , where the degree sequences D (N ) (N = 1, 2, . . .) are prescribed and satisfy p where the Kronecker delta δ k, j is 1 if k = j and 0 otherwise (see e.g. Molloy and Reed 1995). The epidemic is defined as follows. Initially some individuals are infective and the remaining individuals are susceptible. Infective individuals have independent infectious periods, each having an exponential distribution with rate γ (and hence mean γ −1 ), after which they become recovered and play no further role in the epidemic. Throughout its infectious period, an infective contacts each of its susceptible neighbours in the network at the points of independent Poisson processes having rate τ , so the probability that a given infective contacts a given neighbour before the infective recovers is τ/(γ + τ ). Any contacted susceptible immediately becomes infective and may transmit the infection to any of its neighbouring susceptibles; i.e. there is no latent period. All the infectious periods and Poisson processes governing transmission of infection are mutually independent. The epidemic ends as soon as there is no infective present in the network. Ball and Neal (2008) introduced an 'effective-degree' construction of the above epidemic, in which the network is constructed as the epidemic progresses. The process starts with some individuals infective and the remaining individuals susceptible, but

The effective degree model
Random number of individuals of type i in the branching process at time t given initial type k π k (t) Probability that the branching process is extinct at time t given initial type k with none of the stubs paired up. For i = 1, 2, . . . , N , the effective degree of individual i is initially D i . Infected individuals transmit infection by pairing their stubs with stubs attached to susceptible individuals in the following fashion. An infected individual makes infectious contacts down its unpaired stubs independently at rate τ and is removed at rate γ . When an infective, individual i say, transmits infection down a stub that stub is paired with a stub (attached to individual j, say) chosen independently and uniformly at random from all the unpaired stubs, to form an edge. The effective degrees of individuals i and j are both reduced by 1. If i = j then the effective degree of individual i is reduced by 2 but this will not significantly affect the dynamics for large populations since the probability of it happening is O(N −1 ). If individual j is susceptible then it becomes infective and can transmit infection down any of its unattached stubs. As before, the epidemic ends as soon as there is no infective present. The network is then typically only partially constructed but that does not matter if interest is focussed on properties of the epidemic. In the original formulation of Ball and Neal (2008), when an infective recovers its unpaired stubs, if any, were paired with stubs chosen uniformly at random without replacement from the set of unpaired stubs but that is unnecessary; the stubs from such an infective can simply be left in the set of unpaired stubs.

Approximating multitype branching process
Suppose that the size N of the network is large and the initial number of infectives is small. Then during the early stages of an epidemic it is very likely that each time an infective individual transmits infection down a stub that stub is paired with a stub belonging to a susceptible individual. It follows that the early stages of such an epidemic can be approximated by a branching process in which each newly-infected individual has their "full" effective degree (i.e. their actual degree minus one for the stub that is paired with their infector). This approximation can be made fully rigorous by considering a sequence of epidemics, indexed by N , and using a coupling argument; see e.g. Ball and Neal (2008), which treats a more general model in which infective individuals also make contacts with individuals chosen uniformly at random from the population. Let E N denote the epidemic on a network of N individuals and let B denote the approximating branching process. Then following Ball and Neal (2008) is finite then the epidemics E 1 , E 2 . . . and the branching process B can be constructed on a common probability space so that, with probability one, over any finite time interval [0, t] the process of infectives in E N and the branching process B coincide for all sufficiently large N . The same result holds for the model with prescribed degree sequences provided that p (N ) As indicated in "Appendix 1", the branching process B is not an almost sure upperbound for the process of infectives in E N , so unlike in Theorem 3.1 of Ball and Donnelly (1995) which considers homogeneously mixing epidemics, one cannot simply use the dominated convergence theorem to deduce convergence of moments of the number of infectives in the epidemic process to corresponding moments of the branching process as N → ∞. If there is a maximum degree k max (i.e. p k = 0 for all k > k max , or p (N ) k = 0 for all k > k max and all N in the model with prescribed degrees) then, for all N , the process of infectives in E N is bounded above by a branching process in which each newly-infected individual has the maximun effective degree k max −1, so in that case the dominated convergence theorem can be used to prove convergence of moments. In "Appendix 1", we consider the case when there is no maximum degree and use uniform integrability arguments to determine sufficient conditions for the mean and variance of the number of infectives at any given time t ≥ 0, and the covariance of the number of infectives at any given times t, s ≥ 0, in the epidemic E N to converge to the corresponding mean, variance and covariance of the branching process B as N → ∞. Specifically, we prove that (i) in the model with prescribed degrees these moments converge if, in addition to the conditions given above, there exists δ > 0 such that μ (N ) and (ii) in the model with random degrees they converge if the momentgenerating function M D 2 (θ ) = E exp θ D 2 of D 2 is finite for some θ 0 > 0. Note that the latter condition implies that E [D α ] < ∞ for all α ≥ 0.
In this context, we note that, for the model with prescribed degrees, the weakest conditions obtained on the moments of the degree distribution for convergence of the scaled stochastic epidemic on to its deterministic limit are given by Janson et al. (2014), who require uniform boundedness of the second moment of D (N ) . However that paper, and the other related papers cited in the second paragraph of Sect. 1.1, (i) are concerned with the entire time course of the epidemic; (ii) assume that either the epidemic starts with a positive fraction of the population infected in the limit as N → ∞, or if that limiting fraction is zero then the convergence is for epidemics which take off and involves a random time translation describing when the epidemic becomes suitably established; and (iii) consider the evolution of the proportion of the population that is susceptible, infective or recovered. By contrast, this paper is concerned with epidemics initiated by few infectives and considers the number, rather than proportion, of infectives during the early phase of such an epidemic. Under the coupling mentioned above, in the limit as N → ∞, if an epidemic takes off then the duration of its early (exponentially growing) phase tends almost surely to infinity.
The limiting branching process may be described by a continuous-time multitype Markov branching process, with the type of an infective corresponding to its effective degree. LetD denote the (size-biased) degree of a typical neighbour of a typical individual in the network and letp k = P(D = k) (k = 1, 2, . . .).
, since when a stub is paired it is k times as likely to be paired with a stub from a given individual having degree k than it is with a stub from a given individual having degree 1. Under the branching process approximation, the effective degree of a newly infected individual is distributed according toD − 1, since one of that individual's stubs is 'used up' when it is infected. Note for future reference that μD = μ −1 D μ D 2 and, more generally, μ f (D) = μ −1 D μ D f (D) for any real-valued function f . (For a random variable, X say, we use μ X to denote its expectation E [X ]. Thus, for example,

Explicit form for the multitype branching process
We now assume that there is a maximum degree k max . We show in Sect. 6 that our results for moments of the branching process extend to the case of no maximum degree size, subject to suitable moment conditions on D. Thus the type space for the branching process is K = {0, 1, . . . , k max }. Note that only initial infectives can have type k max . For k ∈ K , an individual of type k dies if either its infectious period comes to an end or it transmits infection down one of its unattached stubs, whichever happens first. If the former happens first then the individual has no offspring, otherwise it has two offspring, namely an individual of type k − 1 and an individual whose type is distributed according toD − 1. Note that an individual of type 0 necessarily has no offspring when it dies. Thus, for k ∈ K , the lifetime of an individual of type k has an exponential distribution with rate (2.1) and when it dies its offspring is distributed as follows (recall thatp k max +1 = 0): (2. 2) The joint probability-generating function (PGF) for offspring of a type-k individual is therefore where s = (s k ). In general we will write v = (v k ) = (v 0 , v 1 , . . . , v k max ) for a column vector in R k max +1 , where denotes transpose. Verbally, we will call v 0 the 0th element of such a vector, v 1 the first element etc. For k ∈ K , let ∂ P k (s) be the column vector whose ith element is ∂ P k (s) ∂s i and let ∂ 2 P k (s) be the matrix whose (i, j)th element is ∂s i ∂s j . We note for future reference that, for i, j, k ∈ K , where 1 is the length-(k max + 1) column vector of ones. For t ≥ 0, let Z(t) = (Z i (t)), where Z i (t) denotes the number of individuals of type i alive at time t, and let Z (t) = Z 0 (t)+ Z 1 (t)+· · ·+ Z k max (t) = 1 Z(t) denote the total number individuals alive at time t. For k ∈ K , we use the notation i (t)), to denote a process starting with a single individual, whose type is k, at time 0 (i.e. where Z (k) denotes the total number of individuals at time t in such a process.

Behaviour of means
In the next three sections we consider the behaviour of the mean, variance and covariance function of the total number of individuals over time in the branching process which approximates the initial phase of an epidemic. For t ≥ 0 and i, j, k ∈ K , let where u k is a length-(k max + 1) column vector with kth element equal to 1 and other elements equal to 0. A standard argument using the Kolmogorov forward equation (see e.g. Dorman et al. 2004, Sect. 7 and recall thatp k max +1 = 0) then yields that where I denotes the (k max + 1) × (k max + 1) identity matrix and Ω = [Ω l,k ] is the (k max + 1) × (k max + 1) matrix with elements given by The solution to (3.1) is then straightforwardly given by We show in "Appendix 2" that the eigenvalues of Ω are We denote the dominant eigenvalue, λ 1 , by r , so (3.4) If r ≤ 0, the branching process {Z(t) : t ≥ 0} goes extinct almost surely. If r > 0, then r gives the asymptotic exponential growth rate of {Z (t) : t ≥ 0} (and also of The Perron-Frobenius theory implies that w 1 can be chosen so that all of its elements are positive and w 1 1 = 1. The left-eigenvector w 1 then yields a probability distribution which gives the asymptotic relative frequencies of the different types, as t → ∞, when {Z(t) : t ≥ 0} does not go extinct. Expanding (3.5) in components yields Let w(s) = k max l=0 s l w 1,l (s ≥ 0) denote the (probability-)generating function of w 1 . Multiplying (3.6) by s k and summing over k yields where fD −1 (s) = k max k=1p k s k−1 is the PGF ofD − 1 and μ W = k max k=0 kw 1,k is the mean of the distribution w 1 . Setting s = 1 in (3.7) and using (3.4) yields Note that (3.8) has a simple intuitive explanation. For large t, a typical individual gives birth at rate k max l=0 w 1,l lτ = τ μ W and dies completely (i.e. without producing any offspring) at rate γ , so the population growth rate r = τ μ W − γ and (3.8) follows using (3.4) . For It then follows, using the inversion formula which expresses the probability mass function of a non-negative integer-valued random variable in terms of its factorialmoments (see e.g. Daley and Vere-Jones 1988, p. 117), that (3.10) Observe that w 1,k max = 0 since only initial infectives can have type k max . Observe also that w 1 is determined just by the degree distribution of the network and is invariant to the epidemic parameters γ and τ . For t ≥ 0, let m(t) = (m (i) (t)) = (M(t)1) . Thus the kth element of m(t) contains the mean total population size at time t given that the process starts with a single individual whose type is k. We derive a simple expression for m(t). The following proposition is useful. Proof The second identity follows straightforwardly from the definition of the matrix exponential and the fact that y is a right eigenvector with eigenvalue c. For the first identity, (3.12) Let n = (0, 1, . . . , k max ) . Observe that Ω1 = τ n − γ 1 and Ωn = r n, (3.13) so using Proposition 1 with M = Ω, x = 1, y = n, a = −γ, b = τ and c = r , and recalling from (3.4) that r + γ = μD −2 τ , we have (3.14) Thus, While it is well known that asymptotically the mean prevalence grows exponentially with rate constant r , i.e. that prevalence ∝ e rt , these results allow us to see from (3.15) that the rate of convergence to this asymptotic behaviour is r + γ , and from (3.16) that the constant of proportionality is the degree of the initially infected individual divided by μD −2 . We also consider the relationship between the equations above and the diverse ODE approaches to the mean behaviour of the full network epidemic model. Miller and Kiss (2014) consider several such approaches; their notation can be related to ours by defining ( 3.17) Substituting (3.17) and (3.13) into (3.1) gives This pair of equations can be derived from various models considered by Miller and Kiss (2014, c.f. Sect. 3.4.1) assuming an initially small infectious population and negligible susceptible depletion. Therefore, our results suggest that the ODE approaches to mean behaviour do not require correction as the infectious population becomes extremely small, and the typical assumption that 1 I (t = 0) N for the ODE system to hold may be too conservative, with N 1 being all that is required.

Variance
The variance in infectious prevalence during the exponentially growing phase of an epidemic was considered in Graham and House (2014), but using the diffusion limit and an argument about the neighborhood around an infective. A branching process limit lets us be more explicit. For k ∈ K , let u k denote the length-(k max + 1) column vector whose element corresponding to type k is 1 and all other elements are 0, so See Dorman et al. (2004, Sect. 9), and also Athreya and Ney (1972, p. 203), for details. 1 For t ≥ 0 and k ∈ K , let v (k) (t) denote the variance of the total population size at time t given that the process starts with a single individual, whose type is k.
where we have used the first equation in (3.14) in deriving the last line. This quantity has an exact but rather complex closed-form solution, which we give below and derive in "Appendix 3".
There is a small error in the latter-in the expression for b (i) jk on p. 203 of Athreya and Ney (1972), In (4.6), if a denominator is zero then the expression is given by the limit as that denominator tends to zero. For example, if γ = 0 then I 2 (t) = te −γ t , and if γ = 2τ , the final term in I 5 (t) is replaced by te −2γ t . Equation (4.4) leads to a rather complex expression for v(t) in terms of elementary functions. However, its asymptotic form as t → ∞ is much simpler. Note that lim t→∞ e −2rt I k (t) = 0, for k = 1, 2, 3, lim t→∞ e −2rt I 4 (t) = 1 r and lim t→∞ e −2rt I 5 (t) = 1 2r +2τ +γ = 1 2μD −1 τ −γ . Substituting these limits into (4.4) and (4.5) yields (4.7) Note that both the asymptotic and exact expressions for v(t) depend on the degree distribution D only through its first three moments. It follows from (3.16) and (4.7) that, for k ∈ K , We note two features of these results. First, the Eqs. (4.4) and (4.5) involve many rates that are linear combinations of r , τ and γ , with the dominant being 2r and the subdominant being r . This leads to complex real-time behaviour as the system approaches its asymptotic limit. In the diffusion limit, only the dominant and subdominant rates are present, leading to the same overall rate of convergence r , but other rates are not present (Graham and House 2014). Secondly, the dependence of the variance on initial conditions is not simple proportionality, meaning that (4.7) contains terms proportional to both n and n 2 (unless γ = 0) and the right-hand side of (4.8) depends on k.

Covariance function
For t, s ≥ 0 and k ∈ K , let σ (k) (t, s) = cov Z (k) (t), Z (k) (s) denote the covariance of the total population sizes at times t and s in the branching process which approximates the early phase of an epidemic, given that the process starts with a single individual, whose type is k. We assume without loss of generality that t ≤ s; although this choice does not respect alphabetical order, the majority of results that follow take t as an argument rather than s, and are therefore more easily read as functions of time. Then The first term on the right hand side of (5.1) is zero, since using (3.2), the first equation in (3.14) and noting that 1 . This leads to an exact, closed-form expression for the covariance function in terms of elementary functions, which we state below and derive in "Appendix 4".
As at (4.6), an appropriate limit is taken if a denominator in (5.5) is zero. The covariance function takes a simple form in the limit as t and s → ∞. Note that lim t→∞ e −2rt I 6 (t) = 0, lim t→∞ e −2rt I 7 (t) = 1 r and lim t→∞ e −2rt I 8 (t) = 1 2μD −1 τ −γ . Substituting these limits into (5.3) yields that, for any s ≥ 0, It follows that, for k ∈ K and s > 0, where corr denotes correlation. This is not surprising since it is well known that where a.s.
−→ denotes almost sure convergence (i.e. convergence with probability 1) and W (k) is a non-negative random which satsifies W (k) = 0 if and only if the branching process goes extinct; see e.g. Athreya and Ney (1972, Theorem V.7.2).

Unbounded degree distributions
The above results have all assumed that there is a maximum degree k max . Suppose that is not the case, so the branching process has countably many types. For denote the total number individuals alive at time t. (For ease of notation we drop explict reference to the type of the initial individual.) For k max = 1, 2, . . ., let {Z(t, k max ) : t ≥ 0} denote the branching process derived from {Z(t) : t ≥ 0} by ignoring all individuals having type strictly greater than k max and any offspring of such individuals. For The process {Z(t, k max ) : t ≥ 0} behaves like the branching process described in Sect. 2.3 but with infection rate τ replaced by τ (k max ) = τ P D ≤ k max + 1 , and size-biased degree distributionD replaced byD(k max ), where The presence of k max + 1 rather than k max is because contacts with individuals having degree strictly greater than k max + 1 are ignored, as they yield individuals with effective degree (and hence type) strictly greater than k max . Now τ (k max ) → τ and E D (k max ) →E D as k max →∞, so the expression (3.15) for the mean total population size at time t continues to hold in the unbouded degree case, provided that E D < ∞, or equivalently that E D 2 < ∞. A similar argument shows that the expressions for the variance of Z (t) and the covariance of Z (t) and Z (s), derived in Sects. 4 and 5, respectively, continue to hold provided E D 2 < ∞, or equivalently E D 3 < ∞.

Probability of extinction
For t ≥ 0 and k ∈ K , let π k (t) = P Z (k) (t) = 0 be the probability that the branching process is extinct at time t given that it starts with one individual of type k. Then in general d dt π k (t) = −ω k π k (t) + ω k P k (π (t)), (7.1) where π(t) = (π i (t)). For our specific model, using (2.1) and (2.3), we have These equations are not amenable to closed-form solution. Note, however, that studies of time to extinction for network epidemics-e.g. Holme (2013)-have tended to be based on Monte Carlo methods, but (7.2) could provide a complementary approach that is numerically cheaper and more analytically tractable.
We will now consider three regimes in which asymptotic methods can be used to bound the real-time behaviour of the probabilities of extinction. In particular, we will see that early real-time behaviour is bounded by the death rates ω k , while late-time behaviour is bounded by the asymptotic real-time growth rate r provided r > −γ .

Late behaviour of the subcritical case
Suppose that r < 0, so the branching process is subcritical. For t ≥ 0 and k ∈ N 0 = {0, 1, . . .}, we will work with the probability of survival q k , so using (3.15) a simple upper bound for q k (t), valid also in the unbounded degree setting using the results in Sect. 6, is Note that μD < ∞ is a necessary condition for r < 0. Under the stronger condition that μD 2 < ∞, Windridge (2015) gives an exponential approximation, for large t, to a quantity closely related to q k (t).
He assumes that what we call type-0 individuals are dead. For k = 1, 2, . . ., letq Then, Windridge shows that there exists a constantĉ ∈ (0, 1] such that, for any a < min{τ, −r }, for any k ≥ 1. The constantĉ = lim t→∞ e −rtq 1 (t). Note that for some practical purposes,q k (t) may be of more interest than q k (t), since type-0 individuals are unable to transmit infection. In particular, in "Appendix 5" we sketch the argument that for the case where r > −γ an analogous result to (7.4) holds, i.e. , for k ≥ 1, q k (t) ∼ cke rt as t → ∞, where c = lim t→∞ e −rt q 1 (t) > 0.
(7.5) (For real-valued functions, f and g say, For the case where r < −γ (so, from (3.4), μD < 2) we show in "Appendix 5" that if μD 2 < ∞ then, for k ≥ 0, (7.6) Note that in this case the asymptotic behaviour of the survival probability q k (t) is independent of the infection rate τ . The case r < −γ could occur, for example, at the end of an epidemic where τ γ . Such an epidemic would consist primarily of transmission events at early times, with the late behaviour dominated by recovery events.

Late behaviour of the supercritical case
An approximation to q k (t) in the supercritical case (r > 0) can be obtained by exploiting the fact that a supercritical branching process conditioned on extinction is probabilistically equivalent to a subcritical branching process. For k ∈ N 0 , let T (k) = inf{t ≥ 0 : Z (k) (t) = 0} denote the extinction time of the branching process given that it starts with one individual of type k, where T (k) = ∞ if the branching process survives forever, and let π k = P T (k) < ∞ = π k (∞) be the probability that the branching process ultimately goes extinct. Then, Then it follows from Waugh (1958, Sect. 5), that {Z (k) (t) : t ≥ 0} is also a continuous-time multitype Markov branching process, in which the lifetime of a typical type-k individual has an exponential distribution with rate γ + τ k, as at (2.1), but when it dies its offspring is now distributed as follows: Suppose now that there is a maximum degree size k max . LetΩ = [Ω l,k ] be the (k max + 1) × (k max + 1) matrix with elements given bỹ Then, recalling (3.2), Letr denote the dominant eigenvalue ofΩ and note thatr < 0. For t ≥ 0 and k = 0, 1, . . . , k max , letq k (t) = P Z (k) (t) > 0 be the probability that the branching process {Z (k) (t) : t ≥ 0} is not extinct at time t given that it starts with one individual of type k. Then we expect that arguments similar to those used in the proof of Heinzmann (2009, Theorem 3.1), will show that there exists constantsc 1 ,c 2 , . . . ,c k max , satisfying 0 <c k < ∞, such for k = 1, 2, . . . , k max , for anyγ > 0. It then follows using (7.7) that Heinzmann (2009, Theorem 3.1), cannot be applied directly as it assumes that the matrixΩ is irreducible. We do not consider it here but we expect that Heinzmann's proof can be extended to our situation. If we assume that type-0 individuals are dead and only consider initial individuals of types 1, 2, . . . , k max − 1 (recall that only initial infectives can have type k max ) thenΩ becomes a (k max − 1) × (k max − 1) irreducible matrix. Heinzmann (2009, Theorem 3.1), then yields (7.10); note that now π k is replaced byπ The approximation (7.11) then holds with q k (t) and π k replaced byq k (t) andπ k , repsectively. Moreover, if we then letf 1 andb 1 be left and right eigenvectors ofΩ corresponding to the eigenvaluer , satisfyingf 1b 1 = 1, thenc k = u kb 1 h * , where h * = lim t→∞ e −rtf 1q (t) and q(t) = q i (t),q 2 (t), . . . ,q k max −1 (t) . Unfortunately, unlike with Ω, there do not appear to be closed-form expressions forr and its associated eigenvectors.

Early behaviour and matched asymptotics
Matched asymptotics is a standard technique in mathematical biology for writing down approximations to non-linear models that match known asymptotic behaviour (Murray 2002(Murray , 2003. While numerical solution of the ODEs (7.2) is efficient (as we have noted above) we now obtain a crude approximation to the full system that takes a closed form in terms of elementary functions.
First note that for k ∈ K , π k (0) = 0 and π k (t) is monotonically increasing with t. If we neglect the quadratic terms in π in (7.2) then, since these are only positive and increasing over time, we get a lower bound for the extinction probabilities: Note that in standard matched asymptotics, we would identify a small parameter from a ratio of rate constants as the basis for a systematic approximation scheme (Murray 1984); an alternative would be to approximate systematically by, for example, letting k , substituting into (7.2) and neglecting quadratic terms to give a linear set of equations for the next order of approximation. Here we consider only the lowest order approximation, and hence define an 'internal' solution for the survival probability as: Next, supposing we are in the subcritical case so that our result (7.5) holds. We will call this the 'external' solution (7.14) To fix the constant c, we match the late behaviour of the internal solution with the early behaviour of the external solution: Finally, the matched asymptotic solution is We compared this approximation as well as the internal and external solutions to the exact solution q k (t), with results shown in Fig. 1. As advertised, this is a relatively crude approximation, but is expressed in terms of elementary functions and satisfies known asymptotic limits.

Fluctuations in the emerging phase of a major outbreak
We now consider the early behaviour of supercritical epidemics that take off (i.e. do not go extinct early on but ultimately end owing to long-term depletion of susceptibles). The early stages of such an epidemic are approximated by the branching process defined in Sect. 2.3 but conditioned on non-extinction. It is straightforward to adapt the results on means and variances in Sects. 3 and 4 to condition on Z (k) (t) > 0. Elementary calculation shows that, for t ≥ 0 and k ∈ N 0 , Expressions for E Z (k) (t) Z (k) (t) > 0 and var Z (k) (t) Z (k) (t) > 0 then follow using (3.15) and (4.4), respectively, though there is no closed-form formula for q k (t) or π k (t). Note that, assuming r > 0 so π k < 1, which depends on the degree k of the initial infective. The diffusion approximation studied in Graham and House (2014) corresponds to the case where the number of infectives at time t = 0 is large. Return to the case where there is a maximum degree k max and suppose that the branching process does not go extinct. Then it follows from Athreya and Ney (1972, Theorem V.7.2), that, for any k ∈ K , where w 1 , given by (3.10), is a left eigenvector of Ω corresponding to the dominant eigenvalue λ 1 = r . Thus if an epidemic takes off and is still in its exponentiallly growing phase then the relative frequencies of the different types of infectives will be close to w 1 . Hence, we now assume that the initial number of individuals in the branching process Z (0) = K , where K is large, and that Z i (0) ≈ w 1,i K for i ∈ K . Label the initial individuals 1, 2, . . . , K . Then, for t ≥ 0, the total population size is Z (t) =Ẑ 1 (t) +Ẑ 2 (t) + . . . +Ẑ K (t), whereẐ i (t) denotes the total number of descentants of the initial individual i that are alive at time t, including i itself if it is still alive. Thus, , for all t ≥ 0, and, since the processes {Ẑ i (t) : t ≥ 0} (i = 1, 2, . . . , K ) are mutually independent, var (Z (t)) = K i=1 var Ẑ i (t) , for all t ≥ 0, and cov (Z (t), Z (s)) = K i=1 cov Ẑ i (t),Ẑ i (s) , for all t, s ≥ 0.
Assuming that the above approximation is exact, then, for t ≥ 0, it follows from (3.15) that and, after a little algebra, it follows from (4.4) that var (Z (t)) = K w 1 v(t) where σ 2 D = var(D) and I 9 (t) = e rt e rt − 1 r .
Comparison of (8.1) and (8.2) with the diffusion-based result of Graham and House (2014) in the limit of large t gives agreement when γ = 0 (i.e. for the SI model) but not for γ > 0. We believe that this is due to the fact that the diffusion model was only four dimensional, so a heuristic argument (given in Sect. 3.3 of Graham and House 2014, which gave results that were in good agreement with simulation) about the neighbourhood of an infective node had to be made, in contrast to the approach here that deals with each effective degree explicitly and so has k max + 1 dimensions. The argument about the neighbourhood around an infective tries to account for correlations caused by variability in recovery times, and so if γ → 0 then these correlations do not exist. Recent work by Constable and McKane (2014) considered the reduction of highdimensional stochastic models to low-dimensional diffusions and this approach was shown to be asymptotically exact for some systems in the small-noise limit by Parsons and Rogers (2015). It is an open question whether the argument in Sect. 3.3 of Graham and House (2014) could be justified rigorously by a similar argument, however we note that a branching-process approach makes fewer assumptions than a low-dimensional diffusion limit and so will be more generally applicable.
Considering further results that can be obtained, it follows from (5.3) and a little algebra that, for 0 ≤ t ≤ s, It seems plausible that these results extend to the case when there is no maximal degree but that would involve results for countably infinite matrices which we do not consider here.
Recall that the processes {Ẑ i (t) : t ≥ 0} (i = 1, 2, . . . , K ) are mutually independent. It follows using the central limit theorem that, for sufficiently large K , the process {Z (t) : t ≥ 0}, which approximates the prevalence of infection during the early growth of an epidemic, is approximately Gaussian with mean function given by (8.1) and covariance function given by (8.3).

Numerical examples 9.1 Forward simulations
We conducted a series of numerical experiments to provide specific examples of the general results presented here. M = 10 4 Monte Carlo simulations were performed on three different configuration model networks, each of size N = 10 4 , and with the degree distributions shown in Fig. 2 Row (a). (Note that each Monte Carlo simulation consisted of first simulating a network and then simulating a single epidemic on it.) Two different scenarios were considered. In the first -most commonly considered in the literature when simulations are compared to analytic approachestime = 0 was defined as the first point when prevalence is at a given level, K . In our simulations we took K = 100, but in general K should take a value where the probability of extinction has become negligible, but the depletion of the susceptible population has not had a significant effect on the epidemic dynamics. In the second, each epidemic was started from one node, picked uniformly at random, so the probability of extinction played a major role. This scenario is less commonly considered when comparing real-time simulated epidemics to differential equation models because the latter are typically designed to hold when the epidemic is already established.
Since analytic results for the probabilities of extinction π k (t) are not available, the branching process results required numerical integration of ordinary differential equations (in our case using Runge-Kutta methods). We stress that the computational effort required to do this is much less than that involved in performing Monte Carlo simulations, and has the benefit of not depending on N .
The results for the first approach (restarting time at the first time prevalence reaches 100) are given in Fig. 2. Row (b) shows sample trajectories (which all agree on prevalence at time 0). Row (c) shows the simulated mean after time 0 on a logarithmic scale, which initially grows at the constant rate predicted by the branching process model, and then reduces as the susceptible population is depleted. Row (d) shows the variance, which has not converged to its asymptotic growth regime by the time prevalence is equal to 100, an effect that is captured by the branching process model. The variance does not take its largest value at the peak prevalence, but instead has local maxima before and after the peak. Figure 3 shows the results for the second approach in which there is one randomly chosen initial infective at time 0. Row (a) shows some sample trajectories. Row (b) shows the extinction probabilities, which are accurately captured in the branching process model until very close to the end of the epidemic when prevalence is low and extinction becomes likely again. Row (c) shows the mean, which does not start growing at a constant rate with the convergence rate accurately captured in the branching process model. Row (d) shows the variance and convergence onto its asymptotic value; in this case there is a single maximum just before the peak in prevalence. Another important point is that while mean numbers infected are comparable between Figs. 2 and 3, the variability in the time for the epidemic to take off, as well as the contribution from extinct epidemics, makes the real-time variance in Fig. 3 orders of magnitude larger than in Fig. 2.

Statistical inference
In order to demonstrate the potential use of the real-time effective degree branching process model for statistical inference, we carried out a simulation study. Here we simulated one epidemic that took off on a configuration model network of size N = 10 6 with degree distribution D (3) as in the right-hand column of Fig. 2 (d (3 9 = 1/24) and true rates τ 0 = 2, γ 0 = 1. Letting I (t) be the prevalence of infection in the network model, we set time t = 0 when I (t) = 100 for the first time and make 40 evenly-spaced observations (with gap δt = 0.05 between each) of I (t) up to t end = 2.
We then define an approximate likelihood based on the methods of Ross et al. (2006), in which a Gaussian process approximation based on known first and second moments is used, which will be more accurate for larger N , larger I (0) and smaller δt. There should be, however, no a priori obstacle to fitting our model to data on smaller populations even with incomplete data, for example by using Markov chain Monte Carlo methods to perform multiple imputation as suggested by O'Neill and Roberts (1999).
Explicitly, we let the probability density function f for sequential observations be given by  Figure 4 shows the first quarter of the simulated epidemic together with the Gaussian process approximation, as well as likelihood surfaces for the correct and misspecified degree distributions. Performing maximum likelihood estimation using MATLAB's mle() function allows us to obtain point estimates for parametersτ andγ , as well as asymptotic 95% confidence intervals and the parameter covariance matrixĈ from the inverse Hessian. We quote results to 2 significant figures; the asymptotic approximations also give very slightly negative lower confidence intervals forγ which we round up to 0. For the correct degree distribution we obtain  This shows that knowledge of the correct distribution allows both τ and γ to be estimated; although as would be expected the early asymptotic growth rate r is much more closely constrained by simulated data than other directions in parameter space. It also shows that misspecification of the degree distribution allows r to be identified, but biases the estimate of, in this case, τ .
10 Concluding comments

Summary of results
In this paper, we have provided explicit closed-form expressions for the real-time mean, variance and covariance function for disease prevalence during the early stages of the Markovian SIR model on a configuration model network, as well as deriving differential equations for the probabilities of extinction over time that are relatively numerically cheap to solve. These allow for a more explicit treatment of e.g. rates of convergence to asymptotic behaviour than has previously been possible.

Future directions
We believe that the methods of real-time, multitype branching processes could be more widely applied in infectious disease epidemiology, since they provide results concerning extinction and variance that are not available using deterministic differential equation models. For example, the effective-degree based methodology presented here may be extended to include degree correlation (e.g. in the sense of Newman 2002) by keeping track of the actual, as well as effective, degrees of individuals, though the type space becomes larger and explicit analytic results are unlikely to be available. We note that there is increasing interest in the eradication of infections (e.g. Klepac et al. 2013) and that arguably calculating extinction probabilities and variability in outbreak sizes is of equal or greater importance in this context than calculation of mean behaviour. The explicit closed-form expressions derived have the potential to enhance statistical work on epidemic prevalence curves. In particular, many empirically observed epidemics of human pathogens exhibit more variability around the trend than simple models would predict (see Black et al. 2014, particularly Section 1, for a discussion of this), which can bias parameter estimation if an insufficiently variable model is used. Application of our methods to real data would be an interesting extension of our work.
The possibility of a more general non-Markovian stochastic epidemic being approximated by an appropriate real-time branching process is raised by the results of Barbour and Reinert (2013) and it would be interesting to investigate whether our analysis could be adapted to this scenario.
Finally, there is the question of low-dimensional PGF-based modelling of the whole network epidemic that incorporates stochasticity accurately. For example, the work of Miller (2014) considered accounting for early fluctuations and Graham and House (2014) considered a diffusion approximation once early fluctuations were negligible, but the results presented here as well as those of Barbour and Reinert (2013) suggest that a more unified low-dimensional stochastic approach that explicitly models early fluctuations may be possible.
Josh Ross for helpful comments on this manuscript. We would also like to thank the referees and associate editor for their constructive comments which have improved the presentation of the paper.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix 1: Convergence of moments
We determine sufficient conditions for the first two moments of the number of infectives in the epidemic E N among a population of N individuals to converge to the corresponding moments of the limiting branching process B. For ease of exposition, in E N we assume that at time t = 0 there is one infective and the remaining N − 1 individuals are susceptible. The initial infective is chosen by sampling a stub uniformly at random from all stubs used to form the network, with the individual attached to that stub being the initial infective. The arguments are easily extended to other choices of initial infective(s).
In the independent and identically distributed (i.i.d.) degree case, we assume that a single sequence D 1 , D 2 , . . . of i.i.d. copies of D is used to construct a sequence of epidemics (E N ), where, for N = 1, 2, . . . , the epidemic E N is constructed using D 1 , D 2 , . . . , D N . In the prescribed degree case (see Sect. 2.1), recall that p (N ) k (k = 0, 1, . . . ) denotes the empirical degree distribution in the epidemic E N and, for f : k). For N = 1, 2, . . . and t ≥ 0, let Y N (t) be the number of infectives in E N at time t and let Z (t) be the number of individuals in the limiting branching process B. Then arguing as in the proof of (Ball and Neal 2008, Theorem A.1), shows that in the i.i.d. degree case, if μ D < ∞ then the sequence of epidemics (E N ) and the limiting branching process B can be constructed on a common probability space so that, with probability one, for any t > 0, Y N (u) and Z (u) coincide for all u ∈ [0, t] for all sufficiently large N . The same conclusion holds in the prescribed degree case provided p (N ) k → p k (k = 0, 1, . . .) and μ (N ) D → μ D as N → ∞, where ∞ k=0 p k = 1 and μ D < ∞. Thus, under these conditions, in both cases, for any t ≥ 0, Y N (t) converges almost surely to Z (t) as N → ∞. We obtain further conditions, under which, for fixed t ≥ 0, the sequence Y N (t) 2 is uniformly integrable, which then (e.g. Grimmett and Stirzaker 2001, Chap. 7, Sect. 10) Z (t)). To show that Y N (t) 2 is uniformly integrable it is sufficient to show that the sequence E Y N (t) 2+δ is bounded above for some δ > 0.
For ease of exposition, we use notation from the i.i.d. degree case. The construction in Ball and Neal (2008) involves for each N constructing a realisation of a branching process, B N say, which is defined analagously to B but using the empirical distribution of D 1 , D 2 , . . . , D N rather than the distribution of D. In B N , for each birth a stub is chosen independently and uniformly from all the D 1 + D 2 + · · · + D N stubs and the degree of the individual that the chosen stub belongs to gives the degree of the individual born at that birth. The process of infectives in E N follows B N except when (i) a sampled stub has previously been chosen, in which case stubs are resampled until one that has not been chosen previously is obtained, or (ii) a sampled stub has not been chosen previously but is attached to an individual that has already been infected, in which case the corresponding birth and all descendants of that individual in B N are ignored in E N . Note that (i) implies that B N need not be an almost sure upper bound for the process of infectives in E N . The branching processes B N (N = 1, 2, . . . ) and B are coupled so that with probability one, for any fixed t > 0, B N and B coincide over By Markov's inequality, (10.1) We now bound E T N , (t) 2+δ , for δ ∈ (0, 1). Note that this moment is smaller than the corresponding moment for the branching process in which γ = 0 and individuals retain their original effective degree throughout their lifetime. Moreover, by rescaling the time axis, we can assume without loss of generality that τ = 1. Thus consider a multitype Markov birth process, with types 0, 1, . . . , J , in which an individual of type i gives birth at rate i and the types of successive births are i.i.d. with probability mass functionp j ( j = 0, 1, . . . , J ). For i = 0, 1, . . . , J , letT i (t) denote the total number of individuals alive in this process at time t given that at time 0 there is one indivdiual, whose type is i. For α > 0 and t ≥ 0, letμ (α) 0, 1, . . . , J ) and letμ (α) It is possible using a backward argument to derive explicit expressions forμ (k) i (t) for k = 1, 2, . . . , though the algebra soon becomes very tedious. As our aim is to boundμ (2+δ) (t), we simply derive bounds forμ (1) i (t),μ (2) i (t) (i = 0, 1, . . . , J ) and finallyμ (2+δ) (t). Moreover, our bounds are deliberately coarse to facilitate easy application to E T N , (t) 2+δ . For f : Multiplying (10.2) byp i and summing over i = 0, 1, . . . , J , yields Substituting this bound into (10.5) and noting that the right-hand side of (10.5) is increasing in t yieldŝ where g(t,μ D 2 ) = 1 + t 1 + 2(1 + t)(1 +μ D 2 t) . (10.7) Let δ ∈ (0, 1). For i = 0, 1, . . . , J , the backward equation forμ (10.8) whereT i (t) andT (t) are independent andT (t) is distributed as a mixture of T 0 (t),T 1 (t), . . . ,T J (t) with mixing probabilitiesp 0 ,p 1 , . . . ,p J . Let a and b be nonnegative real numbers. Application of the mean value theorem yields that Setting a =T i (t) and b =T (t) in this inequality, taking expectations exploiting the independence ofT i (t) andT (t), substituting into (10.8) and noting that 2 δ (2 + δ) < 6 gives by Jensen's inequality, sô using (10.6) and noting that g(t,μ D 2 ) ≥ 1. Substituting (10.10) into (10.9), multiplying byp i and summing over i = 0, 1, . . . , J , yields Hence,μ We return to epidemics on networks and introduce some more notation. For ∈ (0, 1), let k 0 ( ) = min{k : p 0 + p 1 + . . . , p k > } and, for k = 0, 1, . . . , let As above, we assume without loss of generality that τ = 1.
Assume first that p 0 = 0, so μ D (N ) (ε) ≥ 1 almost surely. Now A similar argument to (10.13) yields μ (D (N ) The above arguments show that there exists h 2 (ε, t) < ∞ such that h 2 (N , N (ε, t). Thus, recalling (10.12), the sequence E Y N (t) 2+δ is bounded and, for any α Suppose now that p 0 > 0. Then D st ≤ D , where D has distribution given by P(D = 1) = p 0 + p 1 and P(D = k) = p k (k = 2, 3, . . . ). It follows that for fixed population size N , fixed ε ∈ (0, 1) and any t ≥ 0, is the total progeny at time t of the branching process defined analagously to B N ,ε but using the empirical distribution of D 1 , D 2 , . . . , D N , where D 1 , D 2 , . . . , D N are i.i.d. copies of D . Further, M D 2 (θ 0 ) < ∞ if M D 2 (θ 0 ) < ∞ and the above argument can be used to show that the sequence E Y N (t) 2+δ is bounded.
The above argument is easily adapted to show that in the prescribed degree case under the weaker condition that there exists δ > 0 such that μ D 2+δ < ∞ and μ (D (N ) ) 2+δ → μ D 2+δ as N → ∞. Moreover, although we have not worked through all of the details, it seems likely that the argument can also be adapted to prove that, for any α > 1, if there exists δ > 1 such that μ D α+δ < ∞ and μ (D (N ) for all t ≥ 0. Again we have not worked through all of the details but in the i.i.d. degree case it seems likely that the above condition on M D 2 (θ ) quarantees that for all α, t ≥ 0. Finally, in the i.i.d. degree case it seems likely that weaker conditions will suffice when the limiting branching process B is subcritical, i.e. when r < 0, as in that case the exponential functions appearing in E T N ,ε (t) 2+δ , prior to taking expectations, will all have negative arguments.

Appendix 5: Late behaviour of subcritical survival probabilities
To bound the late probabilities of survival in the subcritical case, first note that due to the inability of type-0 individuals to transmit we have q 0 (t) = e −γ t . (10.24) Recalling that q k (t) = 1−π k (t), it follows from (7.2), with k max = ∞, that for k > 0, The limit lim t→∞ e −rt q 1 (t) therefore exists if r is within the region of convergence of the Laplace transform of h. Considering h 1 (t) in (10.27), since q k (t) ∈ [0, 1], for all k, we have thatp 1 ≤ h 1 (t) ≤ τ (1 +p 1 )e −γ t , so the Laplace transform of h 1 converges by the assumption that r > −γ . Now considering h 2 , we follow Windridge (2015) and consider an initial individual with effective degree k, and stubs labelled by integer i = 1, 2, . . . , k. Let T i be the time that the individual and its progeny through stub i exist, and let T be the lifetime of the branching process. Then, using the Bonferroni inequalities as in Windridge (2015), for k ≥ 1, {T i > t}, q k (t) ≤ kq 1 (t) and q k (t) ≥ kq 1 (t) − k 2 P(T 1 > t, T 2 > t).
We then recall that P(R > t) = q 0 (t) = e −γ t , after which the argument follows closely that of Windridge (2015) (as in the derivation of (10.36) below) and we find that lim t→∞ ∞ 0 e −rt P(T 1 > t, T 2 > t)dt < ∞. (10.30) Thus the Laplace transform of h 2 converges at r , whence lim t→∞ e −rt q 1 (t) is equal to a finite constant, c say. Note that c is strictly positive since c ≥ lim t→∞ e −rtq 1 (t) = c > 0. Further, (10.30) implies that lim t→∞ e −rt P(T 1 > t, T 2 > t) = 0, and it follows from the two inequalities in (10.28) that, lim t→∞ e −rt q k (t) = kc, for k ≥ 0, proving (7.5).