Lessons from previous predictions of HIV/AIDS in the United States and Japan: epidemiologic models and policy formulation

This paper critically discusses two previous studies concerned with predictions of HIV/AIDS in the United States and Japan during the early 1990s. Although the US study applied a historical theory, assuming a normal distribution for the epidemic curve, the underlying infection process was not taken into account. In the Japan case, the true HIV incidence was estimated using the coverage ratio of previously diagnosed/undiagnosed HIV infections among AIDS cases, the assumptions of which were not supported by a firm theoretical understanding. At least partly because they failed to account for the underlying mechanisms of the disease and its transmission, both studies failed to yield appropriate predictions of future AIDS incidence. Further, in the Japan case, the importance of consistent surveillance data was not sufficiently emphasized or openly discussed and, because of this, revision of the AIDS reporting system has made it difficult to determine the total number of AIDS cases and to apply a backcalculation method. Other widely accepted approaches can also fail to provide perfect predictions. Nevertheless, policy could be misdirected if we ignore the important assumptions, methods and input data required to answer specific questions. The present paper highlights the need for appropriate assessment of specific modeling purposes and explicit listing of essential information, as well as possible solutions, to aid relevant policy formulation.


Additional file
Supplement to: Nishiura H: Lessons from previous predictions of HIV/AIDS in the United States and Japan: Epidemiologic model and policy formulation. In: Epidemiologic Perspectives and Innovations.

Backcalculation: Reconstruction of the HIV epidemic
Backcalculation is a useful technique for estimating HIV prevalence and obtaining short-term projections of AIDS incidence from previous AIDS incidence data [1][2][3]. Whereas the number of AIDS cases is thought to be documented relatively accurately in industrialized countries, asymptomatic HIV infections are seldom noticed unless the infected individual undertakes a voluntary blood test or develops the disease. The long incubation period of HIV infection enables assessment of the extent of the epidemic during its course. Backcalculation uses the AIDS incidence at time $t$, $a(t)$, and the incubation period distribution at time $\tau$ after infection, $\omega(\tau)$, to reconstruct the number of HIV infections over time. Assuming that documentation of diagnosed AIDS cases is not significantly delayed and, in the simplest setting, that the impact of antiretroviral therapy on the length of the incubation period is negligible, the fundamental relationship is given by the following convolution equation:

$$a(t) = \int_{0}^{t} h(u)\,\omega(t-u)\,du \tag{1}$$

where $h(u)$ is the number of HIV infections at time $u$. The basic idea of backcalculation is to obtain $h(u)$ using $a(t)$ and $\omega(t-u)$. Here, to ease understanding of the deconvolution procedure, eqn. (1) is considered in discrete time [4,5]. Since surveillance-based AIDS incidence data are obtained for a certain interval, $t$ (e.g., every 2 or 3 months), the following equation is obtained:

$$a_t = \sum_{u=1}^{t} h_u\,\omega_{t-u} \tag{2}$$

Assuming that $h_t$ is generated by a nonhomogeneous Poisson process, the number of AIDS cases in each interval is an independent Poisson variate with mean $a_t$. Thus, the likelihood, which is needed to estimate HIV infections and the parameters of the incubation period distribution, is proportional to:

$$\prod_{t=1}^{T} a_t^{\,r_t}\exp(-a_t) \tag{3}$$

where $r_t$ is the observed number of AIDS cases at time $t$, $a_t$ is given by eqn. (2), and $T$ is the most recent time of observation. The shape of the curve of HIV infections, $h_t$, is usually modeled parametrically or non-parametrically.
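The discrete convolution and Poisson likelihood described above can be illustrated with a minimal sketch. It assumes zero-based interval indices and uses an EM (expectation-maximization) update as one common nonparametric way of estimating the infection curve; this is not necessarily the estimation method used in [4,5]. The function names, the Weibull-shaped incubation distribution and all numbers are hypothetical and serve only as illustration.

```python
import numpy as np


def expected_aids(h, omega):
    """a_t = sum_{u<=t} h_u * omega_{t-u} (discrete convolution, zero-based t)."""
    h = np.asarray(h, dtype=float)
    omega = np.asarray(omega, dtype=float)
    return np.convolve(h, omega)[: len(h)]


def poisson_loglik(r, a):
    """Poisson log-likelihood up to an additive constant: sum_t (r_t ln a_t - a_t)."""
    r = np.asarray(r, dtype=float)
    return float(np.sum(r * np.log(a) - a))


def em_step(h, omega, r):
    """One EM update of the infection curve h given observed AIDS counts r."""
    h = np.asarray(h, dtype=float)
    omega = np.asarray(omega, dtype=float)
    r = np.asarray(r, dtype=float)
    T = len(h)
    ratio = r / expected_aids(h, omega)
    # w_u = sum_{t>=u} (r_t / a_t) * omega_{t-u}
    w = np.array([np.dot(ratio[u:], omega[: T - u]) for u in range(T)])
    # F_u = probability that an infection in interval u develops AIDS by interval T-1
    F = np.cumsum(omega[:T])[::-1]
    return h * w / F


# Illustrative (made-up) inputs: 20 intervals, Weibull-shaped incubation period.
T = 20
d = np.arange(T + 1)
omega = np.diff(1.0 - np.exp(-((d / 10.0) ** 2.5)))  # omega_d for d = 0..T-1
h_true = 100.0 * np.exp(-0.5 * ((np.arange(T) - 8) / 3.0) ** 2)
r = expected_aids(h_true, omega)  # noise-free "observed" AIDS counts

h = np.full(T, r.sum() / T)  # flat starting guess
for _ in range(200):
    h = em_step(h, omega, r)
print(poisson_loglik(r, expected_aids(h, omega)))
```

Each EM update cannot decrease the Poisson log-likelihood, which makes it a convenient way to see the deconvolution at work; in practice a smoothing step or a parametric form for the infection curve is usually added to stabilize the estimates.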
The main sources of uncertainty are the incubation period distribution, the shape of the HIV infection curve, and the AIDS incidence data [6]. Short-term predictions are obtained from the estimated number of HIV-infected individuals who have not yet developed AIDS.
However, it should be noted that backcalculation of this kind provides no information about future infection rates and little information about recent infection rates [7].
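The short-term projection and the weak information on recent infections can be sketched as follows. This is a minimal illustration with zero-based intervals, not the procedure of any cited study; the function names, the Weibull-shaped incubation distribution and the flat infection curve are assumptions made up for the example.

```python
import numpy as np


def fraction_observed(omega, T):
    """Share of infections from interval u = 0..T-1 that has developed AIDS by interval T-1."""
    omega = np.asarray(omega, dtype=float)
    return np.cumsum(omega[:T])[::-1]


def project_aids(h, omega, k):
    """Expected AIDS cases in intervals T..T+k-1 from infections in intervals 0..T-1 only."""
    h = np.asarray(h, dtype=float)
    omega = np.asarray(omega, dtype=float)
    T = len(h)
    if len(omega) < T + k:
        raise ValueError("omega must cover delays 0..T+k-1")
    return np.convolve(h, omega)[T : T + k]


# Illustrative (made-up) inputs, not estimates from any surveillance data.
T, k = 12, 3
d = np.arange(T + k + 1)
omega = np.diff(1.0 - np.exp(-((d / 10.0) ** 2.5)))  # Weibull-shaped delays, length T + k
h = np.full(T, 50.0)  # hypothetical reconstructed infections per interval

print(project_aids(h, omega, k))    # ignores infections occurring after interval T-1
print(fraction_observed(omega, T))  # very small for the most recent intervals
```

Because only a small share of recently infected individuals has progressed to AIDS by the last observation, the AIDS data constrain the most recent values of the infection curve only weakly, and the projection says nothing about infections that have not yet occurred.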

The prediction method employed in the United States case study
In the United States case study, future AIDS incidence was projected from the annual number of AIDS diagnoses from 1981 to 1987 using the second ratio of incidence (see Table S1). Here, the first ratio is the ratio of the AIDS incidence in one year to that in the preceding year, and the second ratio is the ratio of successive first ratios. That is, assuming empirically that the second ratio of AIDS incidence is constant, so that the epidemic curve follows a normal curve, the subsequent first ratios and future AIDS incidence can be obtained arithmetically [11]. In the original study [12], the fixed second ratio was set at 0.8647, the mean ratio from 1985 to 1987; Table 1 also follows this estimate.
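The arithmetic can be sketched as below, assuming that each new first ratio equals the previous first ratio multiplied by the constant second ratio. The second ratio 0.8647 is the value quoted above; the two starting incidences and the function name are hypothetical and are not taken from Table S1.

```python
def project_second_ratio(a_prev, a_last, C, steps):
    """Project incidence forward assuming a constant second ratio C."""
    first_ratio = a_last / a_prev
    current = a_last
    projected = []
    for _ in range(steps):
        first_ratio *= C        # next first ratio = C * current first ratio
        current *= first_ratio  # next incidence = current incidence * next first ratio
        projected.append(current)
    return projected


# Hypothetical incidences for two consecutive years (illustration only).
for year_ahead, a in enumerate(project_second_ratio(1000.0, 1500.0, 0.8647, 6), start=1):
    print(year_ahead, round(a))
```

With a second ratio below 1, the first ratio shrinks every year, so the projected incidence rises, peaks once the first ratio falls below 1, and then declines.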
Farr's law was formalized in detail by John Brownlee (1868-1927) [13] who, based on Farr's observational notes on the temporal pattern of smallpox deaths, showed that epidemics in general tend to follow a symmetric bell-shaped curve that can be approximated by a normal distribution [14,15]. Brownlee's major aim in extending this theory was to investigate further the time-series decline of transmission potential (i.e., infectiousness) during the course of an epidemic, which he failed to do (excellent historical reviews of Brownlee's efforts are given elsewhere [16,17]).
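Assuming that the second ratio of AIDS incidence, which is computed from the annual incidence, is constant over time, the following fundamental equation is obtained:

$$\frac{a_{t+1}/a_{t}}{a_{t}/a_{t-1}} = C \tag{4}$$

where $a_t$ and $C$ are the AIDS incidence at time-interval $t$ and the assumed constant second ratio, respectively. Brownlee found that eqn. (4) can also be described by the second differences of the following logarithms:

$$\ln a_{t+1} - 2\ln a_{t} + \ln a_{t-1} = \ln C \tag{5}$$

When $\ln a = A$, the above equation can be expressed, treating time as continuous, by:

$$\frac{d^{2}A}{dt^{2}} = \ln C \tag{6}$$

Thus, the solution to eqn. (6) can be obtained from the integral:

$$a(t) = \exp\left(\frac{\ln C}{2}\,t^{2} + c_{1}t + c_{0}\right) \tag{7}$$

where $c_0$ and $c_1$ are constants of integration. In other words, because $\ln C < 0$ when $C < 1$, a negative second-degree exponential function, which describes a type of normal distribution, is obtained [13,16,17].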
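The link between a constant second ratio and a normal curve can be checked numerically. The sketch below assumes the discrete recursion with first ratio `rho1` between intervals 0 and 1, which gives the closed form a_t = a_0 * rho1^t * C^(t(t-1)/2), and converts it into the peak time, standard deviation and peak height of the implied normal curve. The function names and starting values are hypothetical.

```python
import math


def closed_form(a0, rho1, C, t):
    """Incidence at interval t under a constant second ratio C."""
    return a0 * rho1 ** t * C ** (t * (t - 1) / 2.0)


def gaussian_from_second_ratio(a0, rho1, C):
    """Return (peak_time, sigma, peak_height) of the implied normal curve, for 0 < C < 1."""
    k = -math.log(C)              # ln a_t = ln a0 + b*t - (k/2)*t**2
    b = math.log(rho1) + k / 2.0  # linear coefficient of ln a_t
    peak_time = b / k
    sigma = math.sqrt(1.0 / k)
    peak_height = a0 * math.exp(b * b / (2.0 * k))
    return peak_time, sigma, peak_height


# Hypothetical starting values; C is the fixed second ratio quoted in the text.
t_peak, sigma, height = gaussian_from_second_ratio(1000.0, 1.5, 0.8647)
for t in range(8):
    normal_curve = height * math.exp(-((t - t_peak) ** 2) / (2.0 * sigma ** 2))
    print(t, round(closed_form(1000.0, 1.5, 0.8647, t), 1), round(normal_curve, 1))
```

The two printed columns coincide, which shows that fixing the second ratio fixes the width of the curve in advance: the standard deviation depends on C alone, whatever the data look like.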

A theoretical flaw of the AIDS projection in the United States
The most significant flaw lies in the underlying theory. Provided that the assumption of a normal distribution had been empirically confirmed for other infectious diseases [18] and that functions similar to those given in eqn. (7) [...] Brownlee could not clarify the mechanism behind the normal curve in terms of the underlying infection process [11,15,16], and consequently, Bregman and Langmuir adopted an assumption that had not been justified in an explicit bottom-up fashion [12].
Integrating eqn. (9) from 0 to t, with R(0) = 0, yields eqn. (10). Assuming that the total number of individuals in the population is constant, N = S(t) + I(t) + R(t) for any t, the third subequation of eqns. (8) can be rewritten as eqn. (11). Approximating eqn. (11) by Taylor series expansion yields eqn. (12), which can be solved by standard methods [22] to give eqn. (13). The epidemic curve is therefore given by eqn. (14), which generates a symmetrical bell-shaped curve.
In this way, the epidemic curve obtained using the Kermack and McKendrick model has a symmetric shape, reflecting the decline in susceptible individuals (when β is constant over time). This indicates that the epidemiologic process underlying epidemic curves of the normal family (i.e., those which can be generalized with a type of normal distribution) partly originates from these non-linear dynamics. Bregman and Langmuir's study on the projection of HIV/AIDS in the United States [12], which simply applied the original historical theory to the data, did not validate the underlying intrinsic transmission dynamics using firm knowledge. In other words, they did not take into account the reason behind the assumption of a normal curve.
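For reference, the steps above can be sketched in textbook notation for the Kermack and McKendrick model. The notation is an assumption rather than the original: mass-action transmission with coefficient β is assumed, ν denotes the removal rate, and q and φ are shorthand constants introduced only for this sketch.

```latex
\begin{align*}
\frac{dS}{dt} &= -\beta S I, \qquad
\frac{dI}{dt} = \beta S I - \nu I, \qquad
\frac{dR}{dt} = \nu I, \\
\frac{dS}{dR} &= -\frac{\beta}{\nu} S
\;\Longrightarrow\;
S(t) = S(0)\exp\!\left(-\frac{\beta}{\nu} R(t)\right), \qquad R(0) = 0, \\
\frac{dR}{dt} &= \nu\left[N - R - S(0)\exp\!\left(-\frac{\beta}{\nu} R\right)\right] \\
&\approx \nu\left[N - S(0) + \left(\frac{\beta S(0)}{\nu} - 1\right) R
   - \frac{\beta^{2} S(0)}{2\nu^{2}} R^{2}\right], \\
R(t) &= \frac{\nu^{2}}{\beta^{2} S(0)}\left[\left(\frac{\beta S(0)}{\nu} - 1\right)
   + q \tanh\!\left(\frac{q \nu t}{2} - \varphi\right)\right], \\
\frac{dR}{dt} &= \frac{\nu^{3} q^{2}}{2\beta^{2} S(0)}\,
   \operatorname{sech}^{2}\!\left(\frac{q \nu t}{2} - \varphi\right), \\
q &= \left[\left(\frac{\beta S(0)}{\nu} - 1\right)^{2}
   + \frac{2\beta^{2} S(0)\,\bigl(N - S(0)\bigr)}{\nu^{2}}\right]^{1/2}, \qquad
\varphi = \tanh^{-1}\!\left[\frac{1}{q}\left(\frac{\beta S(0)}{\nu} - 1\right)\right].
\end{align*}
```

The squared hyperbolic secant is symmetric about t = 2φ/(qν), which is where the bell shape comes from. The Taylor step keeps terms up to second order in βR/ν, so the approximation is reasonable only while that quantity remains small.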
A technical flaw of the fixed coverage ratio employed in the Japan case study
Here I analytically examine the validity of applying the fixed coverage ratio to estimates of the true HIV incidence in Japan. Fig. S1 shows a schematic illustration of the four compartments required for this analysis. New HIV infections join compartment $h_u$, undiagnosed HIV infections, as a function of time $t$, $\eta(t)$.
where the times to development of AIDS, governed by the rates $\gamma_1$ and $\gamma_2$, are unrealistically assumed to follow exponential distributions, a simplification that allows us to find the analytical solution. In the following analysis, the disease age (i.e., time since HIV infection) is ignored, and diagnosed and undiagnosed individuals are assumed to develop AIDS at the same rate, $\gamma$ ($= \gamma_1 = \gamma_2$).
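Why a fixed coverage ratio is problematic can be illustrated with a minimal sketch of such a compartment model. It assumes that undiagnosed infections h_u are diagnosed at a constant rate alpha and move to a diagnosed compartment (called h_d here, a hypothetical name), that both groups develop AIDS at the same rate gamma, and that new infections arrive at a constant rate eta from time 0. The parameter values and function names are illustrative assumptions only.

```python
import math


def coverage_ratio(t, alpha, gamma):
    """Share of HIV-infected (pre-AIDS) individuals who are diagnosed at time t.

    With equal progression rates, this is also the share of new AIDS cases
    that have a previous HIV diagnosis at time t.
    """
    if t <= 0:
        return 0.0
    undiagnosed = (1.0 - math.exp(-(alpha + gamma) * t)) / (alpha + gamma)  # h_u / eta
    total = (1.0 - math.exp(-gamma * t)) / gamma                            # (h_u + h_d) / eta
    return 1.0 - undiagnosed / total


def simulate(eta, alpha, gamma, t_end, dt=0.001):
    """Euler integration of dh_u/dt = eta - (alpha + gamma) h_u, dh_d/dt = alpha h_u - gamma h_d."""
    h_u = h_d = 0.0
    for _ in range(int(round(t_end / dt))):
        dh_u = eta - (alpha + gamma) * h_u
        dh_d = alpha * h_u - gamma * h_d
        h_u += dt * dh_u
        h_d += dt * dh_d
    return h_u, h_d


# Hypothetical rates per year (illustration only).
alpha, gamma = 0.05, 0.1
for t in (1, 5, 10, 20, 50):
    print(t, round(coverage_ratio(t, alpha, gamma), 3))
print("long-run limit:", round(alpha / (alpha + gamma), 3))
```

Under these assumptions the coverage ratio rises from zero towards alpha / (alpha + gamma) rather than staying constant, so applying one fixed value over the whole epidemic misstates the number of undiagnosed infections for much of its course.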
However, it should be noted that it is preferable to take into account the disease age (known as the d-state [23]) for slowly progressing diseases such as HIV/AIDS; an analysis taking disease age into account is given elsewhere [24]. The rate of diagnosis, $\alpha$, is assumed to be sufficiently small compared to $\gamma$ and independent of time so that the coverage ratio in the original study in relation to time could be analytically [...] Thus, the ratio of HIV and AIDS diagnoses at time $t$ is given as follows: