Estimating a Treatment Effect in Residual Time Quantiles Under the Additive Hazards Model

For randomized clinical trials where the endpoint of interest is a time-to-event subject to censoring, estimating the treatment effect has mostly focused on the hazard ratio from the Cox proportional hazards model. Since the model’s proportional hazards assumption is not always satisfied, a useful alternative, the so-called additive hazards model, may instead be used to estimate a treatment effect on the difference of hazard functions. Still, the hazards difference may be difficult to grasp intuitively, particularly in a clinical setting of, e.g., patient counseling, or resource planning. In this paper, we study the quantiles of a covariate’s conditional survival function in the additive hazards model. Specifically, we estimate the residual time quantiles, i.e., the quantiles of survival times remaining at a given time t, conditional on the survival times greater than t, for a specific covariate in the additive hazards model. We use the estimates to translate the hazards difference into the difference in residual time quantiles, which allows a more direct clinical interpretation. We determine the asymptotic properties, assess the performance via Monte-Carlo simulations, and demonstrate the use of residual time quantiles in two real randomized clinical trials.


Introduction
In most randomized clinical trials, estimating a treatment effect typically fits a Cox proportional hazards model [3] on censored time-to-event endpoints. The Cox model offers many advantages: the ability to handle censored data, no need for parametric assumptions about the baseline hazard function, interpretable regression coefficients in hazard ratios, and readily available software for implementation. However, not all the data collected may satisfy the assumption of proportional hazards. For this reason, many alternative models have been developed, see Kalbfleisch and Prentice [11].
One useful alternative is the so-called additive hazards model, as in Lin and Ying [16]. The additive hazards model can also handle censored data, and makes no parametric assumptions about the baseline hazard function; but its regression coefficient is interpreted in hazards difference. For example, Kalbfleisch and Prentice [11] contains a well-known oropharynx carcinoma trial dataset. In the trial, a total of 195 patients with squamous cell carcinoma were randomly assigned treatment with radiation therapy alone or together with a chemotherapeutic agent. When the additive hazards model is applied to the dataset, the hazards differences are estimated to be 0.08 and 0.35 for tumor stages III and IV, respectively, compared to stage I/II. Although a hazards difference can be used to measure a treatment effect similar to the hazards ratio of the Cox proportional hazards model, the hazards difference by itself is usually less intuitive to care providers and patients in real clinical settings. As a contrast, difference in event times may be more easily understood. In particular, when the patients are still event-free at time t, say, then comparing their remaining event times after t can be more meaningful. In literature, similar comparisons have been done for the event time quantiles that were based on their empirical survival functions [1]. Dabrowska and Doksum [5] developed methods to estimate quantiles based on the Cox model. For median event times, Ying et al. [10] and Yang [19] used semiparametric regression models for more general estimation.
However, event time quantiles in clinical trials may not always be clinically meaningful, because their time-zero are usually artificially selected at enrollment or randomization. Therefore, estimate of median survival time, e.g., for a patient diagnosed with a disease with high initial hazard that drops over time may not be relevant when considering their outlook after the initial spike in hazard. In this paper, we instead consider residual time quantiles. A residual time is the amount of survival time remaining at a given time t, assuming survival to t. Estimating residual time quantiles is usually a generalization of that for estimating survival time quantiles, in which t ≡ 0.
Methods for residual time quantiles began to accumulate recently. Jeong, Jung, and Costantino [9] proposed nonparametric methods for estimating median residual time (MRT) by inverting Kaplan-Meier estimators [12]. In a follow-up paper, Jung, Jeong, and Bandos [10] proposed a regression model that allows for modeling of covariate effects on general quantile residual time. A different regression model is proposed by Ma and Yin [17], allowing for estimation of quantiles of residual times in addition to covariate effects on them. In a recent paper, Crouch et al. [4] developed covariate-specific estimators for residual time quantiles based on the Cox model.
In this article, we aim to develop an estimator for residual time quantiles under the additive hazards model. We begin in Sect. 2 to show that the newly developed estimator based on the additive hazards model allows for estimation of covariate-specific residual time quantiles. We demonstrate our estimator's consistency, determine its limiting distribution, and provide a consistent estimator for its variance. Also included are discussions of methods for obtaining confidence intervals and bands that do not rely on direct estimation of the variance. We further develop our method in Sect. 3, determining the limiting distribution for a difference between two estimators of covariate-specific residual time quantiles and thereby allowing formal testing. In Sect. 4 we demonstrate our estimator's performance on simulated data, including figures showing confidence intervals and bands. Additionally, we apply our method to two real datasets: the VA lung cancer dataset in Kalbfleisch and Prentice [11] and the Human Immunodeficiency Virus (HIV) Mother-to-Child transmission prevention trial dataset in Jackson et al. [8]. Finally, in Sect. 5 we discuss the mean residual times, and extension of our method to allow for time-varying covariates, and low event rates.

Model-Based Estimation of Residual Time Quantiles
Assume that T is a positive random variable representing a subject's time-to-event. At a given time t, we first define the (1 − q)th (0 < q < 1) percentile residual time of a random variable T as the amount of additional time necessary for (1 − q) × 100% of the individuals still under observation at time t to fail. We denote this quantity as where S(·|Z) is the survival function. If expressed in terms of the cumulative hazard function, (·|Z) = − log S(·|Z), we then have {t + θ(t, q|Z)|Z} = (t|Z) − log q. Now consider the additive hazard model that assumes where 0 (·) is the unspecified baseline cumulative hazard function, β ∈ B ⊂ R p is the p-dimensional regression parameter, and T denotes vector (matrix) transpose. Under model (2) we have Unfortunately we cannot obtain a closed-form solution for θ(t, q|Z), though it can be solved for numerically. In order to estimate θ(t, q|Z) we need estimates for both β and 0 (·). Consider that data are collected in the form of (X i , i , Z i ) for i = 1, . . . , n with n being the sample size. For these data, X i = min(T i , C i ) where T i is a failure time and C i is a censoring time; i = I (T i ≤ C i ); and Z i is a vector of covariates. Given Z i , T i , and C i are assumed to be independent. Note that N i (t) = I (X i ≤ t, δ i = 1) and Y i (t) = I (X i ≥ t). As in Lin and Ying [16], we can estimate β with We can also estimate the baseline cumulative hazard function 0 (·) with With the estimators of β and 0 (·|Z), we can estimate θ(t, q|Z) by θ(t, q|Z), which is the solution of Note that θ(t, q|Z) is only defined when 0 (τ ) where τ is the largest observed failure time. For θ(t, q|Z), we have the following asymptotic properties, summarized in Theorem 1: q|Z)} converges weakly to a zero-mean Gaussian process whose variance function at t can be estimated consistently by n V (t) where Details of Theorem 1's proof are in the Appendix. With Theorem 1, calculating pointwise confidence intervals is possible. We may simply take θ(t, q|Z) ± z * 1−α/2 V (t) for a 100(1 − α)% confidence interval. Moreover, we can use asymptotic results to obtain confidence bands as well. Specifically, we know that where W (t) is standard Brownian motion. We can therefore solve for c and compute 100(1 − α)% confidence bands using θ(t, q|Z) ± c V (τ ).
In addition, we can compute both confidence bands and intervals using either bootstrap or simulation methods. Using the bootstrap method, we simply find the 2.5th percentile and 97.5th percentile of the bootstrap sample estimates of θ * (t, q|Z) at each time point in order to get pointwise intervals (Efron and Tibshirani [6]). To obtain bands, we first calculate the maximum deviation within each bootstrap sample across time, sup 0≤t≤τ | θ * (t, q|Z)− θ(t, q|Z)|, and then find the 95th percentile across samples, c. We can then construct bands using θ(t, q|Z) ± c.
The simulation method is in fact another way of resampling (Parzen et al. [18]). In the formulation of . For each simulated sample, all G i are randomly generated and we compute the estimated residual time quantile for that sample, θ(t, q|Z). At each time point, the 2.5th percentile and 97.5th percentile of the deviations θ(t, q|Z)−θ(t, q|Z) are calculated and added to the estimate θ(t, q|Z) to get lower and upper pointwise confidence intervals, respectively. Calculating bands is the same as with the bootstrap: we find the maximum deviation within each simulated sample across time, sup 0≤t≤τ | θ(t, q|Z) − θ(t, q|Z)|, and then find the 95th percentile across samples, c. We can then construct bands using θ(t, q|Z) ± c. One limitation of this type of confidence bound is that it is often too conservative in reality (Hall et al. [7] and Li et al. [14]).

Comparing Residual Time Quantiles
While being able to estimate covariate-specific residual time quantiles and their variance is useful, in most practical applications it is also important to be able to carry out comparisons between different covariate values and perform formal tests to determine if any observed difference is statistically significant. We may also be interested in formally comparing residual times at different fixed time points or for different quantiles. All of these tasks require being able to estimate the covariance between two different residual time quantiles.
We can extend results from Theorem 1 to establish asymptotic properties for the differences between the quantiles of residual time for different sets of covariate values, evaluation times, and quantiles. These properties are summarized in Theorem 2.

Theorem 2 Assume that conditions A-C [1] hold. Denote θ(t, q|Z) to be the solution
then as n → ∞, for a given Z 1 and Z 2 and fixed t 1 , t 2 , q 1 , and q 2 , )} converges weakly to a zero-mean Gaussian process and whose variance function at t can be estimated consistently by n W (t) where Details of Theorem 2's proof are in the Appendix. With Theorem 2, calculating pointwise confidence intervals is possible. We may simply take { θ(t 1 , Methods for calculating confidence bands and other methods for calculating confidence intervals are similar to those explained above.
Wald-type statistics Or we can obtain the asymptotic variance of θ(t 1 , q 1 |Z 1 )/ θ(t 2 , q 2 |Z 2 ) as below by Delta method similar to Lin et al. [15]: and then perform the test by

Simulations
In order to demonstrate the asymptotic performance of our estimators θ(t, q|Z) (that it is consistent and asymptotically unbiased) and V (t) (that it is consistent, asymptotically unbiased, and yields confidence intervals with the correct coverage probabilities), we conducted a simulation study. Survival times were generated under the assumption of additive hazards assuming that λ(t|Z) = t + β T Z with β = (1, 2) T and Z = (Z 1 , Z 2 ) T , where Z 1 ∼ Bernoulli(0.5) and Z 2 ∼ Uniform(0, 1). When generating censored data, censoring times followed an exponential distribution with rate parameter 0.195 (∼10% censoring) or 0.72 (∼30% censoring). For the purposes of simulation, we considered median (q = 0.5) residual time at t = 0.25 and t = 0.75. Covariate values of z = (0, 0.5) T were chosen. We used 10,000 replications in our simulation. Calculated quantities included the bias, sample standard error (i.e., the standard error of the MRT estimates), mean standard error (i.e., the mean of the standard error estimates), and the coverage probability for nominal 95% confidence intervals.
Simulation results can be seen in Table 1. Bias appears to be negligible. Sample standard errors (SSEs) and mean standard errors (MSEs) were very close to each other, regardless of sample size. Coverage probabilities closely matched nominal confidence levels.
Results for the performance of W (t) can be seen in Table 2. While bias was negligible, mean standard errors (MSEs) were larger than sample standard errors (SSEs) for all combinations of simulation parameters. This resulted in slightly con-  servative confidence intervals, with coverage probabilities a bit larger than nominal levels.

Real Data Examples
We present two examples of analysis on existing datasets. For the first, we use data from a clinical trial in the treatment of carcinoma of the oropharynx, also known as the VA Lung Cancer Trial, as presented in Kalbfleisch and Prentice [11]. This dataset includes 195 patients with squamous cell carcinoma of 3 sites in the oropharynx from the 6 largest of 16 total participating institutions. Patients were randomly assigned treatment with radiation therapy alone or radiation therapy with a chemotherapeutic agent. While many other characteristics were collected, the dataset we examined included only sex; treatment; tumor grade, site, T staging, and N staging; overall  Table 3 and reflect what we see when plotting MRTs (Fig. 1): higher T stage is associated with increased hazard of death and lower MRT. This association is not statistically significant, however, as Fig. 2 shows. The confidence bands (and usually the intervals) contain the null value 0 for the differences between any pair of T stages. Nevertheless, our results highlight a benefit of examining MRTs: for patients with T staging of I or II, MRT changes markedly over time, increasing sharply after initially decreasing slightly. For example, patients with T stage I or II have a MRT of 1.73 years (95% CI 0.30-3.15 years) after having survived 0.75 years and a MRT of 3.29 years (95% CI 0-9.29 years) after having survived 1 year. This would be important and welcome news to surviving patients and their caregivers alike. Here, we noticed that the confidence band is very wide due to the large variability at boundary.
Our second example uses data collected as part of the HIVNET 012 randomized trial (Jackson et al. [8]). This trial randomly assigned Human Immunodeficiency Virus type-1 (HIV-1) infected pregnant women in Kampala, Uganda to either a nevirapine-or zidovudine-based treatment. Their infants were followed and tested at pre-determined intervals for HIV-1. Data were also collected on adverse events through 6-8 weeks postpartum for mothers and 18 months for babies. The study enrolled 645 mothers: 313 assigned to nevirapine, 313 to zidovudine, and 19 to placebo. Within 18 months, 109 serious adverse events and 34 deaths were observed in the nevirapine group while 97 serious adverse events and 42 deaths were observed in the zidovudine group.
For our example, we examined the relationship between the 10th percentile of residual time to death or serious adverse event and treatment group (nevirapine versus zidovudine), maternal CD4 at pre-entry, and maternal HIV-1 RNA at baseline. We present results from the additive hazard model in Table 3. In the additive model,  Fig. 3 Plots of the 10th percentile residual times for two assumed infants for the HIVNET 012 data. One infant is assumed to have the mother treated with Zidovudine (AZT), having a maternal CD4 of 600 cells/µL, and a log 10 maternal viral load (VL) of 3.5 copies/mL. The other infant is assumed to have the mother treated with Nevirapine (NVP), having a maternal CD4 of 300 cells/µL, and a log 10 maternal VL of 5 copies/mL. The solid curve represents estimated 10th percentile residual time, the dashed curves represent the pointwise 95% confidence intervals, and the dotted curves represent the 95% confidence bands only maternal HIV-1 RNA was significantly associated with time to death or serious adverse event, with a coefficient of 0.10 (95% CI 0.04-0.17) for a unit increase of log 10 HIV-1 RNA copies/mL. A plot of 10th percentile residual times for two different combinations of covariate values (an assumed infant whose mother was treated with AZT, had a maternal CD4 of 600 cells/µL, and a log 10 maternal viral load of 3.5 copies/mL and another assumed infant whose mother was treated with NVP, had a maternal CD4 of 300 cells/µL, and a log 10 maternal viral load of 5 copies/mL) can be seen in Fig. 3. These covariate combinations were chosen to represent infants with relatively protective or relatively nonprotective characteristics. For both groups, the 10th percentile residual times are fairly stable at early follow-up times (minus an immediate jump at the beginning of evaluation times). As expected, the 10th percentile residual times for the infants with protective characteristics were consistently higher than those for the infants with nonprotective characteristics, though they were also somewhat more variable. When examining the difference in 10th percentile residual times between the two different combinations of covariate values, we conclude that it is not significant across all time as 0 is well within the limits of the confidence bands (see Fig. 4). This is in spite of the fact that, for the majority of fixed time points, the difference is statistically significant. The lack of overall significance is driven to a large degree by the large increase in variance towards later times. Plots of differences in the 10th percentile residual times between the two assumed infants for the HIVNET 012 data with pointwise confidence intervals. The solid curves represent estimated 10th percentile residual time, the dashed curves represent the pointwise 95% confidence intervals, and the dotted curves represent the 95% confidence bands

Discussion
In this article, we presented a model-based estimation method of residual time quantiles for censored time-to-event outcomes. Besides residual time quantiles, mean residual time might also be of interest. It is easy to obtain an explicit form for such estimator based on β and 0 (t) as below However, if the last observation is not an event, the estimator goes to infinity which indicates that we need specific parametric distribution to the tail. Similarly, for our residual time quantile estimator, we are limited by the number (or proportion) of events that are actually observed. The estimator is only available when where τ is the largest observed failure time. Thus our estimator may not be calculable at later times, for smaller quantiles, for particularly low-risk patient characteristics, or some combination of the three. Earlier censoring can offset this somewhat, though at the cost of increased variability.
As discussed earlier, in clinical practice knowing individuals' covariate-specific residual time quantiles shall assist both clinicians and patients to better understand how individually carried risk factors affect an event time such as cancer survival. The model-based estimation of residual time quantiles that we propose harnesses the structure of the additive hazards model to provide a simple approach to estimating residual time quantiles with easy-to-compute variances.
As with the additive hazard model-based estimator, our estimator would be greatly improved by being able to handle time-varying covariates and time-varying effects. While the additive hazards model is semiparametric and therefore allows a great deal of flexibility, possible model mis-specification can still lead to biased estimate of residual time quantiles. Therefore, certain sensitivity analyses and goodness-of-fit test may be used to help detect such a bias and further provide guidance on alternative models for more accurate and reliable model-based estimation.
As a limitation of the additive model, it is not recommended to calculate residual time quantiles for (combinations of) covariate values that are not observed in the data since the estimated hazard function might not be positive for those covariates' value. Furthermore, our method requires specification of a specific quantile and evaluation time as well as specific covariate values. While we have addressed the issue of comparisons at multiple time points by detailing how to compute confidence bands, the issue of comparisons for multiple covariate values or quantiles remains. estimators and the application of the continuous mapping theorem. Therefore, we need to only consider the asymptotic variance function in detail.
We begin with two equations: one based on the true values, and one based on the estimated values, Taking the differences between the left-and right-hand sides of both equations yields Examining the left-hand side first, given the consistency results of θ(t, q|Z), using Taylor's expansion, we have We therefore have the expression Rearranging yields the approximation (t, q|Z)).
From Lee and Hyun [13], we know that , so we can rewrite the overall approximation as We can also perform a substitution, setting u = t + v, and integrating with respect to v. This yields the approximation q|Z)).
Applying the martingale central limit theorem as in Lin and Ying [16] and Lee and Hyun [13], it follows that the variance function can be consistently estimated by 2 ))} converges weakly to a zero-mean Gaussian process follows directly from the established convergence of the individual estimators and the application of the continuous mapping theorem. Therefore, we need to only consider the asymptotic variance function in detail.
From the proof of Theorem 1, we have (1).