Comment on “Analysis of Longitudinal Trials With Protocol Deviations: A Framework for Relevant, Accessible Assumptions, and Inference via Multiple Imputation,” by Carpenter, Roger, and Kenward

Carpenter et al. (2013) propose a multiple imputation (MI) approach for analyzing data from clinical trials with protocol deviations. Sensitivity analysis to departures from missing at random (MAR) is widely acknowledged as important, but is poorly handled in practice, so we welcome their detailed proposals. However, here we highlight two problems with their method: an implicit assumption of noninformative deviation, and failure of the Rubin’s Rule (RR) variance estimator.

Let the blocks of $\Sigma_T$ corresponding to $\mathrm{Var}(Y_O \mid T)$ and $\mathrm{Cov}(Y^*_M, Y_O \mid T)$ be denoted $\Sigma_{T,OO}$ and $\Sigma_{T,MO}$, respectively. Carpenter et al. denoted $\Sigma_{r,OO}$, $\Sigma_{r,MO}$, $\Sigma_{a,OO}$, and $\Sigma_{a,MO}$ as, respectively, $R_{11}$, $R_{21}$, $A_{11}$, and $A_{21}$. A noninformative prior is assumed for $(\mu_r, \mu_a, \Sigma_r, \Sigma_a)$, and its posterior is obtained under the assumption that the missingness mechanism is ignorable.
Under the assumption of "randomized-arm MAR," the posterior predictive distribution of the actual postdeviation outcomes $Y_M$ is the same as that of $Y^*_M$, so $Y_M$ can be multiply imputed using this distribution. Therefore, as described by Carpenter et al., imputation under "randomized-arm MAR" is done by sampling a value of $(\mu_r, \mu_a, \Sigma_r, \Sigma_a)$ from its posterior and then sampling $Y_M$ from a normal distribution with mean $\mu_{T,M} + \Sigma_{T,MO} \Sigma_{T,OO}^{-1} (Y_O - \mu_{T,O})$ and variance given by Carpenter et al.
In addition to this established MI procedure for randomized-arm MAR, Carpenter et al. propose four novel MI procedures for data that are missing not at random (MNAR). These procedures differ from that described for randomized-arm MAR in the mean and variance of the normal distribution from which $Y_M$ is sampled. For "jump to reference," the mean is $\mu_{r,M} + \Sigma_{r,MO}$ [...]
Let $\hat\theta_q$ denote the treatment effect estimate from the $q$th imputed data set ($q = 1, \ldots, Q$), and let $\widehat{\mathrm{Var}}(\hat\theta_q)$ be its variance estimate. The $Q$ effect estimates are combined into an overall estimate $\hat\theta^{(Q)}$ using RR for the mean: $\hat\theta^{(Q)} = Q^{-1} \sum_{q=1}^{Q} \hat\theta_q$. RR for the variance gives an estimate, $\widehat{\mathrm{Var}}(\hat\theta^{(Q)})$, of the repeated sampling variance of $\hat\theta^{(Q)}$.
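For reference, the usual form of Rubin's Rules can be written out as follows. This is the standard textbook statement, not a formula restated from Carpenter et al.; the symbols $\bar W_Q$ (within-imputation variance) and $B_Q$ (between-imputation variance) are notation assumed here for illustration.

```latex
\begin{align*}
\hat\theta^{(Q)} &= \frac{1}{Q}\sum_{q=1}^{Q}\hat\theta_q, \\
\bar W_Q &= \frac{1}{Q}\sum_{q=1}^{Q}\widehat{\mathrm{Var}}(\hat\theta_q), \\
B_Q &= \frac{1}{Q-1}\sum_{q=1}^{Q}\bigl(\hat\theta_q - \hat\theta^{(Q)}\bigr)^2, \\
\widehat{\mathrm{Var}}\bigl(\hat\theta^{(Q)}\bigr) &= \bar W_Q + \Bigl(1 + \frac{1}{Q}\Bigr) B_Q .
\end{align*}
```

In this notation, $B_\infty$ denotes the limit of $B_Q$ as $Q \to \infty$.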

PROBLEM 1: INFORMATIVE DEVIATIONS
The first problem with the procedures proposed by Carpenter et al. is that they make an implicit "noninformative deviation" assumption, $P(D = t \mid D \ge t, T, Y) = P(D = t \mid D \ge t, T, Y_1, \ldots, Y_D)$, that is, that the hazard of deviation does not depend on later outcomes given earlier outcomes. For simplicity of exposition, suppose that $J = 2$, that there are no deviations in the reference group, and that outcomes at different times are independent and the imputer knows this (the problem we now describe applies more generally, however). Under the "jump to reference" and "copy reference" assumptions, the mean of the imputation distribution of postdeviation $Y_2$ given deviation is $\mu_{r,2}$, which is the unconditional expected outcome in a randomly sampled untreated patient. This is a reasonable assumption if the factors influencing deviation are independent of those influencing $Y_2$. However, this will often not be the case. The following example illustrates what happens when deviation is informative.
For each patient, let $D^*$ denote the (possibly counterfactual) time at which the patient would have deviated had she/he been randomized to the active group. Thus, $D^* = D$ if $T = a$ and $D^*$ is missing if $T = r$. Suppose that $E(Y_2 \mid D^*, T) = \alpha + \beta I(D^* = 1)$. Thus, treatment has no effect on outcome, but outcomes of patients who deviate are, on average, greater by $\beta$ than those of patients who do not. Assume deviation is informative, that is, $\beta \neq 0$. Let $\pi = P(D^* = 1) > 0$. The expected mean of the imputation distribution for postdeviation outcomes is $\mu_{r,2} = E(Y_2 \mid T) = \alpha + \beta\pi$, which is different from the true mean $E(Y_2 \mid D^* = 1, T) = \alpha + \beta$. Therefore, in the imputed data set the mean of $Y_2$ in the active group has expectation $\pi(\alpha + \beta\pi) + (1 - \pi)\alpha = \alpha + \beta\pi^2$. This is different from $\alpha + \beta\pi$, the expected mean in the reference group, so the treatment effect estimate is biased away from zero. Similar considerations apply in the case of "copy increments in reference" and LMCF.
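The bias in this example can be checked numerically. The following is a minimal sketch, not the procedure of Carpenter et al.: it uses a single imputation centered on the reference-arm sample mean, with unit residual variance, rather than draws from a posterior. The function name and the values of alpha, beta, pi, and n are hypothetical choices for illustration.

```python
import numpy as np

def informative_deviation_bias(alpha=10.0, beta=5.0, pi=0.3, n=200_000, seed=0):
    """Simulate the J = 2 example: no treatment effect, informative deviation.

    Returns (estimated effect after imputation, bias predicted by the text).
    """
    rng = np.random.default_rng(seed)
    active = rng.random(n) < 0.5   # T = a with probability 0.5 (assumed)
    d_star = rng.random(n) < pi    # D* = 1: would deviate if randomized to active
    # Y_2 depends on D* but not on treatment: E(Y_2 | D*, T) = alpha + beta * I(D* = 1)
    y2 = alpha + beta * d_star + rng.normal(0.0, 1.0, n)

    ref_mean = y2[~active].mean()  # estimates mu_{r,2} = alpha + beta * pi
    deviated = active & d_star     # only active-arm patients actually deviate

    # Simplified "jump to reference": impute around the reference-arm mean
    y2_imp = y2.copy()
    y2_imp[deviated] = ref_mean + rng.normal(0.0, 1.0, deviated.sum())

    estimate = y2_imp[active].mean() - ref_mean
    predicted = beta * pi**2 - beta * pi  # (alpha + beta*pi^2) - (alpha + beta*pi)
    return estimate, predicted

estimate, predicted = informative_deviation_bias()
print(f"estimated effect: {estimate:.3f}, predicted bias: {predicted:.3f}")
```

With a large n, the estimated effect should lie close to $\beta\pi^2 - \beta\pi = -\beta\pi(1 - \pi)$, even though the true treatment effect is zero.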

PROBLEM 2: USE OF THE RUBIN'S RULE VARIANCE ESTIMATOR
The second problem is that the RR estimator of the repeated sampling variance of $\hat\theta^{(Q)}$ may not be valid unless the data are "randomized-arm MAR" and MI is carried out assuming this. This is because under the other missingness assumptions ("jump to reference," etc.), the imputer assumes more than the analyst, which is known to cause the RR variance estimator to overestimate the repeated sampling variance (Meng, 1994). The following extreme example illustrates this.
Assume noninformative deviation (so Problem 1 does not apply), $J = 2$, no deviation in the reference group, all patients in the active arm deviate at time 1 ($D = 1$), and outcomes at different times are independent and the imputer knows this. Suppose the treatment effect of interest is $\theta = E(Y_2 \mid T = a) - E(Y_2 \mid T = r)$ and the complete-data estimator of this effect is just the difference between the sample means in the two arms. The posterior of $\mu_{r,2}$ is normal with mean equal to the sample mean of $Y_2$ in the reference arm. Therefore, under "jump to reference" or "copy reference," $\hat\theta_q$ is normally distributed with mean zero. Consequently, $\hat\theta^{(\infty)} = 0$ and the repeated sampling variance of $\hat\theta^{(\infty)}$ equals zero. On the other hand, the between-imputation variance $B_\infty$ and hence $\widehat{\mathrm{Var}}(\hat\theta^{(Q)})$ are both positive. The variance estimator overestimates the true variance because the data are imputed under a strong assumption that is no longer made when these imputed data are analyzed, specifically, that there is no treatment effect in those who deviate.
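This extreme example can be sketched in a few lines of simulation. The sketch assumes a normal outcome with a standard noninformative prior for the reference-arm mean and variance. The arm size, the reference mean of 14, the standard deviation of 6, and the numbers of imputations and replications are hypothetical choices, and the function name is illustrative only.

```python
import numpy as np

def rr_for_one_trial(rng, n_arm=100, sigma=6.0, Q=200):
    """Return (pooled estimate, RR variance estimate) for one simulated trial."""
    # Reference arm: Y_2 fully observed. Active arm: everyone deviates at time 1,
    # so every active-arm Y_2 is imputed; Y_1 carries no information about Y_2
    # because outcomes at different times are independent.
    y_ref = rng.normal(14.0, sigma, n_arm)
    ybar, s2 = y_ref.mean(), y_ref.var(ddof=1)

    # Posterior draws of (mu_{r,2}, sigma^2) under the usual noninformative prior
    sig2 = (n_arm - 1) * s2 / rng.chisquare(n_arm - 1, Q)
    mu = rng.normal(ybar, np.sqrt(sig2 / n_arm))

    # "Jump to reference": impute each active-arm Y_2 from N(mu, sig2)
    y_imp = rng.normal(mu[:, None], np.sqrt(sig2)[:, None], (Q, n_arm))

    theta_q = y_imp.mean(axis=1) - ybar                     # difference in arm means
    var_q = y_imp.var(axis=1, ddof=1) / n_arm + s2 / n_arm  # complete-data variance

    theta_bar = theta_q.mean()
    rr_var = var_q.mean() + (1 + 1 / Q) * theta_q.var(ddof=1)
    return theta_bar, rr_var

rng = np.random.default_rng(1)
results = np.array([rr_for_one_trial(rng) for _ in range(300)])
empirical_var = results[:, 0].var(ddof=1)  # repeated-sampling variance of pooled estimate
mean_rr_var = results[:, 1].mean()         # average RR variance estimate
print(f"empirical variance: {empirical_var:.4f}, mean RR estimate: {mean_rr_var:.4f}")
```

With finite Q the empirical variance is not exactly zero, because Monte Carlo error of order $B_Q / Q$ remains, but it should be far smaller than the average RR estimate.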
More generally in the four MNAR imputation procedures, the imputer (but not the analyst) assumes a relation between the expected postdeviation outcomes of an individual in the active arm given that the individual deviates and the expected outcomes of an individual in the reference arm. This enables the imputer to use data from the reference arm when imputing postdeviation outcomes in the active arm. In "randomized-arm MAR" imputation, on the other hand, the imputer does not assume a relation between outcomes in the two arms, and imputes postdeviation outcomes in the active arm using only the observed data from the active arm.
To illustrate that the RR variance estimator can be positively biased in less extreme cases than that just considered, we carried out a simulation study. We considered a trial with $J = 4$, $n = 200$, and $P(T = r) = P(T = a) = 0.5$. Patients in the active arm deviated (noninformatively) at time 2 ($D = 2$) with probability 0.2; otherwise, they did not deviate ($D = 4$). There was no deviation in the reference arm. The treatment effect of interest was $\theta = E(Y_4 \mid T = a) - E(Y_4 \mid T = r)$. For each nondeviating patient in arm $T$, the outcome vector $(Y_1, Y_2, Y_3, Y_4)$ was generated from a normal distribution with mean $\mu_T$ and variance $\Sigma_T$. We used the same mean and variance as in Lu (2014). Specifically, $\mu_r = \mu_a = (29, 22, 17, 14)^\top$ for a "no treatment effect" scenario, and $\mu_a = (29, 20, 14, 11)^\top$ and $\mu_r = (29, 22, 17, 14)^\top$ for a "treatment effect" scenario. For both scenarios, the $(j, k)$th entry of $\Sigma_a = \Sigma_r$ was $36 \times (1 - 0.2 \times |k - j|)$. For deviating patients, $(Y_1, Y_2, Y_3, Y_4)$ was also generated from a normal distribution but with mean and variance depending on the assumed imputation procedure. For example, in the "treatment effect" scenario, the mean and variance were $\mu_r$ and $\Sigma_r$ for the "copy reference" procedure, but $(29, 22, 22, 22)$ and $\Sigma_a$ for the LMCF procedure. Table 1 shows the true values of $\theta$. Note that for the LMCF imputation procedure, $\theta \neq 0$ even when $\mu_a = \mu_r$ (the "no treatment effect" scenario).
Note to Table 1. $\theta$ is the true treatment effect; mean $\hat\theta_{\mathrm{comp}}$ is the average of the complete-data estimates of $\theta$ (maximum Monte Carlo standard error [MCSE] = 0.0086); mean $\hat\theta^{(Q)}$ is the average of the RR treatment effect estimates (max MCSE = 0.0084); SE $\hat\theta_{\mathrm{comp}}$ is the empirical standard error of the complete-data estimates (max MCSE = 0.0061); sqrt mean $\widehat{\mathrm{Var}}(\hat\theta^{(Q)})$ is the square root of the average RR estimate of the variance (max MCSE = 0.0005); SE $\hat\theta^{(Q)}$ is the empirical standard error of the RR estimate (max MCSE = 0.0060); RR cover is the coverage of the 95% confidence interval from Rubin's Rules (max MCSE = 0.0022).
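The data-generating step for one of these configurations ("copy reference" in the "treatment effect" scenario) might be sketched as follows. This is an illustration under stated assumptions, not the authors' code: randomization is taken to be an independent coin flip per patient, and the function and variable names are hypothetical. The value of theta_true is the one implied by the described design, not a number copied from Table 1.

```python
import numpy as np

J = 4
idx = np.arange(J)
# (j, k)th entry of Sigma_a = Sigma_r: 36 * (1 - 0.2 * |k - j|)
SIGMA = 36.0 * (1.0 - 0.2 * np.abs(idx[:, None] - idx[None, :]))
MU_R = np.array([29.0, 22.0, 17.0, 14.0])
MU_A = np.array([29.0, 20.0, 14.0, 11.0])  # "treatment effect" scenario

def generate_copy_reference_trial(rng, n=200, p_deviate=0.2):
    """Complete data for one trial under "copy reference"."""
    active = rng.random(n) < 0.5                     # P(T = a) = 0.5
    deviates = active & (rng.random(n) < p_deviate)  # D = 2, active arm only
    # Nondeviating active patients follow mu_a; everyone else follows mu_r
    means = np.where((active & ~deviates)[:, None], MU_A, MU_R)
    y = means + rng.multivariate_normal(np.zeros(J), SIGMA, size=n)
    return active, deviates, y

# Effect implied by the design: active-arm mean of Y_4 is a 0.8 / 0.2 mixture
theta_true = (0.8 * MU_A[3] + 0.2 * MU_R[3]) - MU_R[3]

rng = np.random.default_rng(2)
active, deviates, y = generate_copy_reference_trial(rng)
print(theta_true, y.shape, int(deviates.sum()))
```

Other procedures would change only the mean (and possibly the variance) used for deviating patients.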
For each of the two treatment effect scenarios and the five imputation procedures of Carpenter et al., 10,000 data sets were generated. The standard analysis of covariance (ANCOVA) estimator was first applied to each complete data set, yielding the complete-data estimator $\hat\theta_{\mathrm{comp}}$. Postdeviation outcomes were then discarded and $Q = 1000$ imputed data sets were created using the correct imputation procedure (i.e., that assumed when generating the complete data). The ANCOVA estimator was applied to each of these $Q$ imputed data sets, and estimates and standard errors were combined using Rubin's Rules, yielding $\hat\theta^{(Q)}$ and its RR standard error estimate. The norm package in R (Schafer, 2012) was used to draw from the posteriors of $(\mu_r, \Sigma_r)$ and $(\mu_a, \Sigma_a)$. Table 1 shows the results. These demonstrate that the RR estimate of the standard error of the treatment effect overestimates the true standard error for the "copy reference," "jump to reference," and "copy increments in reference" procedures. This mirrors findings for the alternative placebo-based pattern mixture model approach presented in Lu (2014). The RR estimator achieves coverage close to the nominal rate for the LMCF procedure. While conservative variance estimates may sometimes be viewed as desirable, our simulation study highlights another issue with the Carpenter et al. imputation procedures: they yield smaller empirical standard errors than the estimator based on the complete data. This reflects the strength of the assumption being made by the imputer.
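The per-data-set analysis step can be illustrated with a small ANCOVA sketch. It assumes the covariate is the baseline outcome $Y_1$ and that the model is an ordinary least-squares regression of the final outcome on treatment and baseline; the source does not spell out the model, so this is one plausible reading. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def ancova_effect(y_final, y_baseline, active):
    """Baseline-adjusted treatment effect on the final outcome via OLS.

    Returns (estimate, estimated variance of the estimate).
    """
    X = np.column_stack([np.ones_like(y_final), active.astype(float), y_baseline])
    beta, *_ = np.linalg.lstsq(X, y_final, rcond=None)
    resid = y_final - X @ beta
    dof = len(y_final) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)  # usual OLS covariance matrix
    return beta[1], cov[1, 1]

# Synthetic data with a known adjusted effect of -3 (all values hypothetical)
rng = np.random.default_rng(3)
n = 200
active = rng.random(n) < 0.5
baseline = rng.normal(29.0, 6.0, n)
final = 14.0 - 3.0 * active + 0.4 * (baseline - 29.0) + rng.normal(0.0, 5.0, n)
est, var = ancova_effect(final, baseline, active)
print(f"estimate: {est:.2f}, standard error: {np.sqrt(var):.2f}")
```

Applied to each imputed data set, a function of this kind returns the pair of estimate and variance that Rubin's Rules then combine.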

CONCLUSION
While we welcome the Carpenter et al. proposals, we are concerned that they may cause bias when deviations are informative (Problem 1). Methods from the causal inference literature (White, 2005) may be helpful to avoid such bias. Problem 2 may be of less practical importance if the reduction in variance caused by making a highly informative assumption like "jump to reference" is unwanted. If this is so, the positive bias in the RR variance estimator may balance this reduction, thus yielding a variance estimate that better reflects the real uncertainty. However, it is not clear how this estimate should be interpreted in terms of repeated sampling. Alternatively, one could seek a different variance estimator, for example, using the general methodology of Robins and Wang (2000). Lu (2014) used the delta method to derive a variance estimator that is consistent under an assumption somewhat similar to "copy reference." He also derived a related Bayesian estimator.