Estimating within-study covariances in multivariate meta-analysis with multiple outcomes

Multivariate meta-analysis allows the joint synthesis of effect estimates based on multiple outcomes from multiple studies, accounting for the potential correlations among them. However, standard methods for multivariate meta-analysis for multiple outcomes are restricted to problems where the within-study correlation is known or where individual participant data are available. This paper proposes an approach to approximating the within-study covariances based on information about likely correlations between underlying outcomes. We developed methods for both continuous and dichotomous data and for combinations of the two types. An application to a meta-analysis of treatments for stroke illustrates the use of the approximated covariance in multivariate meta-analysis with correlated outcomes. Copyright © 2012 John Wiley & Sons, Ltd.

A2 Derivation of equation (9)
We now derive equation (9), the covariance term between probabilities for two outcomes, given that outcome 1 is nested within outcome 2. We illustrate the covariance term for the treatment group; the covariance term for the control group can be derived in a similar way.
Suppose we observe $S_{1t}$ out of $N_t$ participants with outcome 1, and $S_{2t}$ out of $N_t$ participants with outcome 2. Since outcome 1 is nested within outcome 2, we have [...] Typically this situation will be associated with $n_{2t}$ [...] $n_{1t}$, since information on event 'A or [...]. For the $n_{1t}$ [...] $n_{2t}$ participants for whom we do not know their outcomes, we assume they are missing at random and assign a proportion [...] This yields a revised value of $a_t$, say [...].

A3 Derivation of equation (12)
Now we derive equation (12), the covariance between two inverse sample variances for two continuous outcomes within a study. We use the properties of the bivariate chi-squared distribution to assist this derivation.

Bivariate chi-squared distribution
Let $(Z_1, Z_2)$ be a two-dimensional correlated random vector with mean $(\mu_1, \mu_2)^T$

B1 Simulation procedures
Overview
We simulate meta-analysis data for two outcomes, with outcome 1 being a continuous variable and outcome 2 being a dichotomous variable. The treatment effects are measured using a mean difference and a log odds ratio for the two outcomes, respectively. To induce correlation between the outcomes within studies, we simulate individual participant outcomes from a bivariate normal distribution and dichotomize the second variable. The correlations between the continuous and dichotomous outcomes, and between the treatment effect estimates for the two outcomes, are estimated empirically.
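The procedure outlined in the overview can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code used for the paper: the function names, the unit variances, the zero control-group means, the default cut-points and the 0.5 continuity correction in the log odds ratio are all hypothetical choices made only to keep the example self-contained.

```python
import numpy as np

def simulate_study(n_t, n_c, effect1, effect2, rho_w,
                   cut_t=0.0, cut_c=0.0, rng=None):
    """Simulate one two-arm study and return (mean difference, log odds ratio).

    Assumptions (illustrative only): both latent outcomes have unit variance,
    control means are zero, and an event is a latent value above the cut-point.
    """
    rng = np.random.default_rng() if rng is None else rng
    cov = np.array([[1.0, rho_w], [rho_w, 1.0]])  # within-study outcome correlation
    y_c = rng.multivariate_normal([0.0, 0.0], cov, size=n_c)
    y_t = rng.multivariate_normal([effect1, effect2], cov, size=n_t)

    # Outcome 1 stays continuous: treatment effect is a mean difference.
    md = y_t[:, 0].mean() - y_c[:, 0].mean()

    # Outcome 2 is dichotomized: treatment effect is a log odds ratio.
    e_t = int((y_t[:, 1] > cut_t).sum())
    e_c = int((y_c[:, 1] > cut_c).sum())
    # 0.5 continuity correction (an assumption) guards against zero cells.
    log_or = (np.log((e_t + 0.5) / (n_t - e_t + 0.5))
              - np.log((e_c + 0.5) / (n_c - e_c + 0.5)))
    return md, log_or

def empirical_effect_correlation(n_rep, seed=0, **study_kwargs):
    """Estimate the correlation between the two effect estimates empirically."""
    rng = np.random.default_rng(seed)
    est = np.array([simulate_study(rng=rng, **study_kwargs)
                    for _ in range(n_rep)])
    return np.corrcoef(est[:, 0], est[:, 1])[0, 1]
```

Calling, for example, `empirical_effect_correlation(500, n_t=100, n_c=100, effect1=0.0, effect2=0.0, rho_w=0.9)` gives an empirical estimate of the correlation between the mean difference and the log odds ratio that is induced by the correlation between the underlying outcomes.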
Simulation parameter specification
We consider a wide range of sample sizes to assess the dependence of estimation properties on the number of studies ($N_s$), the number of participants in the treatment ($N_t$) and control ($N_c$) groups, as well as the degree of dependence between outcomes. We approximately mimic the SBP and DBP outcomes in the acute stroke data by setting the between-study parameters as [...] between-study correlation ($\rho_b$), setting each to be either zero or strong (0.9). We perform simulations under a scenario in which the within-study treatment effect variances are the same for every study, and under a scenario in which the within-study treatment effect variances vary across studies. We achieve the latter by keeping the between-participant outcome variances the same and varying the sample sizes within studies. The full set of scenarios considered is provided in Supplementary Table B1.
Outcome data generation
We start by simulating true treatment effects for two continuous outcomes using a bivariate normal distribution: [...] with parameter values as specified above. Within each study, we simulate individual outcome data for each participant $i$, centering the control group participants on the observed control group mean across acute stroke trials for outcome 1 (the means and standard deviations for outcome 2 are arbitrary). The treatment group means are obtained by applying the simulated treatment effect for the study: [...] We then dichotomize each $y_{2tsi}$ and $y_{2csi}$ using cut-points $c_t$ and $c_c$, which are chosen so [...]

Results
Summary statistics from the simulations are given in Supplementary Tables B2-B4. Our first observation is that treatment effects are well estimated by all methods, and there is little difference between univariate and multivariate approaches. We note that multivariate meta-analysis is most likely to offer advantages over a univariate approach when there are non-ignorable missing data (Kirkham et al. 2012). Our simulation study used complete case data and did not address this issue.
Comparing results across Tables B2-B4 shows the impact of the extent of heterogeneity for outcome 2, with $\tau_2^2 = 0$, $0.1^2$ and $0.2^2$, respectively (where $\tau_k$ denotes the between-study standard deviation for outcome $k$). Table B2 shows that when $\tau_2^2 = 0$, the multivariate approach reduces bias in estimating $\tau_1$ but increases bias for $\tau_2$. It is therefore not clear whether a multivariate approach is better than a univariate approach when there is heterogeneity in one outcome but not in the other. However, when $\tau_2$ departs slightly from zero, improvement in the estimates of between-study variance is evident in scenarios 13-16, 18 and 21-22 in Table B3. If we increase $\tau_2^2$ to $0.2^2$, there are consistent improvements in the estimates of between-study variance for both outcomes across all scenarios in Table B4. This suggests that a multivariate approach is likely to outperform a univariate approach when there is heterogeneity for both outcomes, but not necessarily otherwise.
We turn now to the impact of the magnitude of the correlation. Strong within-study correlation is assumed in scenarios 13-20 and 25-32. In these situations, both UM and MM(0) are misspecified with respect to the within-study correlation. MM($\rho_e$) assumes a common correlation between treatment effects and MM($\rho_o$) assumes a common correlation between outcomes, using our formulae to approximate the within-study covariance. We expect UM and MM(0) to perform worse than the latter two models because they misspecify the within-study correlation, while MM($\rho_o$) should perform similarly to MM($\rho_e$). We observe that bias for $\tau_{11}^2$ in UM and bias for $\tau_{22}^2$ in MM(0) are inflated in scenarios 13-20 and 25-32. In Table B3, the bias for the between-study correlation $\rho_b$ in MM(0) is sometimes inflated, particularly in scenarios 13-14 and 17-18, where the between-study correlation is zero; the biases are 0.78, 0.92, 0.85 and 0.66, respectively. These biases appear to be serious, given the parameter space [-1, 1] for correlation coefficients. Similar findings are seen in scenarios 25-26 and 29-30. These indicate that when the within-study correlation is strong and the between-study correlation is weak, assuming zero within-study correlation will introduce bias into estimates of the between-study correlation. This is one other situation in which correct specification or approximation of the within-study correlation is important.
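To make the difference between assuming zero and assuming a common non-zero within-study correlation concrete, the following sketch builds the 2x2 within-study covariance matrix of two effect estimates from their standard errors and an assumed correlation between the estimates. The function name and arguments are hypothetical; it covers only the case where a correlation between treatment effects is assumed directly, and does not reproduce the paper's formulae for approximating the covariance from a correlation between outcomes.

```python
import numpy as np

def within_study_cov(se1, se2, rho=0.0):
    """Within-study covariance matrix of two effect estimates.

    se1, se2: standard errors of the two estimates in one study.
    rho: assumed correlation between the two estimates
         (rho = 0 corresponds to ignoring within-study correlation).
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    off_diag = rho * se1 * se2  # covariance implied by the assumed correlation
    return np.array([[se1 ** 2, off_diag],
                     [off_diag, se2 ** 2]])
```

With `rho=0.0` the matrix is diagonal, so any association between the two estimates has to be absorbed elsewhere in the model, which is consistent with the inflated between-study correlation estimates reported for MM(0).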
Estimation from MM($\rho_o$) is generally not worse than that from MM($\rho_e$). The particular situation in which we expect MM($\rho_o$) to perform better is when sample sizes vary across studies (the even-numbered scenarios), since the covariances between treatment effects then vary across studies even if the covariances between outcomes remain the same. We observe such a pattern for situations in which between-study correlation is low, but not when between-study correlation is high (which might be explained in part by our relatively small underlying heterogeneity variance for the second outcome, which does not allow the high between-study correlation to manifest itself).
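The dependence of the covariance between treatment effects on sample size can be illustrated with the simpler case of two continuous outcomes measured on the same participants. Assuming, purely for illustration, that the standard deviations and the between-outcome correlation are the same in both arms, the covariance between the two mean differences is the between-outcome covariance scaled by $1/n_t + 1/n_c$. The function below is a hypothetical helper for this standard result, not the paper's formula for the mixed continuous and dichotomous case.

```python
def cov_mean_differences(rho, sd1, sd2, n_t, n_c):
    """Covariance between two mean differences from the same participants.

    Assumes both outcomes are observed on every participant and that the
    outcome SDs (sd1, sd2) and correlation (rho) are equal across arms.
    """
    outcome_cov = rho * sd1 * sd2            # covariance between outcomes
    return outcome_cov * (1.0 / n_t + 1.0 / n_c)

# Same outcome covariance, different study sizes:
small_study = cov_mean_differences(0.9, 1.0, 1.0, n_t=50, n_c=50)
large_study = cov_mean_differences(0.9, 1.0, 1.0, n_t=200, n_c=200)
```

Here the covariance between outcomes is 0.9 in both studies, yet the covariance between the treatment effect estimates is four times larger in the study with 50 participants per arm than in the study with 200 per arm.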
Finally, Tables B3 and B4 show that, if there is a large number of studies (n=50), the bias in the between-study variance estimate is reduced by using a multivariate approach, and the bias in the between-study correlation estimate is smallest under our proposed approach. However, when there are only a few studies (n=10), no clear improvement is observed in the multivariate approach over the univariate approach, unless the within-study correlations are high and the between-study correlations are low.

MM: multivariate meta-analyses; MM(0): assuming zero within-study correlations; MM($\rho_e$): assuming common non-zero within-study correlations between treatment effects; MM($\rho_o$): assuming common non-zero within-study correlations between outcomes. $n_t$ and $n_c$: number of participants in the treatment and control groups, respectively; $n_s$: number of studies; $\tau_1$ and $\tau_2$: between-study standard deviations for outcome 1 and outcome 2, respectively; $\rho_b$: between-study correlation coefficient for overall effects; $\rho_w$: within-study correlation coefficient for outcomes (before dichotomization); $p_t$ and $p_c$: event rates in the treatment and control groups, respectively.