Measures for concordance and discordance with applications in disease control and prevention

Bivariate binary response data appear in many applications. Interest goes most often to a parameterization of the joint probabilities in terms of the marginal success probabilities in combination with a measure for association, most often being the odds ratio. Using, for example, the bivariate Dale model, these parameters can be modelled as function of covariates. But the odds ratio and other measures for association are not always measuring the (joint) characteristic of interest. Agreement, concordance, and synchrony are in general facets of the joint distribution distinct from association, and the odds ratio as in the bivariate Dale model can be replaced by such an alternative measure. Here, we focus on the so-called conditional synchrony measure. But, as indicated by several authors, such a switch of parameter might lead to a parameterization that does not always lead to a permissible joint bivariate distribution. In this contribution, we propose a new parameterization in which the marginal success probabilities are replaced by other conditional probabilities as well. The new parameters, one homogeneity parameter and two synchrony/discordance parameters, guarantee that the joint distribution is always permissible. Moreover, having a very natural interpretation, they are of interest on their own. The applicability and interpretation of the new parameterization is shown for three interesting settings: quantifying HIV serodiscordance among couples in Mozambique, concordance in the infection status of two related viruses, and the diagnostic performance of an index test in the field of major depression disorders.


Introduction
In medical applications as well as in other fields, it is often of interest to examine the ''resemblance'' between two or more observations from paired or matched outcomes. Here the focus is on two binary paired or matched outcomes. Examples considered in this paper are the HIV status among couples; the infection statuses for the same individual for both the Varicella-Zoster Virus and the Parvo B19-virus, viruses that are similar in their transmission being close contact; and the diagnostic performance of the Whooley questions as a screening tool for depression amongst older adults in primary care. Resemblance can be measured in different ways, depending on the characteristic of interest. It could be represented by an association parameter, such as the Pearson productmoment correlation or, for binary data, by the cross-product ratio or odds ratio. However, often association is not of interest but rather agreement. Agreement and association are in general distinct facets of the joint distribution. Strong agreement requires strong association, but strong association can exist without strong agreement. 1 A wellknown measure for agreement is Cohen's kappa, see Agresti 1 for extensions and ways to model agreement.
Measures for association and agreement are typically symmetric and can be misleading if one of the agreeing outcomes is very dominant, such as the (negative, negative) combination in our first example of the HIV status among couples. Indeed, as the majority of pairs agree in being negative, symmetric measures of association or agreement might be high even if there is only a small number of agreeing positive pairs. In the context of measuring synchrony in neuronal firing, Faes et al. 2 proposed a new measure of synchrony, the conditional synchrony measure (CSM), which is the probability of two neurons firing together, given that at least one of the two is active. Faes et al. state that, although the odds ratio is an attractive association measure with nice mathematical properties (such as the absence of range restrictions, regardless of the marginal probabilities), it is less suitable for quantifying synchrony, due to its symmetry treating 0-0 matches of equal importance as 1-1 matches. Similar to the CSM but being rather interested in discordance, Juga et al. 3 defined the HIV conditional (sero)discordance measure (CDM) as the conditional probability that the couple is HIV discordant, given that at least one of them, man or woman, is HIV positive.
As noted by Faes et al. and Juga et al., a reparameterization of the joint bivariate binary distribution in terms of the marginal ''success'' probabilities and with the OR parameter replaced by the CSM (or the CDM) does not lead to a permissible joint distribution for the full ranges of all parameters, as the Fre´chet bounds can be violated. 4 This puts constraints on the parameters of the joint bivariate distribution which are difficult to translate to the regression parameters when introducing dependency of the parameters on risk factors and other covariates. Moreover, the constraints hinder fitting the models, leading to computational issues such as convergence problems. The objective of this paper is to solve this non-permissibility problem, and to introduce an alternative parameterization guaranteeing a permissible distribution for all combination of values. The alternative parameterization, no longer including the marginal success probabilities, is shown to be of interest on its own, and offers additional insights for particular applications.
In the next section, three settings with illustrative datasets are introduced. Then the new measures and the new parameterization are presented and covariate models for the different parameters and maximum likelihood inference is briefly described. The illustrative datasets are analyzed using the new parameterization and the paper ends with final conclusions, considerations and ideas about further research.

Applications and datasets
In the following sections, three different settings and specific data examples are introduced in the field of disease control and prevention.

HIV serodiscordance among couples in Mozambique
We consider the same setting as in Juga et al., 3 based on data from the 2009 National Survey of Prevalence, Risk Behavioural and Information about HIV and AIDS (INSIDA 5 ). This survey was a cross-sectional two-stage survey, carried out by the National Institute of Health in collaboration with the National Bureau of Statistics of Mozambique. The objective is to model HIV serodiscordance among couples as a function of different risk factors and other covariates.
Let y ij ¼ ð y ij1 , y ij2 Þ denote the HIV status (1 if positive, 0 if negative) of a (female, male)-couple j ¼ 1, . . . , n i in enumeration area (EA) i with n i sampled couples, i ¼ 1, . . . , N. For the INSIDA data, the total number of EAs is N ¼ 270 and P N i¼1 n i ¼ 2159 is the total number of couples. Expressing that the covariates x 1,ij , x 2,ij and x 3,ij can be possibly different subvectors of the full covariate/factor vector x ij , Juga et al. 3 fitted several joint models for both marginal probabilities to be HIV positive, complemented with a new conditional (sero)discordance measure CDM (defined in Section 3): where are distributed as a trivariate normal distribution with mean zero-vector and covariance matrix AE. Note that x ',ij (' ¼ 1, 2, 3) are vectors of covariates, and some of these covariates are specific for the male/female individual, some specific for the couple, and some are at the level of the province. They considered different choices for the covariance matrix AE ¼ VRV (including full/partial correlated, full/partial shared, full/partial equal, independent). From their final model, Juga et al. 3 concluded that the HIV prevalence for the province where a couple was located as well as the union number for the woman within a couple is a factor associated with HIV serodiscordance.
As will be discussed in Section 3, the parameterization with both HIV marginal probabilities and the conditional discordance measure CDM does not satisfy the Fre´chet inequalities, 4 causing computational difficulties with some of the models. In Section 5.1 we will reanalyze these data with the same type of models, but based on our new parameterization as introduced in Section 3.2.

Varicella Zoster Virus and Parvo B19 concordance
The Varicella-Zoster Virus (VZV) and the Parvo B19-virus (B19) are similar in that transmission occurs during close contacts. The contact rate and the infectiousness of the pathogen determine the spread of the infection in a population. It has been shown that the contact rate depends on age through heterogeneity in mixing of individuals from different age-classes. Several approaches have been proposed to model multi-sera data. Hens et al. 6 used a marginal model (bivariate Dale model 7 with odds ratio as association parameter) and conditional models (modelling one infection status conditional on the other) to model the multi-sera VZV-B19 data from Belgium. Hens et al. 8 studied the behaviour of the bivariate-correlated gamma frailty model for cross-sectionally collected serological data on Hepatitis A and B.
Here we reanalyze the Belgian VZV-B19 serological data. In a period from November 2001 until March 2003, 2381 serum samples in Belgium were collected and consecutively tested for VZV and B19. 9 Together with the test result for VZV and B19, gender and age of the individuals were recorded. Samples from children under 6 months were omitted, as test results are driven by maternal antibodies in this early stage of life. The maximum age of 40 was fixed by design; it was considered not important to test for older ages given that it concerns childhood infections. Figure 1 depicts the bivarate distribution of VZV and B19 as a function of age. In Section 5.2, we will propose and discuss new measures to provide other and new insights in the joint occurrence of both infections, as function of age and gender.

Diagnostic performance and concordance of the Whooley questions
Based on a cross-sectional validation study, conducted with 766 patients aged !75 from UK primary care and recruited via 17 general practices based in the North of England during the pilot phase of a randomized controlled trial, Bosanquet et al. 10 assessed the diagnostic performance of the Whooley questions (Whooley et al. 11 ) as a screening tool for major depression disorder (MDD) amongst older adults in UK primary care. Sensitivity, specificity, and likelihood ratios comparing the index test (two Whooley questions) for an MDD-diagnosis were ascertained by the reference standard Mini International Neuropsychiatric Interview (MINI 12 ). Participants completed a self-reported, written version of the index test, the Whooley questions: (WQ1) During the past month, have you often been bothered by feeling down, depressed, or hopeless? (yes ¼ 1/no ¼ 0); (WQ2) During the past month, have you often been bothered by little interest or pleasure in doing things? (yes ¼ 1/no ¼ 0). In the standard method of scoring the Whooley questions, participants who respond yes to at least one of the two questions were classified as screening positive for depression. Table 1 shows a 2 Â 2 table cross-classifying the index test (positive if being positive for at least one Whooley question) with MINI as the golden standard reference (GSR), as well as tables cross-classifying both Whooley questions against each other, unconditionally and conditional on the GSR status. Concordance between index and golden standard reference, and concordance and discordance between both Whooley questions are of interest and will be discussed in Section 5.3.

Measuring con(dis)cordance and (a)synchrony
First, we briefly review existing measures, including the conditional synchrony measure. Next the new parameterization is introduced, discussed and relations with other parameters are examined. A final section focuses on the estimation of the new parameters by maximum likelihood.

Existing measures
Consider a bivariate binary outcome y ¼ ð y 1 , y 2 Þ and a (possibly multivariate) covariate x and let  where 0 5 k' ðxÞ 5 1 and P 1 k,'¼0 k' ðxÞ ¼ 1, denote the conditional joint distribution of y given x (dependency of x suppressed from notation, if not relevant), with marginal conditional (success) probabilities 1þ ðxÞ ¼ 10 ðxÞ þ 11 ðxÞ and þ1 ðxÞ ¼ 01 ðxÞ þ 11 ðxÞ. Many parameterizations are theoretically possible, but in many cases interest goes in the effect of x on the marginal probabilities (most often with a logit link allowing an odds ratio interpretation), complemented with an association parameter, such as the correlation , (with a Fisher-z link) or the odds ratio ¼ ð 00 11 Þ=ð 01 10 Þ (with a log link).
But such association measures are not the target parameter of interest in case interest goes to con(dis)cordance or (a)synchrony. In the context of measuring synchrony in neuronal firing, Faes et al. 2 stated that the odds ratio is less suitable to quantify synchrony due to its symmetry, treating 0-0 matches of equal importance as 1-1 matches, and proposed a new measure of synchrony, the conditional synchrony measure CSM, defined as being the probability of two neurons firing together, given that at least one of the two is active. In order to model HIV serodiscordance among couples in Mozambique, Juga et al. 3 introduced the conditional (sero)discordance measure CDMðxÞ ¼ 1 À CSMðxÞ and showed that the CDM measure is a more direct and relevant measure to study the effects of risk factors. A limitation of this parameterization, with the marginal probabilities combined with CSMðxÞ or CDMðxÞ, is that not all values of 1þ ðxÞ, þ1 ðxÞ and of CSMðxÞ (or CDMðxÞ) result in a permissible joint distribution k' ðxÞ. Indeed, the Fre´chet bounds 4 need to hold: given values 0 5 1þ ðxÞ 5 1 and 0 5 þ1 ðxÞ 5 1, a permissible joint distribution is only obtained if max 1 À 1þ ðxÞ þ1 ðxÞ , 1 À þ1 ðxÞ 1þ ðxÞ CDMðxÞ minf 1þ ðxÞ þ þ1 ðxÞ, 1g ð 5Þ When modelling the dependency on x and possibly including additional random effect structures on all three parameters ( þ1 , 1þ and CDM or CSM) to account for additional heterogeneity, the constraints (equation (5)) may cause problems when fitting some models (computational issues, non-convergence,. . .).

New measures and new parameterization
While it is common to include the two marginal probabilities as part of a model parameterization, particular alternative parameters might be of interest too and might shed more light on the research questions at hand. In the three applications of interest, the focus is not on the marginal probabilities. The new parameterization proposed here abandons the common starting point of adding the parameter of interest to the marginal probabilities, but takes an opposite approach: next to the CSM or CDM measure, which other parameters of interest can be introduced to obtain a complete parameterization of the joint distribution?
As a first parameter, define the conditional probability that y 1 is positive, given that both disagree or, alternatively, 1 À being the probability that y 2 is positive in a disagreeing pair (y 1 , y 2 ). This parameter focuses on disagreeing pairs and the probability that one is more dominant than the other. Note that ¼ 0:5 implies symmetry ( k' ¼ 'k ) and hence marginal homogeneity ( 1þ ¼ þ1 ), and 4 0:5 implies that 1þ 4 þ1 . Actually is the central parameter in McNemar's (exact) test for matched pairs with, under the null hypothesis of marginal homogeneity, $ Binðn Ã , 0:5Þ with n* the total number of disagreeing pairs. 1 This implies that, although the two marginal success probabilities 1þ , þ1 are no longer model parameters, their equality 1þ ¼ þ1 can still be directly tested, while accounting for covariate effects. Actually, instead of the parameter , one could use the relative difference In the sequel, we will use definition (6) and refer to it as the (marginal) homogeneity parameter.
As a second parameter, define (as before) the ''positive'' conditional synchrony measure CSM, being the probability that both are agreeing (both positive) given that at least one is positive or, alternatively, the (positive) CDM, now denoted as þ ¼ 1 À þ . Finally, define the third parameter as the ''negative'' conditional synchrony measure, being the probability that both are agreeing (both negative), given that at most one is positive Again, the third parameter can also be defined as (negative) CDM, being À ¼ 1 À À . The homogeneity parameter determines the relative ratio of the off-diagonal probabilities of disagreement, independently of the values of þ and À , whereas the parameters þ , À (or alternatively þ , À ) focus on the diagonal probabilities of agreement, and The measure þ tends to 1 if and only if À does so. The parameters þ , À , þ , À are invariant for switching y 1 with y 2 , but will switch to 1 À . Depending on the application and the particular parameters of interest, one can opt for a particular combination, e.g. , þ and À . The joint probabilities k' can easily be expressed in terms of the new parameters, e.g. in terms of , þ and À It can also readily be shown that, for any combination of values for the three conditional probabilities 0 5 , À , þ 5 1, the Fre´chet bounds are satisfied, and we obtain a permissible joint distribution k' .
The marginal success probabilities can be written as 1þ ¼ ð1 À À Þð1 À ð1 À þ Þð1 À ÞÞ 1 À À þ , þ1 ¼ ð1 À À Þð1 À ð1 À þ ÞÞ 1 À À þ and the odds ratio as Identity (9) shows that the odds ratio decomposes in three factors, each related to one of the three new parameters. The association in terms of the odds ratio increases multiplicatively with the odds of both synchrony measures À and þ , and converges to infinity as the homogeneity parameter tends to 0 or 1. The minimal value of , for fixed values of À and þ , is obtained for ¼ 0:5, corresponding to marginal homogeneity. Of course, this factorization is not helping in characterizing independence, but independence is not of interest in our settings of interest.
A bit different and more ''asymmetric setting'' is that of measuring the accuracy of diagnostic tests. Assume y 1 represents the true disease status, and y 2 another alternative test. Sensitivity Se ¼ Pð y 2 ¼ 1j y 1 ¼ 1Þ and specificity Sp ¼ Pð y 2 ¼ 0j y 1 ¼ 0Þ relate to the new parameters as Se ¼ þ þ þ ð1 À þ Þ , Sp ¼ À À þ ð1 À À Þð1 À Þ and for the positive predictive value PPV ¼ Pð y 1 ¼ 1j y 2 ¼ 1Þ and the negative predictive value NPV ¼ Pð y 1 ¼ 0j First of all, note that Se and PPV do not depend on À and Sp and NPV do not on þ . As to be expected, Se and NPV decrease whereas Sp and PPV increase with the homogeneity parameter . If is very close to 1, Se % þ and Sp % 1. Furthermore, in that case, PPV % 1, and NPV % À . Similarly, if is very close to 0, Se % NPV % 1, Sp % À and PPV % þ . Se and PPV increase with þ and Sp and NPV increase with À . All of them tend to 1 whenever þ tends to 1 (and thus À tends to 1).
The diagnostic odds ratio is used as a measure of the effectiveness of a diagnostic or screening test. It is independent of prevalence, and it is a single indicator of test performance, ranging from zero to infinity and higher values (above 1) are indicative of better test performance. 13 Equation (10) decomposes the DOR in three factors: given the value of the marginal homogeneity parameter , higher values of the DOR correspond to higher values of one or both synchrony measures þ , À . Although our application as introduced in Section 2.3 is based on a cross-sectional study, allowing to estimate the disease prevalence by the case study prevalence, this might be not the case for other study designs. Consider for instance the case-control design, with data about a screening test result for the diseased and non-diseased subpopulation (as defined by a golden standard or reference test). Such a design allows the estimation of the sensitivity and specificity, but not the prevalence of the disease. The formulas with Prev ¼ 1þ the prevalence of the disease, show the dependency of the homogeneity parameter and both conditional synchrony measures on the disease prevalence. These formulas allow us to combine data with knowledge about the disease prevalence (from other data or literature) to estimate or to model the parameters , þ and À for a case-control study (typically in the Bayesian paradigm).
The dependency of the three conditional probabilities , þ and À (or any other eligible combination of interest from the sets f þ , þ g and f À , À g) on covariates can be modelled with three components where h 1 , h 2 and h 3 are link functions (logit, probit, cloglog,. . .). We will focus on the logit link as it allows a more appealing interpretation of covariate effects in terms of odds ratios. The model components (11) can be embedded in different frameworks of estimation and inference; we will opt for full maximum likelihood. Note that, when using the relative difference parameter 2 À 1, ranging from À1 to 1, the logit link is not appropriate but rather a Fisher-z link would be in order. The identity, with Á r ¼ 2 À 1 implies that models for , 1 À and Á r with respective links are identical (only opposite slopes for the models for 1 À ). Depending on the application at hand, one might be more interested in interpreting the estimates in terms of , 1 À or Á r . For the latter choice, a zero intercept would reflect marginal homogeneity for all covariate values equal to 0, and the effect of a covariate as represented by the estimated slope would reflect non-homogeneity in one or the other direction: a positive slope would indicate a higher marginal probability of success in the first variable, and a negative slope would indicate a higher marginal probability of success in the second variable.

Applications
In this section we revisit the three applications introduced in Section 2 and show how in each example model (11) can be formulated and we illustrate the use and interpretation of the three conditional probabilities , þ and À (or variations thereof). Data analyses were performed in SAS version 9.4 using PROC NLMIXED (exemplifying code in the supplemental material).

Modelling HIV serodiscordance among couples in Mozambique
We reanalyse the HIV data introduced in Section 2.1 based on model components for i) the (homogeneity) probability ij that the female partner of couple j in EA i is HIV positive, given that both partners differ in their HIV status; ii) the probability þij that only one is HIV positive, given that at least one of the two partners is positive (positive serodiscordance); and iii) the probability Àij that both are negative given that at most one of the two partners is positive (negative seroconcordance) trivariate normally distributed random EA-effects, with mean zero-vector and covariance matrix AE.
Most of the sample designs for household surveys such as INSIDA are complex and involve stratification, multistage sampling, and unequal sampling rates, and it is necessary to account for the particular survey design in the statistical analyses using appropriate weights. We followed the same approach as Juga et al. 3 For more details on the calculation of the weights as used in our analyses, we refer to Juga et al. 3 After following the same model building procedure as in Juga et al., the best fitting final model had no random EA-effect b ,i on parameter ij and correlated random EA-effects ðb þ ,i , b À ,i Þ on ð þij , Àij Þ. This model has an AIC value of 2509.2, considerably improving the fit of the best model in Juga et al. 3 (AIC ¼ 2571.8 for a partial-equal random effects type of model). All models converged, in contrast to the experiences of Juga et al., 3 who reported that the models with independent random effects and full or partial random effects did not convergence. Table 2 shows the estimates of the final model. The following covariates appear in the final model with fixed effects: 'HIV prevalence' is the prevalence of HIV at the couple's residence at the level of the province, categorized into three categories using cutpoints of 5% and 15% and with 0-5% as reference category; 'Union number woman' refers to whether the woman has been married or lived with a man once (reference category) or more than once; 'STI man' is yes if the man answered yes to any of three questions about symptoms of sexually transmitted infections STIs (no is reference); 'Condom use woman' refers to whether the female respondent of the couple used a condom the last time she had sexual intercourse with the other partner (yes is reference); 'Wealth index' refers to the economic status of the couple with three categories Poorer, Middle and Richer (reference). For more details, see Fishel et al. 14 Comparing our results with those of Juga et al. 3 and focusing on the common conditional serodiscordance parameter CDM ¼ þ in both models, similar effects were obtained for the effect of 'HIV prevalence' and 'Union number woman'. Additional to those effects, our new model identified a significant effect of 'STI man' and 'Wealth index': the probability for both partners of a couple to differ in HIV status, given that at least one of both is Note: Estimates (standard errors) of the final model with no random EA-effect for parameter ij and correlated random EA-effects ðb þ,i , b À,i Þ on ð þij , Àij Þ, with variance components 2 þ , 2 À and correlation þÀ . -A-sign in a column refers to a non-significant effect at 5% after which the covariate was deleted (stepwise). a Significant at 5% level based on a likelihood ratio test. b Significant at 5% level, using 2 0,1 -mixture.
positive, decreases in case the man has reported STIs and increases in case the couple's wealth index is poorer rather than richer. The negative synchrony depends on the 'HIV prevalence' (the higher the prevalence the lower the synchrony), the 'Union number woman' (lower synchrony in case the woman has been married or lived with a man more than once) and the 'Wealth index' (higher synchrony for the middle category).
Finally for the homogeneity parameter , it can be observed that marginal homogeneity (both partners having the same probability to be HIV positive) does not hold in case the woman has been married or lived with a man only once and the man has indicated to have no STI symptoms (95% CI [0.166, 0.491] for , implying the probability to be HIV positive is lower for the woman) and in case the woman has been married or lived with a man more than once and the man has indicated to have STI symptoms (95% CI [0.517, 0.960] for , implying the probability to be HIV positive is higher for the woman).

Varicella Zoster Virus and Parvo B19 concordance
Let y 1 refer to the B19 infection status and y 2 to the VZV infection status of the same individual. We are interested in the dependency of , þ and À on age and gender. Table 3 shows the observed bivariate frequencies of (y 1 , y 2 ), unconditionally and conditionally on gender. Model (11) is applied with an indicator for gender and a cubic spline for age restricted to be linear before the first knot and after the last knot, with knots located at the 9 deciles of the observed age distribution. Gender was not significant for any parameter (p-values 0.8762, 0.2002, 0.8641 for , þ and À respectively). Age has a significant effect on all three parameters , þ and À (p-values 0.004, <0.0001, <0.0001, respectively). Figure 2 visualizes the dependencies on age. For an individual who is positive for one and negative for the other virus, the probability is lowest that he/she is positive for B19 (about 0.10) across all ages. Marginal homogeneity clearly does not hold, for any age (p < 0.00001). If an individual is positive for at least one virus, the probability þ that he/she is positive for both increases from 0.10 to about 0.80 with a strange bump at the age of 20. The negative conditional synchrony À , being the probability he/she is negative for both given that he/she is at most positive for one of the virus, decreases rapidly during the first 10 years of life, from about 0.80 to about 0.05. The bump around the age of 20, visible more or less for all curves, is caused by an ''artefact'' in the data, in the sense that the prevalence to be positive for one only, or for both is expected to be monotone as a function of age (the older you are the higher the probability of ever been infected by one of the diseases). Already Figure 1 shows that this expected monotonicity constraint is violated by the patterns shown in the data. This phenomenon and possible model modifications and extensions (covering, e.g., waning immunity) have been presented and discussed in, for example, Abrams and Hens. 15 Using a bivariate Dale model and splines for the effect of age, Hens et al. 6 could not reject the null hypothesis of a constant OR (p ¼ 0.37). The estimated age and gender independent OR equaled 2.11 with 95% confidence interval (1.45, 3.23). This is another example showing that measures for agreement and for dependency can behave quite differently.

Diagnostic performance and concordance of the Whooley questions
Let y 1 refer to the GSR and y 2 to the index test of the same individual. Bosanquet et al. 10 Table 4 shows estimates for all parameters of interest. Note that the homogeneity parameter ¼ 0:007 is very small, implying marginal heterogeneity and that, as noticed in general, Sp % À , PPV % þ and Se % NPV % 1. So, if index and reference test disagree, it is highly unlikely that the reference test is the positive one. If at least one of the index or reference test is positive, it is unlikely that both are positive (probability of 0.107), so low positive synchrony or concordance. If at least one is negative, they are both negative with probability 0.625. Next it is interesting to get more insight in the (con/dis)cordance between both Whooley questions, defining the index test, and how this depends on the GSR status. So, let y 1 now refer to WQ1 and y 2 to WQ2. Table 5 shows the estimated parameters, unconditionally and conditionally on the GSR status. SAS code for the model with parameters modelled as a function of the GSR status (MINI) is included in the Supplementary material.
The homogeneity probability does not depend on the GSR status (p ¼ 0.6189). As ?ð À , þ Þ, it is not necessary to refit a simplified model with no effect of the GSR status on . Indeed, fitting such model would lead to exactly the same results for þ j GSR ¼ 0 or 1 and À j GSR ¼ 0 or 1, and the GSR independent estimate for would equal the estimate 0.699 in the collapsed table, being the first line of results in Table 5. So, whether or not the individual is depressed, if the answers on the two WQ's disagree, the probability is about 70% that WQ1 is answered positively (95% CI being [0.616, 0.770], marginal homogeneity does not hold). So, individuals with disagreeing answers on both Whooley questions tend to have been more often bothered by feeling down, depressed, or hopeless, than bothered by little interest or pleasure in doing things, regardless of their GSR status.  But þ and À significantly depend on the GSR status, with p-values 0.0011 and 0.0103, respectively. The positive synchrony measure estimate þ increases from 0.520 to 0.849 implying that, given that at least one of the WQ's is answered positively, the probability that both are positively answered increases substantially for individuals suffering from a major depression disorder (according to GSR). On the other hand, the negative synchrony measure estimate À decreases from 0.778 to 0.286 implying that, given that at most one of the WQ's is answered positively, the probability that both are negatively answered decreases substantially for individuals suffering from a major depression disorder (according to GSR).
Bosanquet et al. 10 mentioned in the discussion that the use of the two-item version of the Whooley questions, rather than a three-item version, in which the respondent is asked to state whether they would like help for any difficulties reported, is a potential limitation of their study. However, evidence from recent studies using a third help question, does not provide a conclusive answer whether to include or not the third question. It would be interesting to study in more depth the con-and discordance between all three questions in order to come up with an improved index test.

Conclusions and discussion
As interest goes to modelling a genuine synchrony/concordance measure rather than a typical association measure such as the odds ratio, the joint distribution of matched pairs of binary data need to reparametrized accordingly. In this contribution, a new parameterization solved the existing permissibility issue with the conditional synchrony measure and related limitations of fitting appropriate models. This new parameterization is based on two synchrony measures, a positive and negative synchrony (or alternatively discordance) parameter, combined with a marginal homogeneity parameter, leading without any restrictions on any of these parameters to a permissible joint distribution for the matched binary pairs, thus facilitating the fitting of more flexible and appropriate models.
The usefulness of the new approach has been illustrated in three different areas of application in disease control and prevention. In the first application, the positive serodiscordance was the main parameter of interest, but also the additional negative seroconcordance provides alternative new insights in this field. While the same characteristic (HIV status) is measured for both partners of a couple in the first application, two different characteristics (VZV and B19 infection status) on one and the same individual are available in the second application. The negative and positive synchrony measures provide new information and insights in the joint process of acquiring both diseases having similar transmission routes. In a third, more distinct application, the accuracy of a screening test is to be assessed in relation to the true disease status (or a gold standard). The new synchrony measures allow to investigate the performance of the diagnostic test from another angle, different from but closely related to well-known accuracy measures such as sensitivity, specificity, predictive values, DOR, etc. and future use of these new measures will shed more light on their ultimate value in this particular field of application.
An advantage of the new approach is that, in case interest only goes to both synchrony measures and their models do not share any parameter in common with the model for the marginal homogeneity parameter, the disagreeing observations can be collapsed and the synchrony measures can be modelled by means of a simplified trinomial likelihood. It may be perceived as a disadvantage that the marginal success probabilities (such as the prevalence of one or both diseases) are not directly estimated or modelled as a function of covariates, as in the currently applied parameterizations. On the other hand, the homogeneity parameter still allows to investigate structural differences in both marginal parameters. Moreover, models for the new parameters , þ , À imply indirect models for the marginal success probabilities using their relation equations. A first interesting methodological topic for further research is the modification and application of the HIV model of the first example to same-sex couples, as already mentioned by Juga et al. 3 They suggested two approaches to deal with the exchangeability of both partners of a couple. But the parameterization proposed here offers an interesting third option. Indeed only the marginal homogeneity parameter depends on the order of the partners in a couple and would not be interpretable when using a random order (actually one would expect homogeneity in case of a random order). Being orthogonal to the other parameters, it would not affect the estimation of the synchrony measures.
Another interesting extension is to examine synchrony between three or more outcomes, as for instance three related infections in our second example. As the number of parameters grows exponentially with the outcomedimension, defining a full set of appropriate homogeneity and synchrony parameters needs careful considerations in view of the application of interest. For the last example, one often includes an inconclusive category, introducing one or both as a trinomial outcome. An extension in that direction poses interesting challenges. Also in the latter setting, one might like to account for an imperfect GSR by correcting for misclassifications.