Natural History of Multiple System Atrophy in North America: A Prospective Cohort Study

Background: Multiple system atrophy (MSA) is a rare, fatal neurodegenerative disorder exhibiting a combination of parkinsonism and/or cerebellar ataxia with autonomic failure. We report the first North American prospective natural history study of MSA, and the effects of phenotype and autonomic failure on prognosis.

Methods: 175 subjects with probable MSA, both MSA-P and MSA-C, were recruited and prospectively followed for 5 years with evaluations every 6 months in 12 centers. Natural history was evaluated by Kaplan-Meier survival analysis. We compared MSA-P with MSA-C and evaluated predictors of outcome. These subjects were evaluated with UMSARS I (a functional score of symptoms and ability to undertake activities of daily living), UMSARS II (neurological motor evaluation), and the Composite Autonomic Symptoms Scale (COMPASS)-select (a measure of autonomic symptoms and autonomic functional status).

Findings: Mean age of symptom onset was 63.4 (SD 8.57) years. Median survival from symptom onset by Kaplan-Meier analysis was 9.8 years (95% CI 8.8-10.7). Subjects with severe symptomatic autonomic failure (symptomatic orthostatic hypotension, urinary incontinence) at diagnosis had a worse prognosis, surviving a median of 8.0 years (95% CI, 6.5-9.5, n=62), while the remaining subjects survived a median of 10.3 years (95% CI, 9.3-11.4, n=113). At baseline, MSA-P (n=126) and MSA-C (n=49) did not differ in symptoms and function: UMSARS I, 25.2 (8.08) vs 24.6 (8.34), p=0.835; UMSARS II, 26.4 (8.77) vs 25.4 (10.51), p=0.7635; COMPASS-select, 43.5 (18.66) vs 42.8 (19.56), p=0.835. Progression, evaluated by change in UMSARS I, UMSARS II, and COMPASS-select over the next 5 years, was not significantly different between MSA-P and MSA-C. Median time to death from enrollment baseline was 1.8 (95% CI, 0.9-2.7) years.

Interpretation: Probable MSA represents late-stage disease with short survival. The natural histories of MSA-P and MSA-C are similar. Severe symptomatic autonomic failure at diagnosis is associated with worse prognosis.

Funding: National Institutes of Health (P01 NS044233), Mayo CTSA (UL1 TR000135), the Kathy Shih Memorial Foundation, and Mayo funds.


Introduction
Multiple System Atrophy (MSA) is a neurodegenerative disorder expressing a combination of autonomic failure, parkinsonism and/or cerebellar ataxia, 1 with an annual incidence of 3/100,000 for subjects aged 50-99 years. 2 Disease progression is typically inexorable. The cause of MSA is unknown, although it is likely linked to alterations in α-synuclein with subsequent formation of glial cytoplasmic inclusions and selective neuronal pathology. 3,4 Significant progress has been made to improve certitude of diagnosis. There is excellent agreement between Consensus Criteria 5,6 and post-mortem confirmation of diagnosis. 7,8 Observational and retrospective studies, including autopsy-confirmed studies of MSA, have provided important information on phenotype and natural history. 1,9-12 Validation with prospective studies, however, has been more limited. Earlier studies 13,14 did not use validated MSA-specific instruments. Recently, a prospective natural history study of 141 MSA subjects followed over 2 years has provided novel information on MSA natural history in Europe. 15 We report here a North American prospective study of 175 MSA subjects followed over 5 years. We included both MSA-Parkinsonism (MSA-P) and MSA-Cerebellar (MSA-C) in order to compare their natural history. The key objectives of our study are to determine prospectively (1) the life expectancy of MSA subjects; (2) the influence of phenotype (MSA-P vs MSA-C) on natural history; and (3) prognostic indicators, especially whether early onset of autonomic symptoms influences prognosis.

Subjects and Evaluation
We studied subjects enrolled at twelve U.S. neurology centers specializing in movement and/or autonomic disorders in an observational and risk factor study of MSA. 16 Subjects were followed biannually. All centers obtained Institutional Review Board approval. All subjects provided written informed consent and met Consensus Criteria for probable MSA. 5,6 Each investigator reviewed a UMSARS training video prior to enrolling subjects to ensure scoring consistency across sites. One hundred and seventy-five subjects completed a baseline evaluation, and available subjects were followed every 6 months thereafter for 5 years. To minimize problems associated with delayed recall, we provided inclusion/exclusion criteria for both diagnosis and symptoms. Baseline assessments were completed at the study facility and annually onsite thereafter. Questionnaires were sent via mail to subjects at the 6, 18, 30, 42, and 54 month time points; telephone interviews were completed by the enrolling physician to gather UMSARS data if the questionnaires were not returned.
We followed Consensus Criteria 5,6 for inclusion and exclusion of MSA and for designation of MSA-P and MSA-C. The full inclusion/exclusion criteria are provided in appendix A. Subjects were classified by MSA subtype based on study examinations, medical records and, as needed, information from the treating physician. Subjects were categorized as MSA-P if they exhibited parkinsonism but no cerebellar features, or if parkinsonism preceded cerebellar signs by at least one year. For subjects with both cerebellar signs and parkinsonism, we designated them by onset of first symptom (ataxia or symptoms of parkinsonism). Onset of first symptom was determined from the EMSA-SG minimal data set, which details patient symptoms and the date, to the nearest month, when these symptoms first developed. If the dates were not reported by patients, or they had difficulty recalling onset, we resorted to other sources including relatives, spouses, and medical history to determine the date of onset. MSA-C subjects were defined as those with predominant cerebellar signs but minimal or no parkinsonism, or in whom cerebellar signs preceded parkinsonism by at least one year. Severe symptomatic autonomic failure was defined as an orthostatic fall in blood pressure (by 30 mm Hg systolic or 15 mm Hg diastolic) or urinary incontinence (accompanied by erectile dysfunction in men) or both. Levodopa responsiveness was defined as a significant and sustained improvement in motor function observed by the patient after drug administration.
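The definition of severe symptomatic autonomic failure given above can be written as a simple rule. The following is a minimal sketch, not the study's actual data-processing code: the class `AutonomicFindings`, its field names, and the function `has_severe_autonomic_failure` are hypothetical, and treating the blood pressure thresholds as "at least 30/15 mm Hg" is an assumption about how the criterion is applied.

```python
from dataclasses import dataclass


@dataclass
class AutonomicFindings:
    """Hypothetical record of one subject's autonomic findings at diagnosis."""
    systolic_drop_mmhg: float    # orthostatic fall in systolic blood pressure
    diastolic_drop_mmhg: float   # orthostatic fall in diastolic blood pressure
    urinary_incontinence: bool
    erectile_dysfunction: bool
    is_male: bool


def has_severe_autonomic_failure(f: AutonomicFindings) -> bool:
    """Apply the stated criterion: orthostatic fall or urinary incontinence, or both.

    Assumption: thresholds are inclusive (>= 30 systolic or >= 15 diastolic).
    In men, urinary incontinence counts only when accompanied by
    erectile dysfunction, as stated in the text.
    """
    orthostatic = f.systolic_drop_mmhg >= 30 or f.diastolic_drop_mmhg >= 15
    bladder = f.urinary_incontinence and (f.erectile_dysfunction or not f.is_male)
    return orthostatic or bladder


# Illustrative (made-up) subjects, not study data.
example_a = AutonomicFindings(32, 8, False, False, True)    # systolic fall qualifies
example_b = AutonomicFindings(10, 5, True, False, True)     # male, incontinence without ED
example_c = AutonomicFindings(10, 5, True, False, False)    # female with incontinence
```

Under these assumptions, `example_a` and `example_c` meet the criterion and `example_b` does not.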
Baseline evaluation for MSA subjects: All subjects were screened for study enrollment at a baseline evaluation that included the following measures: demographic information, medical history, concurrent medications, neurological examination, Mini-Mental State Examination (MMSE), EMSA-SG minimal data set, Unified MSA Rating Scale (UMSARS), Composite Autonomic Symptoms Scale (COMPASS), SF-36 Health Survey, and Consensus Criteria assessment.
Follow-up evaluations for MSA subjects: Yearly onsite follow-up examinations and mailed survey data at 6, 18, 30, 42, and 54 months were included and consisted of the following measurements: review of concurrent medications, MMSE, EMSA-SG minimal data set, UMSARS, COMPASS-select, COMPASS-select-change, SF-36 Health Survey, and Consensus Criteria assessment.

Statistical analysis
Summary statistics were presented as mean (standard deviation), median (interquartile range) or frequency (percent) where appropriate. Baseline evaluation measures were compared using the Mann-Whitney test or Student's t-test. The frequencies of symptoms between groups were analyzed using chi-square tests when cell counts had ten or more observations. Fisher's exact test was used to assess frequencies of symptoms between groups when cell counts were less than ten observations.
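The rule for choosing between the chi-square test and Fisher's exact test can be illustrated for a 2x2 table. This is a sketch using only the Python standard library, not the SPSS procedure used in the study. The function names are hypothetical, the chi-square statistic is computed without a continuity correction (an assumption), and the two-sided Fisher p-value sums all tables that are no more probable than the observed one. The example counts are made up.

```python
from math import comb, erfc, sqrt


def chi_square_p(table):
    """Pearson chi-square p-value for a 2x2 table (1 degree of freedom)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))


def fisher_exact_p(table):
    """Two-sided Fisher's exact p-value for a 2x2 table."""
    (a, b), (c, d) = table
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_observed = prob(a)
    lowest, highest = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-7))


def compare_frequencies(table, min_cell=10):
    """Chi-square if every cell has at least min_cell observations, else Fisher."""
    if min(min(row) for row in table) >= min_cell:
        return "chi-square", chi_square_p(table)
    return "fisher", fisher_exact_p(table)


# Made-up counts: rows are groups, columns are symptom present / absent.
large_test, large_p = compare_frequencies([[20, 30], [30, 20]])
small_test, small_p = compare_frequencies([[3, 1], [1, 3]])
```

With these made-up tables, the first is routed to the chi-square test and the second, which has cells below ten, to Fisher's exact test.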
Kaplan-Meier curves were used to analyze graphically the interval in years from first symptom onset to death, expressed as median values. Log-rank test statistics were used to determine whether Kaplan-Meier curves differed among subgroups. Cox proportional hazards models were used to calculate univariate hazard ratios for shorter survival, using age at disease onset as a continuous variable and gender, clinical phenotype, and early development of neurologic and autonomic manifestations as categorical variables. Proportional hazards assumptions were tested using plots of scaled Schoenfeld residuals against transformed time for each covariate in a model fit, using the cox.zph function in R version 3.0.2. Statistical significance was defined as P<0.05. False discovery rate corrected p-values were reported as a way to adjust for multiple comparisons in the study. Data analyses were performed using the statistical software SPSS, version 21, and Kaplan-Meier curves were generated using R version 3.0.2.
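To make the Kaplan-Meier median concrete, the sketch below computes the product-limit estimate and reads off the median survival time. It is an illustration in plain Python, not the R code used in the study. The function names are hypothetical, the data are made up, and the median is taken as the earliest time at which the estimated survival falls to 0.5 or below, which is one common convention.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each subject (e.g. years from symptom onset)
    events: 1 if the subject died at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored subjects both leave the risk set
    return curve


def median_survival(curve):
    """Earliest time at which estimated survival is <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Made-up follow-up data (years), not study data.
example_times = [2, 3, 3, 5, 8, 9, 10, 12]
example_events = [1, 1, 0, 1, 1, 0, 1, 1]
example_curve = kaplan_meier(example_times, example_events)
example_median = median_survival(example_curve)
```

In this made-up example the survival estimate steps through 0.875, 0.75, 0.6 and 0.45, so the median is reached at 8 years. Censored subjects reduce the number at risk without producing a step in the curve.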

Role of Funding Source
NIH and Mayo funds supported the development of the study design and its implementation. This included database development, patient recruitment, study visits, and data collection. NIH, Mayo, and Shih Foundation funds supported the study but had no role in data analysis, data interpretation, or drafting of the manuscript. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication. This decision was made in consultation with the co-authors, to whom the dataset is also accessible.

Results
The  (Table 1). Baseline values for symptoms and function (UMSARS I) and deficits (UMSARS II), disability status (UMSARS IV), mental state (MMSE) as well as autonomic symptoms and function (COMPASS-select) were not different between MSA-P and MSA-C (Table 1). Baseline measurements of both components of SF-36 (Physical Health and Mental Health) showed a significant difference between MSA-P and MSA-C (Table 1). The flow chart (Figure 1) shows the progressive reduction in subjects, mainly due to death beyond 24 months.
Clinical features showed significant differences in a number of domains (Table 2). Autonomic failure was uniformly present in both MSA-P and MSA-C groups. The major autonomic manifestations of orthostatic hypotension (OH), neurogenic bladder (incontinence or incomplete bladder emptying), and constipation were present in >80% of subjects. OH was common in both phenotypes, with MSA-P having OH more commonly than MSA-C (82.5% vs 67.4%, p=0.0541). Medications to treat OH, depression, parkinsonism (in MSA-P), and neurogenic bladder were common (>40%), as were dietary supplements. Only levodopa use differed significantly between MSA-P and MSA-C.
As expected, parkinsonian symptoms and cerebellar manifestations were more common in MSA-P and MSA-C, respectively. There is some merging of parkinsonism and especially cerebellar symptoms likely reflecting the late stage of disease. Of note is that 51.6% of patients derived some benefit from levodopa, which lasted a mean duration of 3.2 years.
We evaluated hazard ratios for key clinical features and scores from onset to death (Table 3). The evaluation was on the effect of these variables at baseline on outcome. There was no effect of gender or age. None of the variables had an effect on outcome.
Progression data are given in Table 4. We designated the duration of 12 months to exclude subjects with longstanding less-specific autonomic symptoms. Median time to death for all subjects from enrollment was 1.8 (95% CI, 0.9-2.7) years, n=102. Median time to death for MSA-P from enrollment was 1.7 (95% CI, 0.9-2.9) years, n=76. Median time to death for MSA-C from enrollment was 2.0 (95% CI, 1.1-2.5) years, n=26.

Discussion
The main findings of this prospective study are that MSA-P and MSA-C have a similar natural history, with a median duration from onset to death of 9.8 years. Symptoms (UMSARS I) and deficits (UMSARS II) were not different at baseline, and median time to death from enrollment was only 1.8 years. This suggests that the Consensus Criteria for probable MSA 5,6 ensure high diagnostic accuracy but achieve this at a late stage of the disease. The presence of severe symptomatic autonomic failure at diagnosis was predictive of a worse prognosis, reducing median life-span by 2.3 years.
The North American study, together with the European study, 15 comprise the only prospective studies on MSA evaluated with disease-specific validated instruments. Particular strengths of the two studies are the shared minimal dataset at baseline and the shared instruments to evaluate a range of symptoms and deficits (Table 5). Together they comprise over 300 subjects with this rare disease. The North American study differed from the European study in the duration of study (5 years vs 2 years) and in the certitude of diagnosis (100% vs 77%) (Table 5). A requirement for inclusion was probable MSA in our study, whereas the European study accepted both possible and probable MSA. The number of subjects was similar in the two studies (175 vs 141). There was a similar distribution of MSA-P vs MSA-C and of gender. Both studies confirmed the dire prognosis of MSA. Remarkably, both studies found an identical median duration of life from onset to death of 9.8 years (Table 5).
Both prospective studies reported that a large percentage of subjects with MSA-P had a beneficial response to levodopa. Our study reported that 56.7% of MSA-P subjects benefited, while the European study reported 42.5%. While we recognize that the response may be suboptimal, it was surprisingly sustained in the two studies, with a duration of 3.3 years in our study and 3.5 years in the European study. This observation has clear implications for Consensus Criteria of MSA-P and suggests that lack of levodopa responsiveness should not be a requirement in the diagnosis of MSA-P.
There were a number of interesting differences (Table 5). A key finding of the European study was that subjects with MSA-P had a significantly shorter life-span from baseline to death than those with MSA-C. We did not find a significant difference from symptom onset or from baseline. One limitation in both studies is that the number of subjects beyond 2 years is small, due to the high mortality rate (Figure 1). Of note is that the largest retrospective study to date, published in abstract only, 21 and the autopsy-confirmed MSA studies did not find a difference in prognosis by MSA type. It is plausible that the shorter duration of life from baseline relates to delayed diagnosis of MSA-P. 21 The retrospective Mayo study by Coon et al 21 evaluated 685 subjects with MSA with follow-up and found that survival from symptom onset to death was identical for MSA-P and MSA-C, but was significantly shorter for MSA-P from baseline. We surmise that the short duration from diagnosis (and baseline) to death for MSA-P relates to the delay in diagnosing MSA-P (retaining a diagnosis of parkinsonism) because of the dire outlook with MSA. A second major difference is that our study found a worse prognosis for subjects with severe symptomatic autonomic failure at diagnosis (symptoms of orthostatic hypotension, neurogenic bladder, or fecal incontinence) compared with those without severe symptomatic autonomic failure. This is similar to the findings from a recent study of autopsy-confirmed MSA. 8 In that study, Kaplan-Meier curves were significantly different for MSA subjects demonstrating generalized autonomic failure on autonomic testing and also for subjects with neurogenic bladder within 3 years of onset of disease. The European study found a number of variables that suggested a worse prognosis. We did not find an effect of either age or gender, or other variables that predicted a worse prognosis from baseline to death.
It is possible that this apparent discrepancy relates to the more advanced disease associated with probable MSA at baseline. One limitation of both studies is the retrospective nature of defining symptom onset. This introduces a recall bias. We have attempted to minimize this bias by predetermining what constitutes symptoms of MSA and what does not, using the predefined minimal dataset. For instance, we did not accept erectile dysfunction, anosmia, constipation or REM sleep behavior disorder as symptom onset. Instead we accepted only symptoms that were more specific for MSA and showed progression over time, such as neurogenic bladder or orthostatic hypotension. We also specified the symptoms of neurogenic bladder as urinary incontinence or inability to void, discarding more trivial urinary symptoms.
This is the largest prospective study thus far to examine outcome measures in MSA patients. One strength of this study is that the study population consists entirely of patients with a diagnosis of probable MSA. Sixteen of the subjects who died underwent autopsy, and all had their MSA confirmed. As only patients with probable MSA are included in this study, we expect the potential for misdiagnosis to be low. For instance, in a study of 29 subjects with autopsy-confirmed MSA, 28/29 had the correct diagnosis of probable MSA antemortem. 7 The single exception had the phenotype of PAF at the single visit at Mayo Clinic and subsequently evolved into MSA. One limitation of this research is that it is not a population-based study, in that patients were recruited from tertiary movement disorder or autonomic centers. As such, results might not be generalizable to all USA-based MSA patients. Another limitation is the fall-off beyond year 2, due to the high mortality rate.
Our findings on rate of progression, which are similar to those in the European study and a recently completed Rifampicin study, 22 have implications for the powering of randomized clinical trials. Considering only patients with a diagnosis of probable MSA for a potential therapeutic trial has disadvantages as well. Patients with probable MSA, who have a higher UMSARS score than those with possible MSA, 11,22 have a flat slope in rate of change (Table 5), accounting for the very large number of subjects needed to power a randomized treatment trial of MSA using probable MSA. 16 In contrast, selection of subjects who are at an earlier stage of the disease results in a steeper slope and a smaller number of subjects needed to power such a study (Table 5). 22 In the Rifampicin study, we imposed an entry criterion of UMSARS I≤16 (minus question 11), and observed a mean rate of change in UMSARS I score in the placebo group of 0.5 points (SD 0.5) per month. Using these data and assuming an equal SD in the treatment group, 64 participants would be required per group to detect a difference of 50% (ie, a slope of 0.5 points per month in the placebo group vs 0.25 points per month in the treatment group) with 80% power and an alpha level of 0.05 based on a two-sample t test. Required sample sizes for 40% and 30% reductions in slope would have been 100 participants per group and 176 participants per group, respectively. This is the required number of evaluable patients at the end of the study. Assuming a death or dropout rate of 10%, we would need to increase the sample size per group to 111 and 196 participants, respectively. The number needed to power such a study, using early and milder disease, is much smaller than for a study of advanced probable MSA. 16
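The sample sizes quoted above can be checked with a short calculation. The sketch below is an approximation, not the authors' software. It uses the normal-approximation formula for comparing two means, n = 2(z_{1-alpha/2} + z_{power})^2 SD^2 / delta^2 per group, plus a small-sample correction of z_{1-alpha/2}^2 / 4 that is sometimes used to approximate the two-sample t test. The two-sided alpha of 0.05, the correction term, the rounding of the dropout adjustment, and the function names are all assumptions.

```python
from math import ceil
from statistics import NormalDist


def per_group_sample_size(delta, sd, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sample t test.

    delta: difference in mean slope to detect (points per month)
    sd:    standard deviation of the slope, assumed equal in both groups
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_power = NormalDist().inv_cdf(power)
    n_normal = 2 * (z_alpha + z_power) ** 2 * sd ** 2 / delta ** 2
    # Assumed small-sample correction approximating the t distribution.
    return ceil(n_normal + z_alpha ** 2 / 4)


def inflate_for_dropout(n, dropout_rate=0.10):
    """Enrollment needed so that about n evaluable participants remain."""
    return round(n / (1 - dropout_rate))


placebo_slope = 0.5   # points per month, from the text
slope_sd = 0.5        # SD from the text

sizes = {}
for reduction in (0.5, 0.4, 0.3):
    n = per_group_sample_size(placebo_slope * reduction, slope_sd)
    sizes[reduction] = (n, inflate_for_dropout(n))
    print(f"{reduction:.0%} reduction: {n} per group, "
          f"{inflate_for_dropout(n)} with 10% dropout")
```

Under these assumptions the calculation gives 64, 100 and 176 evaluable participants per group for 50%, 40% and 30% reductions in slope, and 111 and 196 per group after allowing for 10% dropout in the 40% and 30% scenarios, in line with the figures in the text.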

Acknowledgments
The work was supported by National Institutes of Health (P01 NS044233 -Low, K23NS075141 -Singer), Mayo CTSA (UL1 TR000135), the Kathy Shih Memorial Foundation, and Mayo funds.
We would like to thank all patients and families as well as referring physicians for their support.

Panel: Research in context
Evidence before this study We searched PubMed with the following search terms: [(MSA OR "multiple system atrophy") AND (progression OR survival)] for reports published before April 9, 2015. We found only a single report of a prospective study on MSA (Wenning et al 2013). 15 The natural history of MSA has been poorly understood.
There is a single prospective natural history study of MSA. 15 Prior to that the only randomized prospective studies used a non-specific Parkinson plus scale that was suboptimal for MSA. 13,14 The study by Wenning et al, 15 published in 2013, was the first natural history study that analyzed survival and prognostic predictors in a large homogeneous cohort of European patients with MSA and using validated disease-specific rating scales.

Added Value of this study
This 5-year prospective natural history study of subjects with probable MSA is the largest cohort study done over the longest duration. Median survival from symptom onset by Kaplan-Meier analysis was 9.8 years (95% CI 8.8-10.7). Subjects with severe symptomatic autonomic failure at diagnosis had a worse prognosis, surviving a median of 8.0 years (95% CI, 6.5-9.5), while the remaining subjects survived a median of 10.3 years (95% CI, 9.3-11.4). The natural histories of MSA-P and MSA-C are similar. The study has implications for MSA diagnosis and the design of clinical trials. Levodopa response is substantive, occurring in 56.7% of MSA-P and lasting 3.3 (2.33) years. Hence the requirement for lack of levodopa responsiveness for diagnosis of MSA is no longer tenable. The rate of progression has reached a plateau, with minimal rate of increase in UMSARS I and II, so that a study design using probable MSA requires an unacceptably large number of subjects.

Implications of all available evidence
Our study synergizes with the European study in a number of important ways. The studies were both large cohorts that shared identical disease-specific, validated instruments to evaluate progression of MSA and used identical minimal baseline datasets. The studies are complementary in that the North American study chose probable rather than possible MSA and extended the period of study to 5 years (from 2 years). Our study, although not population-based, selected subjects from all over the United States. We found that the rate of disease progression is stable, whereas the European study reported a slowing of progression in year 2. The difference likely relates to the slow rate of progression related to advanced disease (probable MSA). The value of 0.30 is significantly lower than the slope of 0.50 we found in the Rifampicin study 22 and has implications for subject selection in future clinical trials. A key finding in our study is that severe symptomatic autonomic failure at diagnosis is predictive of poor survival. The North American study found identical prognosis (from symptom onset) for MSA-P and MSA-C. The observation by the European prospective study of a worse prognosis for MSA-P from baseline could relate to delayed diagnosis of MSA-P. 21

i Criterion for autonomic failure in MSA is defined as orthostatic fall in blood pressure (by 30 mm Hg systolic or 15 mm Hg diastolic) or urinary incontinence (accompanied by erectile dysfunction in men) or both.
ii Orthostatic Hypotension is defined as a drop in SBP of 20 mmHg or a drop in DBP of 10 mmHg.
iii Criterion for parkinsonism in MSA is defined as bradykinesia plus at least one of the following: rigidity, postural instability, tremor (postural, resting or both).
iv Criterion for cerebellar dysfunction in MSA is defined as gait ataxia plus at least one of the following: ataxic dysarthria, limb ataxia or sustained gaze-evoked nystagmus.