Validity and reliability of a pilot scale for assessment of multiple system atrophy symptoms

Multiple system atrophy (MSA) is a rare progressive neurodegenerative disorder for which a brief yet sensitive scale is required for use in clinical trials and general screening. We previously compared several scales for the assessment of MSA symptoms and devised an eight-item pilot scale comprising items with large standardized response means (handwriting, finger taps, transfers, standing with feet together, turning trunk, turning 360°, gait, and body sway). The aim of the present study was to investigate the validity and reliability of this simple pilot scale for the assessment of MSA symptoms. Thirty-two patients with MSA (15 male/17 female; 20 cerebellar subtype [MSA-C]/12 parkinsonian subtype [MSA-P]) were prospectively registered between January 1, 2014 and February 28, 2015. Patients were evaluated by two independent raters using the Unified MSA Rating Scale (UMSARS), the Scale for the Assessment and Rating of Ataxia (SARA), and the pilot scale. Correlations among UMSARS, SARA, and pilot scale scores, as well as intraclass correlation coefficients (ICCs) and Cronbach's alpha coefficients, were calculated. Pilot scale scores correlated significantly with scores for UMSARS Parts I, II, and IV as well as with SARA scores. Intra-rater and inter-rater ICCs and Cronbach's alpha coefficients remained high (> 0.94) for all measures. The results of the present study indicate the validity and reliability of the eight-item pilot scale, particularly for the assessment of symptoms in patients with early-stage MSA.


Background
Multiple system atrophy (MSA) is a rare progressive neurodegenerative disease characterized by autonomic dysfunction, parkinsonism, and ataxia [1,2]. Patients with MSA generally require a wheelchair within five years and die within ten years of disease onset. Although some underlying mechanisms of MSA have been revealed, such as the aggregation of α-synuclein in oligodendroglia, the complete pathogenesis of the disease remains to be elucidated [3]. As quantitative biomarkers for MSA have not yet been developed for use in clinical trials, clinicians must rely on evaluations of changes in symptoms. However, the usefulness of such evaluations varies according to the scale used, and the large numbers of patients required for MSA trials render redundant and unresponsive scales impractical. Therefore, a brief yet sensitive scale is desirable for clinical trials involving patients with MSA.
In a previous study, we compared the following five scales in their ability to assess symptoms of MSA [4]: the Unified MSA Rating Scale (UMSARS) [5], Scale for the Assessment and Rating of Ataxia (SARA) [6], Berg Balance Scale (BBS) [7], MSA Health-Related Quality of Life scale (MSA-QoL) [8], and Scales for Outcomes in Parkinson's Disease-Autonomic Questionnaire (SCOPA-AUT) [9]. We subsequently devised a simple pilot scale comprising the eight items that exhibited the largest standardized response means (SRMs): handwriting, finger taps, transfers, standing with feet together, turning trunk, turning 360°, gait, and body sway [4].
Our prior study revealed that the UMSARS Part II (motor examination), UMSARS Part IV (global disability scale), SARA, and BBS are effective in evaluating MSA progression over 12 months, indicating their potential to assess rapid changes in MSA symptoms. Detailed item-by-item analyses suggested that the largest standardized response means (SRMs) were obtained for the following items: handwriting, finger taps, transfers, standing with feet together, turning trunk, turning 360°, gait, and body sway. Further analyses revealed that our eight-item semi-quantitative pilot scale (total score = 36 points; Table 1) exhibited an SRM larger than those observed for the UMSARS Part II/Part IV, SARA, and BBS [4], suggesting that the pilot scale was the most effective in detecting rapid changes in symptoms of MSA. In the present study, we aimed to investigate the validity and reliability of the pilot scale for the assessment of symptoms in patients with both the cerebellar and parkinsonian subtypes of MSA.
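For readers unfamiliar with the SRM, the following minimal sketch illustrates the standard definition (mean of the paired change scores divided by their standard deviation). It is not the analysis code of the prior study [4]; the function name and the example scores are hypothetical and serve only as an illustration.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Return the SRM: mean paired change divided by the sample SD of the changes."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must contain paired scores")
    # Change score for each patient (follow-up minus baseline).
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical total scores for five patients at baseline and at follow-up
# (illustrative values only, not data from the study).
baseline = [10, 14, 20, 8, 17]
follow_up = [13, 18, 22, 12, 20]
srm = standardized_response_mean(baseline, follow_up)
print(f"SRM = {srm:.2f}")
```

A larger absolute SRM indicates that a scale registers change more consistently across patients, which is the sense in which the pilot scale is described as sensitive.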

Methods
The present prospective observational study included hospitalized patients and outpatients receiving treatment in the Departments of Neurology at Hokkaido University Hospital and Obihiro Kosei Hospital between January 1, 2014 and February 28, 2015. Included patients had been diagnosed with probable or possible MSA per the criteria defined in the 2008 consensus statement [10]. The present study was approved by the institutional review board of Hokkaido University Hospital. Written informed consent was obtained from all patients prior to their participation in the study. Patients who declined to participate and those with severe cognitive impairment (for example, inability to understand explanations or to follow instructions during examination) were excluded.
Previous reports utilizing both SARA and BBS were consulted in the design of the present study [11,12]. Patients were separately evaluated by two independent neurologists. Patients first underwent evaluation by Rater 1 using the UMSARS, SARA, and pilot scale. Rater 2 evaluated patients using the pilot scale alone on the same day. Within one month, patients underwent reevaluation by Rater 1 using the pilot scale. Each evaluation was performed in a blinded manner, under the same conditions, and outside acute phases in order to eliminate the influence of sudden changes in symptoms. No interventions were utilized in the present study, and patients were allowed to continue treatments (mainly drug therapy and rehabilitation) already in progress. The collected data were subjected to linkable anonymization, after which statistical analyses were performed.

Table 1 The items of the pilot scale
1. Gait (from SARA 1): Patient is asked (1) to walk at a safe distance parallel to a wall including a half-turn (turn around to face the opposite direction of gait) and (2) to walk in tandem (heel-to-toe) without support.
   0. Normal, no difficulties in walking, turning, or walking in tandem (up to one misstep allowed)

Inter-rater reliability for the pilot scale was assessed between Rater 1 and Rater 2, and intra-rater reliability between the two assessments performed by Rater 1. The total score as well as individual item scores for the pilot scale were analyzed based on Cronbach's α coefficients and intraclass correlation coefficients (ICCs). Items with Cronbach's α coefficients of more than 0.8 were considered to exhibit high internal consistency. ICCs were interpreted according to the reference as slight (0.000 to 0.200), fair (0.201 to 0.400), moderate (0.401 to 0.600), substantial (0.601 to 0.800), or almost perfect (0.801 to 1.000) [13]. Mean values are presented along with standard deviations (SD).
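The reliability criteria described above can be sketched as follows. This is an illustrative example only, not the software used in the study: the function names are hypothetical, Cronbach's α is computed with the usual variance-based formula, and the ICC bands are those quoted from the reference [13]. The score matrix is invented for demonstration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of patients and k columns of item (or rating) scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two columns are required")
    # Sample variance of each column, and of the per-patient row totals.
    column_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(column_vars) / total_var)


def interpret_icc(icc):
    """Map an ICC in [0, 1] to the descriptive bands quoted in the text [13]."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC outside the 0 to 1 range covered by the bands")
    bands = [(0.2, "slight"), (0.4, "fair"), (0.6, "moderate"),
             (0.8, "substantial"), (1.0, "almost perfect")]
    for upper, label in bands:
        if icc <= upper:
            return label


# Hypothetical scores: four patients (rows) rated on three items (columns).
example = [[1, 2, 1], [2, 3, 2], [3, 3, 3], [4, 4, 4]]
alpha = cronbach_alpha(example)
high_consistency = alpha > 0.8  # threshold used in the Methods
print(f"alpha = {alpha:.3f}, high internal consistency: {high_consistency}")
print(interpret_icc(0.55), "/", interpret_icc(0.95))
```

The ICC itself depends on the chosen model (for example, one-way or two-way, single or average measures), which the text does not specify, so the sketch classifies a precomputed ICC rather than computing one.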

Results
A total of 32 patients (15 male/17 female; 20 MSA-C/12 MSA-P) [...] (Fig. 1a). Total scores on each scale are presented in Table 2. Average scores for the first assessment were as follows: UMSARS Part I: 21.3/48, UMSARS Part II: 21.3/56, UMSARS Part IV: 3.0/5, SARA: 19.3/40, pilot scale: 20.8/36. There was no significant difference in the total score of the pilot scale between MSA-C and MSA-P (average total score, MSA-C: 21.2; MSA-P: 20.2). The same held for the score of each item. Both total and individual item scores on the pilot scale correlated significantly with scores on UMSARS Parts I, II, and IV as well as with SARA scores (Fig. 1b). Spearman's correlation coefficients (ρ) ranged from 0.8780 to 0.9392. No significant differences were observed between the assessments of the pilot scale (Wilcoxon's rank test: p = 0.898 to 0.973). Table 3 depicts the distribution of scores assigned by Rater 1 during the first assessment. Scores for the second and third assessments showed similar tendencies. Many items had high item-total correlation coefficients (Spearman's correlation coefficients: 0.525 to [...]).

Inter-rater and intra-rater ICCs and Cronbach's α coefficients for total pilot scores were all greater than 0.9 (Table 4). Further, inter-rater and intra-rater ICC values over 0.6 (substantial) were obtained for almost all items on the pilot scale: only item 2 exhibited a moderate inter-rater ICC. Cronbach's α coefficients were greater than 0.9 for all items. Additionally, we considered prototype pilot scales consisting of five to seven items by excluding either a single item or a combination of three items (item 1: handwriting, item 2: finger taps, item 5: turning trunk) with relatively low inter-rater ICCs from the
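The rank correlations reported above can be illustrated with a short, dependency-free sketch: Spearman's ρ is the Pearson correlation of the rank-transformed scores, with tied values receiving their average rank. The function names and example values are hypothetical and do not reproduce the study data or the software actually used.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical pilot-scale and SARA totals for six patients (illustrative only).
pilot = [8, 15, 21, 21, 27, 33]
sara = [6, 12, 18, 20, 26, 31]
print(f"rho = {spearman_rho(pilot, sara):.3f}")
```

The same routine applies to item-total correlations by passing one item's scores and the corresponding total scores.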

Discussion
Patients in the present study exhibited characteristics similar to those reported in previous studies of Asian/Japanese populations (Table 2) [14-16]. The distribution of UMSARS Part IV scores indicated that this study included a relatively unbiased sample of patients with mild to severe symptoms. Scores on the pilot scale correlated significantly with scores obtained on the UMSARS and SARA (Fig. 1b), indicating the criterion-related validity of the pilot scale. The ability to administer this pilot scale in a short period of time further suggests its usefulness in the evaluation of MSA symptoms (Fig. 1a). In addition, ICCs and Cronbach's α coefficients remained high (Table 4), indicating high intra- and inter-rater reliability. Test-retest reliability and internal consistency were also high. When either one or three items with low inter-rater ICCs were excluded from the pilot scale (Table 4), ICCs and Cronbach's α coefficients remained relatively unchanged, indicating that a scale consisting only of items related to gait/standing is equally useful in assessing symptoms of MSA.
The present study has some limitations. Pilot scale items with low inter-rater ICCs (handwriting, finger taps, turning trunk) exhibited ambiguity in differentiating between scores. Further improvement in these areas of evaluation is required in order to assess changes in MSA symptoms more accurately. One possibility involves combined assessment utilizing both the pilot scale and a gait accelerometer to record quantitative data. In addition, semi-quantitative scales such as that utilized in the present study often exhibit a ceiling effect [17]. This pilot scale might also show a ceiling effect among patients at an advanced stage and may therefore be unsuitable for patients with advanced MSA. On the other hand, many participants in clinical trials would be early cases with mild to moderate symptoms, so the influence of a ceiling effect is considered less likely in that setting. Further investigation is required to examine more fully the effect of disease time course on the utility of the pilot scale. Additionally, this study included patients with mild to severe symptoms of MSA, and patients with MSA-P were relatively few. The reliability of this pilot scale should be confirmed in a larger cohort.
The SARA score of the MSA-C group was similar to that of the MSA-P group in this study. It should be noted that the SARA score may be influenced by other symptoms such as parkinsonism. The pilot scale of this study reflected symptoms of both parkinsonism and ataxia, and it can be applied equally in both groups without any modification. Moreover, the pilot scale showed a larger standardized response mean than the SARA and UMSARS [4], meaning that it could sensitively capture symptom changes among patients with MSA. The pilot scale was superior to the SARA in terms of sensitivity, even though it took somewhat longer to administer (5.0 ± 1.5 min) than the SARA (3.8 ± 1.0 min). It is

Conclusions
The results of the present study indicate that the eight-item pilot scale for the assessment of MSA symptoms is both valid and reliable and may be useful for the evaluation of patients in the early stages of MSA. However, owing to the limitations of the present study, including its small sample size, further research involving improved scales as well as larger patient populations is required.