Which Neuropsychological Tests? Predicting Cognitive Decline and Dementia in Parkinson’s Disease in the ICICLE-PD Cohort

Background: Cognitive impairment is common in Parkinson’s disease (PD), with 80% cumulatively developing dementia (PDD). Objective: We sought to identify tests that are sensitive to change over time above normal ageing so as to refine the neuropsychological tests predictive of PDD. Methods: Participants with newly diagnosed PD (n = 211) and age-matched controls (n = 99) completed a range of clinical and neuropsychological tests as part of the ICICLE-PD study at 18-month intervals over 72 months. Impairments on tests were determined using control means (<1-2SD) and median scores. Mild cognitive impairment (PD-MCI) was classified using 1-2SD below normative values. Linear mixed effects modelling assessed cognitive decline, while Cox regression identified baseline predictors of PDD. Results: At 72 months, 46 (cumulative probability 33.9%) participants had developed PDD; these participants declined at a faster rate in tests of global cognition, verbal fluency, memory and attention (p < 0.05) compared to those who remained dementia-free. Impaired baseline global cognition, visual memory and attention using median cut-offs were the best predictors of early PDD (area under the curve [AUC] = 0.88, p < 0.001) compared to control-generated cut-offs (AUC = 0.76–0.84, p < 0.001) and PD-MCI (AUC = 0.64–0.81, p < 0.001). Impaired global cognition and semantic fluency were the most useful brief tests employable in a clinical setting (AUC = 0.79, p < 0.001). Conclusion: Verbal fluency, attention and memory were sensitive to change in early PDD and may be suitable tests to measure therapeutic response in future interventions. Impaired global cognition, attention and visual memory were the most accurate predictors for developing a PDD. Future studies could consider adopting these tests for patient clinical trial stratification.


INTRODUCTION
Cognitive impairment is a frequent non-motor complication of Parkinson's disease (PD), with up to 80% of patients developing dementia (PDD) 20 years after disease onset [1]. Currently, there are no effective treatments which slow cognitive decline in PD [2,3]. A limiting factor in the development of interventions has been the identification of appropriate outcome measures that are sensitive to change due to medication response compared with normal aging. Commonly used neuropsychological tests may not be sensitive to change over time [4], and therefore not ideal cognitive measures for identifying clinically meaningful improvements due to pharmacological or non-pharmacological interventions [5].
Cognitive impairment in PDD is heterogeneous and multiple cognitive domains are involved [1,6]; impairment in global cognition [7], executive function [8,9], semantic fluency and visuospatial function [10], attention [9,11] and memory [12,13] have all been associated with the development of PDD. However, a number of these studies did not incorporate an age-matched control group to generate age-appropriate normative data [10,[14][15][16], included heterogeneous cohort of participants with different disease durations, small sample sizes [9,17], or only used a limited battery of neuropsychological tests [7,10,13,18,19]. While it is now clear that patients fulfilling PD-MCI criteria are associated with an increased risk of developing future PDD [11,14,[20][21][22]; from a clinical and research perspective it would be more useful to know which cognitive deficits and tests are most predictive of developing an early PDD. Understanding early impairments in specific neuropsychological tests and identifying optimal cut-offs that predict the early development of dementia is vital for future care planning and recruiting participants to clinical trials of potentially disease modifying therapies [4].
The aims of this prospective longitudinal study were firstly, to determine which cognitive tests were sensitive to changes, over and above normal aging, in those with early PD; and secondly, identify those cognitive tests that are most sensitive in predicting which participants will subsequently develop an early PDD. We hypothesised that neuropsychological tests measuring attention [11], semantic fluency and visuospatial function [10] would be associated with faster rates of cognitive decline in PD and the development of PDD, independent of normal aging.

Participants
Between June 2009 and December 2011, newly diagnosed PD patients from the community and hospital outpatient clinics in Newcastle-upon-Tyne, Gateshead and Cambridgeshire, UK were invited to participate in the Incidence of Cognitive Impairments in Cohorts with Longitudinal Evaluation in Parkinson's disease (ICICLE-PD) study (n = 211) [23]. Participants were re-assessed at 18-month intervals for up to 72 months. Idiopathic PD was diagnosed by a movement disorder specialist and fulfilled Queen's Square Brain Bank criteria [24]. Healthy control subjects were recruited from the community to provide age, sex and culturally appropriate normative data (n = 99). Full exclusion criteria have been published elsewhere [23]. Briefly, participants were excluded if they had significant cognitive impairment at presentation (Mini-Mental State Examination (MMSE) < 24) or a pre-existing diagnosis of dementia [6], an atypical parkinsonian syndrome, or insufficient English to complete assessments.
This study was approved by the Newcastle and North Tyneside Research Ethics Committee. All subjects provided written informed consent.

Assessments
Demographic information including age, gender and years of education was collected. Participants were assessed using the Movement Disorder Society (MDS) Unified Parkinson's Disease Rating Scale (MDS-UPDRS) part III [25], Hoehn and Yahr staging and Geriatric Depression Scale (GDS-15) [26]. Participants were assessed in an "on" motor state; levodopa equivalent daily dose (LEDD) was calculated [27].
Participants completed a schedule of neuropsychological tests (Supplementary Table 1). Global cognitive function was assessed using the MMSE [28] and Montreal Cognitive Assessment (MoCA) [29]; the naming and sentence items from the MoCA were used to assess language. Executive function was assessed using the One Touch Stockings from the Cambridge Neuropsychological Test Automated Battery (CANTAB) [30], phonemic fluency (number of words generated in 60 s beginning with the letter F) and semantic fluency (number of animals generated in 90 s). Visuospatial function was evaluated using the pentagon copying item within the MMSE and graded using a modified 0-2 rating scale [31]. Memory was assessed using the Pattern Recognition Memory (PRM), Spatial Recognition Memory (SRM) and Paired Associate Learning (PAL) tests from CANTAB [30]. Attention was assessed using the Cognitive Drug Research (CDR) battery [32]: Simple Reaction Time (SRT), Choice Reaction Time (CRT) and Digit Vigilance (DV). Spatial working memory was assessed using the Spatial Working Memory test (SWM) from the CDR battery.
We identified mild cognitive impairment (PD-MCI) [33] as previously described by Yarnall et al. [23]. Participants were classified as PD-MCI if they reported subjective cognitive decline and performed 1, 1.5 or 2 standard deviations (SD) or more below the mean of appropriate norms (controls) on at least two neuropsychological tests across five cognitive domains: attention (PoA and digit vigilance accuracy), memory (PRM number correct, SRM number correct and PAL mean trials to success), executive function (OTS number solved on first choice, semantic and phonemic fluency), language (naming and sentence subsets from the MoCA) and visuospatial function (pentagon copying). In addition, PD-MCI was defined as an impaired MoCA (< 26) plus subjective cognitive decline. Subjective cognitive decline and functional independence were determined through semi-structured interviews with participants and/or their carers to enable a diagnosis of PDD to be made using the MDS criteria [6].
The medical notes of all participants who did not return for 72-month follow-up were also reviewed to capture any additional diagnosis of dementia. A diagnosis was made by expert consensus (Newcastle: AJY and RAL, Cambridge: CHWG and MC); date of diagnosis was recorded as the midpoint between research or clinic visits.

Statistical analysis
Statistical Analysis was conducted using SPSS (IBM Corp. V.24, USA) and R software (Version 3.4.0; R Foundation for Statistical Computing, Vienna, Austria). Data were examined for normality of distribution with visual histograms and Kolmogorov-Smirnov's test. Comparisons of means between two groups were performed using independent t-tests or Mann-Whitney U tests as appropriate. Ordinal data were compared using chi-squared tests. Survival and cumulative survival were calculated using Kaplan-Meier plots.
Within R, lme4 [34] was used to perform linear mixed effects modelling (LMEM) to determine change in cognitive measures from baseline to 72 months in PD vs. controls and in PD vs. PDD participants (see Supplementary Methods for details). Due to the longitudinal nature of this study, there were some missing data (Supplementary Table 2); LMEM is suitable for longitudinal data analysis because of its ability to handle missing data. Backwards stepwise Cox regression identified baseline predictors of PDD using a data driven approach (Supplementary Methods). Cognitive scores were dichotomised as impaired using: i) cut-offs at 1SD, 1.5SD and 2SD below control mean scores, and ii) using median scores (Supplementary Table 3). These cut-offs were used as 1, 1.5 and 2SDs below normative values and have been commonly applied in the literature, while a median cut-off is typically used in cohorts without a control group or where normative data are not available. An additional model using impaired median scores and pen and paper only tests (MoCA, MMSE, semantic fluency, phonemic fluency and pentagons) was performed to identify tests which may be useful in an outpatient setting. Finally, baseline PD-MCI classifications versus no cognitive impairment, using impaired MoCA (< 26) and 1SD (≤ 1SD but > 1.5SD), 1.5SD (≤1.5SD but > 2SD) and 2 SD (≤ 2SD) cut-offs were individually added to the basic model. Model fit was assessed using log likelihood ratios, and area under the curve (AUC) was calculated for each model using receiver operating characteristic (ROC) curves. For all analyses, we applied Benjamini-Hochberg multiple comparisons correction with a 5% false discovery rate.

Data availability statement
Unidentifiable data may be shared on request.

RESULTS
At baseline, after exclusions, 211 participants with PD with a mean disease duration of 5.6 ± 5.1 months (Table 1) and 99 healthy controls completed assessments (Supplementary Table 1). Over 72 months, 106 (50.2%) PD participants and 66 (66.7%) controls returned for in person evaluation. Mean duration of follow-up was 4.3 ± 2.2 years (median = 6.0 IQR = 3.0 years). Of the 211 PD participants, 46 (cumulative dementia probability of 33.9%) developed PDD by the end of the 72-month assessment period, compared to two (cumulative dementia probability of 2.4%) controls who both were diagnosed with Alzheimer's disease dementia (χ 2 = 17.0, p < 0.001, Fig. 1). Mean time to dementia diagnosis was 3.8 ± 1.8 years in PD participants who reached this outcome and 5.5 ± 0.6 years in controls.

Cognitive change over time
Linear mixed effects modelling, adjusting for covariates, determined change in neuropsychological tests over time (Table 2) in control vs. PD participants, and in PD participants who remained   dementia-free vs. those who developed PDD. After corrections for multiple comparisons, only performance in tests of attention (CRT mean and PoA) were sensitive to decline in the PD participants over and above controls (and thus normal aging), and in those who developed PDD. However, performance in tests across many cognitive domains showed a more rapid decline in those participants who developed early PDD compared to those who did not (Table 2); specifically, global cognition (MoCA and MMSE), semantic and phonemic fluency, memory (PRM and PAL) and spatial working memory (SWM accuracy and speed). Additional measures of attention (digit vigilance accuracy, CRT accuracy, CoV fluctuating attention) declined in those who developed early PDD compared to those who did not (p < 0.023 for all). In controls and PD participants who did not develop dementia, phonemic fluency scores significantly improved over time (p < 0.023), suggesting a possible learning effect. To account for baseline PD-MCI, this analysis was repeated adjusting for baseline PD-MCI classification; this did not alter the results (Supplementary Table 4).

Baseline predictors of PDD
Backwards Cox regression identified that baseline median age (> 66 years, HR = 3.4, p < 0.001) and median MDS-UPDRS III score (> 26, HR = 2.7, p < 0.01) were significantly predictive of developing PDD earlier and were therefore included in each model predicting PDD progression (Table 2). Impaired global cognition (MoCA) and attention (PoA) at baseline were significant predictors of developing an early PDD in every model (HR = 3.2-4.9 and HR = 2.3-3.9, respectively), irrespective of the cutoff applied. Impaired attention as measured by digit vigilance accuracy was a significant predictor in the 1SD, 2SD and median scores models (HR = 2.4-4.9), but not 1.5SD. Impaired memory was also a predictor in each model, but the specific tests varied, with SRM (HR = 2.1 and HR = 3.1, respectively) significant in the 1SD and 1.5SD cut-off models, impaired PRM (HR = 5.0) using the 2SD cut-off, and impaired PAL (HR = 3.4) using the median cut-off. Impaired visuospatial function, based on pentagon copying, was only a significant predictor in the 2SD model (HR = 12.2). Using the pen and paper tests only, impaired MoCA (< 26, HR = 5.7) and semantic fluency (< 21, HR = 2.9) were significant predictors of PDD. Comparing the models (Fig. 2), AUC ranged from 0.758-0.876 (p < 0.001 for all, Table 2). The model using median scores of neuropsychological tests was the best fit, demonstrated by the lowest log-likelihood (267.6, p < 0.001) and highest AUC (AUC = 0.876, p < 0.001) compared to other models.

Baseline PD-MCI and conversion to PDD
Of the 46 participants who developed PDD, 45 (97.8%, Fig. 3) had PD-MCI at baseline (using < 1-2SD criteria). Most participants with normal cognition at baseline (n = 72, 98.6%) remained dementia-free over 72 months. Of the participants who met PD-MCI criteria at baseline, the numbers developing PDD within 72 months were: six (11.8%) of those meeting PD-MCI 1SD criteria (≤1SD below controls but > 1.5SD), 17 (39.5%) of those meeting PD-MCI 1.5SD criteria (≤ 1.5SD below controls but > 2SD); and 22 (50.0%) of those meeting PD-MCI 2SD criteria. Cox regression modelling with each baseline PD-MCI classification and covariates found that PD-MCI defined by impaired MoCA or 1.5SD and 2SD cut-offs significantly predicted the development of PDD (p < 0.01 for all, Table 3), but not PD-MCI defined by the 1SD cut-off (p > 0.05). However, PD-MCI 2SD had the highest hazard ratio (HR = 21.9, p < 0.001), the best model fit ( Table 3, p < 0.001) and greatest AUC (AUC = 0.814, p < 0.001). Comparing all models, the model fits using PD-MCI criteria had lower predictive power compared to the model of impaired median global cognition, memory, and attention (AUC = 0.876, p < 0.001, Table 3).

DISCUSSION
We sought to identify tests that are sensitive to change over time in those with PD above normal ageing and refine the neuropsychological tests predictive of developing an early PDD. In our cohort, the cumulative probability of PDD was 34% within six years compared to 2% of controls who developed dementia. This is comparable to conversion rates reported by previous studies of 17-27% over 4-7 years after PD diagnosis [35][36][37]. This is the first study to demonstrate that performance in tests of attention are sensitive to changes over time in individuals with newly diagnosed PD, independent of normal ageing, whereas deficits on tests of global cognition, memory, verbal fluency and spatial working memory in addition to attention are more sensitive to cognitive decline in those who develop an early PDD. We propose that performance in these tests may be suitable outcome measures of therapeutic response in clinical trials, as well as informing clinicians planning for the future medical needs for patients. We found that identifying PD-MCI at baseline, while predicting future PDD, does so with much less accuracy than impairments in specific tests. In keeping with previous studies, baseline impaired global cognitive function [7], attention [11] and visual memory [12] consistently predicts future PDD across a range of cut-offs, and therefore, may have utility for patient stratification in clinical trials. In addition, we found that impaired MoCA and semantic fluency scores were predictive of early dementia and could be useful screening tools for routine clinical practice.
A paucity of studies has explored which neuropsychological tests are sensitive to change over time in PD above that seen with normal aging, in comparison to studies that have sought to identify baseline predictors of an early PDD. We found that participants with PD and PDD declined in attention scores at a faster rate over time compared to normal ageing, but only tests of global cognition, memory, verbal fluency and spatial working memory were sensitive to cognitive decline in those who develop PDD. A number of previous studies have shown associations between PDD and poorer performance on a range of global and  specific cognitive domains (including attention, memory and executive function [7,8,[10][11][12]), although many are cross-sectional. A previous study in PD, albeit in a smaller sample size (n = 59 PD participants vs. n = 40 controls over 5 years), reported that the performance of PD participants significantly declined at a faster rate in tests of psychomotor speed, memory, executive function and visuospatial function compared to controls, although associations with subsequent dementia were not assessed [17]. A retrospective study in PD (n = 118 over 5 years) reported that participants who subsequently developed PDD showed a significantly faster rate of decline in performance in visuospatial and verbal memory tests compared to those with AD, and that cognitive decline preceded the diagnosis of dementia by several years [13]. In advanced dementia-free PD (PD duration ≥ 10 years, n = 49), worsening attention and executive function have been associated with development of PDD [9]. Furthermore, impaired attention has been associated with Lewy body pathologies [38,39]. In participants with a REM sleep behavioural disorder, impaired test performance in measures of attention and executive function were observed six years prior to a dementia with Lewy body diagnosis [40]. These findings support our results, which suggests that the performance on selective tests of global cognition, attention, memory, verbal fluency, and spatial working memory we identified in this study are sensitive to change over time in early PDD, and so could be considered as outcome measures in future clinical trials investigating interventions targeting cognitive decline or for inclusion in trials of disease modifying agents. This is the first study to explore a range of cutoffs of neuropsychological tests using normative data.
We identified that baseline impaired global cognition, attention and memory consistently predict the development of future PDD across a range of cutoffs, specifically, impaired MoCA and PoA were significant predictors in every model; digit vigilance accuracy was impaired in all but one model (1.5SD below controls), with variations of impaired visual memory across models (SRM, PRM or PAL). Consistent with the CamPaIGN study [10], pentagon copying was predictive of early PDD in addition to impaired global cognition, attention and memory, but only in the 2SD model (< 1 for 2SD compared to < 2 for all other cut-offs). This may be due to the limited scoring of this measure (0, 1 or 2) which may be insensitive to subtle visuospatial dysfunction. Contrary to our hypothesis, models using median cut-offs had the strongest predictive power compared to cut-offs generated using normative data. A strength of this study is the inclusion of age-matched controls to provide normative data. However, on inspection of the cutoffs (Supplementary Table 3), those calculated using normative data (1-2SD below controls) had greater degrees of impairment compared to those using the median values. This suggests that using stricter agegenerated cut-offs may have not detected more subtle cognitive impairments. Previous studies have used a range of neuropsychological tests and from these it is unclear which tests should be used in research and clinical settings. A previous longitudinal study reported that the development of PDD over a mean of 5 years was associated with verbal, visuospatial and working memory deficits [13]. However, cutoffs used to identify impairment were not explored or identified. We propose that MoCA, PoA, digit vigilance accuracy and PAL could be utilised to identify those at risk of developing PDD in the future.
Using non-computerised tests that are commonly used in clinic, we found that impaired MoCA and semantic fluency significantly predicted PDD. The CamPaIGN study [10] reported that impaired baseline semantic fluency and pentagon copying were predictors of developing PDD within ten years of PD diagnosis, with another study also reporting that semantic fluency and figure copying was predictive of PDD [15]. However, impaired pentagon copying as a predictor of future PDD was only seen using the 2SD cut-off (scores < 1) in the present study, but not using other cut-off scores (< 2). We found that impaired performance on the MoCA (< 26), a brief measure of global cognitive function -which includes a cube-copying task (a visuospatial function test similar to the pentagon-copying task) -was a predictor of subsequent PDD. A recent study comparing three cognitive screening measures found that the MoCA was the only measure associated with development of PDD and a faster rate of progression to dementia [7]. This may be due in part to a greater weight of tests of attention, visuospatial and executive function tests in the MoCA compared to the MMSE [41,42]. Nonetheless, impaired MoCA and semantic fluency (90 s) had high accuracy with an AUC of 0.79, and this model was comparable to-or had a better fit than-models containing more comprehensive tests or PD-MCI. This suggests that these two commonly used tests may be suitable in routine clinical practice to identify patients likely to develop an early PDD, using cut-off scores of 26 and 21, respectively.
We showed that the presence of baseline PD-MCI using a 1.5SD or 2SD cut-off significantly increased the risk of developing PDD within six years. Overall, using 2SD had the best model fit as shown by the AUC, and was associated with 21 times the hazard of participants without baseline PD-MCI. However, half of participants with baseline PD-MCI 2SD (50%) remained dementia free at six years. Previous studies have reported that having PD-MCI using Level 2 MDS criteria predicts progression to PDD over time [14,15,43]. The ParkWest study reported the natural history of PD-MCI to PDD over a five-year period; 39% of participants with baseline or incident PD-MCI developed PDD over five years but the majority were dementia-free (61%) [35]. Similarly, Nicoletti et al. [11] reported that the presence of PD-MCI was associated with five times the relative risk of developing PDD, although participants had a longer disease duration (mean of three years at baseline) compared to participants in the present study (mean of 5.5. months). A further study explored different applications of PD-MCI criteria [20] and found that predictive accuracy did not significantly differ between the groups. Recently, studies have investigated appropriate cut-offs for PD-MCI to identify PD participants at risk of developing PDD [21,22]. Wood et al. [21] reported that the 1.5SD was the optimal cut-off to identify participants who developed PDD within a four-year follow-up period, with impairments in at least two tests in a single cognitive domain. However, disease duration was longer than the present study (mean symptom duration six years at baseline). In a pooled analysis of 467 participants from four cohort studies, consistent with our findings, Hoogland et al. [22] reported that applying a 1SD cutoff was not significantly associated with an increased risk of developing a PDD, but 1.5SD and 2SD cut-offs were after controlling for covariates. Moreover, they reported that 2SD was the optimal cut-off, with a hazard of over 11, which is in-keeping with our findings. However, previous studies have not explored the relative predictive value of different cut-offs of PD-MCI compared to individual neuropsychological test performance. Although PD-MCI was associated with an increased risk of PDD, we showed that impairments in specific cognitive domains had better predictive accuracy for PDD compared to PD-MCI. Therefore, using focused tests (MoCA, PoA, digit vigilance and PAL) predictive of PDD and sensitive to cognitive change has the potential to reduce large batteries of neuropsychological tests to minimise participant burden in future studies.
Our study has several strengths including using a large well-characterised cohort of newly diagnosed PD participants and an age-matched control group followed up longitudinally. We used a data-driven approach to identify neuropsychological tests sensitive to change over time and predictive of early PDD. Limitations include missing data (Supplementary Table 2) and participant attrition, which are challenging for all longitudinal studies. Participant attrition was 11-21% per time-point, and 50% cumulatively over 72 months. This is similar to attrition rates reported by previous cohorts including by Wood et al. (33% in four years) [21], by Aarsland et al. (66% in 8 years) [18] and in the CamPaIGN study (65% in 10 years) [10]. Not all participants completed tests from the CDR and CANTAB at every time point due to technical errors or a change in the protocol (54-and 72-month evaluations, Supplementary Table 2); to mitigate this, we used linear mixed effects modelling, which is able to better deal with missing data. However, a significant proportion of participants had missing CANTAB and/or CDR data at 54 months, and all had missing data at 72 months, which has implications for identifying tests sensitive to cognitive decline. Participants who declined further assessments, were lost to follow-up or died may be participants who had a faster rate of motor and cognitive decline and would have been of particular interest to this study. To mitigate this, we reviewed medical notes to ensure we identified all cases who developed PDD during the study duration.
Impaired baseline MoCA was a consistent predictor of PDD using all cut-offs, which includes tests of multiple cognitive domains; this may have implications for single-domain tests predictive of PDD. Some participants improved in their neuropsychological assessment scores which could be due to a learning effect or normal fluctuations in cognition. We used a time interval of 18 months between testing, which has been suggested as an appropriate length of time to negate practice effects; however, future studies could consider using different versions of the same neuropsychological test. Finally, this study was conceived and initiated before the PD-MCI criteria had been published by the MDS and thus our testing of visuospatial function and language was more limited than that recommended by the MDS criteria [33]. We, therefore, used a modified version of these criteria.
In conclusion, poorer performance on selective tests of global cognition, memory, verbal fluency and spatial working memory are associated with a faster rate of decline in those who develop an early PDD. PD-MCI using 2SD cut-off may be a suitable screening tool for the development of PDD within six years, but with less accuracy than impaired global cognition, attention and memory, while impaired MoCA and semantic fluency are useful screening tools for routine outpatient clinical practice. Future studies are required to understand how these cognitive deficits are associated with the pathophysiology of PDD, as well as their utility in trials of putative diseasemodifying therapies.