LCSB R³
Responsible and Reproducible Research

Multi-cohort Machine Learning Approach Identifies Robust Predictors of Cognitive Impairment in Parkinson’s Disease

Authors

Rebecca Ting Jiin Loo, Graziella Mangone, Fouad Khoury, Marie Vidailhet, Jean-Christophe Corvol, Enrico Glaab

Abstract

Background: Cognitive impairment is a prevalent and impactful non-motor symptom of Parkinson’s disease (PD), affecting 20-50% of newly diagnosed patients. Early detection of cognitive decline is important for timely intervention, but current prediction models lack generalizability across diverse PD populations.

Aim: To develop and validate machine learning models using multi-cohort data to predict both mild cognitive impairment (MCI) and patient-reported cognitive impairment (PRCI) in PD patients.

Methods: We applied a multi-cohort machine learning approach using data from three independent PD cohorts (LuxPARK, PPMI and ICEBERG). Different algorithms were used for classification and time-to-event analysis. Cross-study normalization and leave-one-cohort-out validation were applied to improve and assess model generalizability.

Results: Multi-cohort models showed improved stability of performance measures compared with single-cohort models. Age at PD onset and visuospatial ability (measured by the Benton Judgment of Line Orientation) emerged as key predictors for both MCI and PRCI. Sex differences were observed for PRCI, with men more likely to report cognitive impairment. Non-motor symptoms, particularly autonomic dysfunction, were associated with an increased risk of cognitive decline.

Conclusions: Our multi-cohort machine learning approach identified robust predictors of cognitive impairment in PD, providing improved generalizability over single-cohort studies. These findings offer practical insights for early detection and personalized management of cognitive decline in PD patients.

Keywords: Parkinson’s disease; Cognitive impairment; Machine learning; Multi-cohort analysis; Mild cognitive impairment (MCI); Patient-reported outcomes; Predictive modeling; Visuospatial function

The source code used to produce the results is available at https://gitlab.com/uniluxembourg/lcsb/biomedical-data-science/bds/ml-cognitive-impairment.
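
For readers unfamiliar with the validation scheme named in the abstract, the following is a minimal sketch of leave-one-cohort-out splitting, with per-cohort z-scoring as a simple stand-in for cross-study normalization. It is not taken from the repository above and may differ from the study's actual method. The function names `zscore_within_cohort` and `leave_one_cohort_out` are hypothetical, and the toy values are invented purely for illustration.

```python
from statistics import mean, pstdev


def zscore_within_cohort(values, cohorts):
    """Standardize one feature separately within each cohort.

    Assumption: a simple stand-in for cross-study normalization;
    the normalization used in the study may differ.
    """
    scaled = [0.0] * len(values)
    for cohort in set(cohorts):
        idx = [i for i, c in enumerate(cohorts) if c == cohort]
        cohort_values = [values[i] for i in idx]
        mu = mean(cohort_values)
        sd = pstdev(cohort_values) or 1.0  # guard against zero spread
        for i in idx:
            scaled[i] = (values[i] - mu) / sd
    return scaled


def leave_one_cohort_out(cohorts):
    """Yield (held_out, train_idx, test_idx), holding out each cohort once."""
    for held_out in sorted(set(cohorts)):
        train_idx = [i for i, c in enumerate(cohorts) if c != held_out]
        test_idx = [i for i, c in enumerate(cohorts) if c == held_out]
        yield held_out, train_idx, test_idx


# Toy data, invented for illustration only: one feature, six patients.
cohorts = ["LuxPARK", "LuxPARK", "PPMI", "PPMI", "ICEBERG", "ICEBERG"]
age_at_onset = [60.0, 70.0, 55.0, 65.0, 62.0, 68.0]

scaled = zscore_within_cohort(age_at_onset, cohorts)
for held_out, train_idx, test_idx in leave_one_cohort_out(cohorts):
    # A model would be fit on train_idx and evaluated on test_idx here.
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Each cohort serves once as an unseen test set while the model is trained on the remaining cohorts, so the reported performance reflects transfer to a population the model has not seen.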

Data availability

The LuxPARK clinical dataset used in this study was obtained from the National Centre of Excellence in Research on Parkinson’s Disease (NCER-PD). The dataset for this manuscript is not publicly available as it is linked to the Luxembourg Parkinson’s Study and its internal regulations. Requests for access to the dataset can be directed to request.ncer-pd@uni.lu.

Data used in the preparation of this article were obtained on May 9, 2024 from the Parkinson’s Progression Markers Initiative (PPMI) database (www.ppmi-info.org/data, RRID:SCR_006431). For up-to-date information on the study, please visit the PPMI website. Data from the ICEBERG cohort analyzed during this study are available from the corresponding study group (jean-christophe.corvol@aphp.fr, marie.vidailhet@aphp.fr).