What Innovations Can Improve Detection of Predictors, Severity in Parkinson Disease?

May 22, 2020

Presented at Virtual ISPOR 2020, researchers examined the efficacy of a machine learning approach in detecting predictors of Parkinson disease (PD), with an additional study testing the use of a statistical model to predict severity of PD.

The complexity of Parkinson disease (PD) has made it difficult for physicians to not only predict disease progression but diagnose it as well. In a recent poll, 1 in 4 people with PD initially received a misdiagnosis, with 48% given treatment for their nonexistent condition. This could prove detrimental for patients with PD (PwP) as the mismanagement of symptoms could lead to OFF periods, which occur when symptoms are uncontrolled.

Recent innovations within PD have shown the potential of novel technology systems in distinguishing between the disease and similar neurodegenerative diseases such as multiple system atrophy, which may lessen the likelihood of misdiagnosis. Furthermore, PD subtypes derived from a novel PD subtyping system were significantly linked with disease duration and severity.

This finding was a notable step that warrants further development, as the system may solely reflect stages of PD rather than identify distinct clinical subtypes. In an abstract presented at the Virtual ISPOR 2020 meeting, researchers examined the efficacy of a statistical model in predicting disease severity, with an additional abstract analyzing the ability of a machine learning approach to detect PD early.

Predicting PD Severity Via Statistical Model

Researchers sought to form a statistical model that predicts the 3-level version of the EuroQol 5 Dimensions system (EQ-5D-3L) as a function of PwP demographics and PD severity, as measured by the Unified PD Rating Scale (UPDRS) subscales.1 By characterizing the model through the EQ-5D-3L, a descriptive system comprising 5 dimensions—mobility, self-care, usual activities, pain/discomfort, anxiety/depression—researchers aimed to develop a predictive equation for utilities that would translate to an economic model to conduct cost-utility analysis.

PwP-level data were derived from the National Institute of Neurological Disorders and Stroke Exploratory Trials in PD Long-Term Study 1. This multicenter phase 3 study examined creatine in patients on dopaminergic therapy within 5 years of diagnosis (n = 1741; 6 years follow-up).

Researchers calculated EQ-5D-3L index scores using a mixed-effect model with repeated measures, as the mean utility values and UPDRS scores were comparable between the 2 treatment arms. The significant predictors of the utility values included gender and UPDRS 1, 2, 3, and 4, with age being excluded from the multivariate model.

In the study findings, utilities declined by an average of 0.018 per year, with mean utilities of 0.81 at baseline, 0.76 at 3 years, and 0.70 at 6 years. The researchers highlighted that the statistical model performed well in validation analyses, in which average predicted EQ-5D-3L utilities were within ±0.01 of the average observed scores at all visits for each year post baseline.
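As a quick arithmetic check, the utilities reported at baseline and at 6 years are consistent with the stated annual decline. The Python sketch below assumes a simple straight-line average between the first and last time points; it is not the mixed-effect model described in the abstract, and the function name is illustrative only.

```python
# Mean EQ-5D-3L utilities reported in the article, keyed by years since baseline.
reported_utilities = {0: 0.81, 3: 0.76, 6: 0.70}


def average_annual_decline(utilities):
    """Straight-line average decline per year between the first and last time point."""
    years = sorted(utilities)
    first, last = years[0], years[-1]
    return (utilities[first] - utilities[last]) / (last - first)


decline = average_annual_decline(reported_utilities)
print(f"Average decline per year: {decline:.3f}")
```

The result rounds to 0.018, matching the figure quoted from the abstract.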

“The predictive equation for utilities captures the impact of nonmotor and motor-related aspects of the disease, as all 4 UPDRS subscales were identified as significant predictors,” concluded the study authors.

Machine Learning in Early PD Detection

As researchers note, early detection of PD in its prodromal period can provide timely treatment and mitigate symptoms and risks. To assist in identifying predictors in PwP, the study authors designed a machine learning technique that analyzes Medicare Part A and B claims data.2

Researchers drew a 5% sample of Medicare claims data spanning quarter 1 of 2010 through quarter 4 of 2015 to identify incident PD cases in 2015 based on diagnosis codes from the International Classification of Diseases (ICD) 9 and 10. Participants were aged 65 years or older and were continuously enrolled during the 2-year baseline period prior to their PD diagnosis date, called the index date.

In the study, controls (n = 13,725), who had no evidence of a PD diagnosis, were identified at a 3:1 ratio to PD cases (n = 4575). Features included in the analyses were demographics, comorbidities, medication and procedure utilization, and service location variables extracted at baseline.

“Data were partitioned using a 60%/20%/20% split to train, tune models, and test performance on unseen data. Traditional and regularized logistic regression, k-nearest neighbor, XGBoost, support vector machine, and random forest models were built, and the best model was selected using the area under the ROC curve (AUC),” expanded the study authors.
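To make the partitioning concrete, the sketch below shows how a 60%/20%/20% split of the reported sample (4575 cases plus 13,725 controls) could be produced. It assumes a simple seeded random shuffle of row indices; the abstract does not say whether the authors stratified by case status, and the function name and seed are hypothetical.

```python
import random


def split_indices(n_rows, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle row indices and cut them into train/tune/test partitions."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # round() avoids off-by-one sizes caused by floating-point products.
    n_train = round(n_rows * fractions[0])
    n_tune = round(n_rows * fractions[1])
    train = indices[:n_train]
    tune = indices[n_train:n_train + n_tune]
    test = indices[n_train + n_tune:]  # remainder goes to the held-out test set
    return train, tune, test


n_cases, n_controls = 4575, 13725  # counts reported in the article
train, tune, test = split_indices(n_cases + n_controls)
print(len(train), len(tune), len(test))
```

Under these assumptions the partitions hold 10,980, 3660, and 3660 records, with the last partition reserved for evaluating the selected model on unseen data.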

The study found that of the models tested, the XGBoost model performed best (on unseen data: AUC, 83.1%; accuracy, 79.6%; recall, 65.1%; precision, 58.2%; and F1, 0.61). The researchers note that based on the model’s high predictive accuracy in identifying predictors of PD, further research is warranted.
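The reported metrics can be checked against one another. The Python sketch below recomputes F1 from the stated precision and recall, and derives the accuracy they imply. It assumes the test set kept the same 3:1 control-to-case ratio as the full sample, which the abstract does not confirm; the function names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def implied_accuracy(precision, recall, prevalence):
    """Accuracy implied by precision, recall, and the share of true cases."""
    true_pos = recall * prevalence
    false_neg = prevalence - true_pos
    false_pos = true_pos * (1 - precision) / precision
    return 1 - false_neg - false_pos


precision, recall = 0.582, 0.651  # values reported for the XGBoost model
prevalence = 4575 / (4575 + 13725)  # 0.25, assuming the test set keeps the 3:1 ratio
print(f"F1: {f1_score(precision, recall):.2f}")
print(f"Implied accuracy: {implied_accuracy(precision, recall, prevalence):.3f}")
```

Both values agree with the reported F1 of 0.61 and accuracy of 79.6%, which suggests the published figures are internally consistent under that assumption.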


1. Chandler C, Franco Villalobos C, Wang Y, et al. Development of a statistical model to predict EuroQOL five dimensions (EQ-5D) utilities in Parkinson disease. Presented at: ISPOR 2020; May 18-20, 2020; Abstract ND4. https://bit.ly/2zWY7jt

2. cten Z, Burns SM, Menzin JA. Predictors of Parkinson disease in a Medicare population—An application of machine learning in early disease detection. Presented at: ISPOR 2020; May 18-20, 2020; Abstract AI2. https://bit.ly/3go1dh8