New Machine Learning Model Predicts Who Will Develop Diabetes

April 7, 2020
Gianna Melillo

Gianna is an assistant editor of The American Journal of Managed Care® (AJMC®). She has been working on AJMC® since 2019 and has a BA in philosophy and journalism & professional writing from The College of New Jersey.

Researchers created a machine-learning–based model to help predict which patients will develop diabetes, according to an abstract to be published in the Journal of the Endocrine Society.

Researchers created a machine-learning–based model to help predict which patients will develop diabetes, according to an abstract that was originally slated for presentation at ENDO 2020 and will now be published in a special supplement to the Journal of the Endocrine Society. The new model predicts the future incidence of diabetes with an overall accuracy of 94.9%.

“Currently, we do not have sufficient methods for predicting which generally healthy individuals will develop diabetes,” said Akihiro Nomura, MD, PhD, a lead author of the study. This new tool aims to change that fact.

The artificial intelligence (AI) was created via a retrospective analysis of 509,153 annual specific health checkup records from 139,225 patients. Of these patients, 65,505 without diabetes mellitus (DM) were included in the dataset.

Gradient-boosting decision trees were used to identify DM signatures prior to the onset of the disease in patients. Between 2008 and 2018, the researchers collected a variety of records from patients in Kanazawa city, Ishikawa, Japan. The data included results of physical examinations and blood and urine tests, as well as questionnaires completed by study participants.
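To give a sense of how gradient boosting works in general, here is a minimal sketch in plain Python. It is not the researchers' model: the abstract does not describe their implementation, features, or settings. The toy glucose values and outcomes are made up, the function names `fit_stump`, `fit_boosted`, and `predict` are hypothetical, and the sketch uses squared-error loss with one-split trees ("stumps") for simplicity, whereas real classifiers typically use a log-loss objective and deeper trees.

```python
def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) for the single split on xs
    that minimizes the squared error of the residuals."""
    best = None
    for t in sorted(set(xs))[:-1]:  # splitting at the max would leave the right side empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]


def fit_boosted(xs, ys, rounds=20, lr=0.5):
    """Start from the mean outcome, then repeatedly fit a stump to the
    residuals (the negative gradient of squared error) and add a damped copy."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + lr * (lm if x <= t else rm) for x, p in zip(xs, preds)]
    return base, stumps, lr


def predict(model, x):
    """Sum the base value and every stump's damped contribution for input x."""
    base, stumps, lr = model
    return base + sum(lr * (lm if x <= t else rm) for t, lm, rm in stumps)


# Made-up toy data: fasting glucose (mg/dL) and whether diabetes later developed.
glucose = [85, 90, 95, 100, 105, 110, 115, 120]
developed_dm = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_boosted(glucose, developed_dm)
print(round(predict(model, 88), 3), round(predict(model, 118), 3))
```

On this toy data the score for a low glucose value falls below 0.5 and the score for a high value rises above it. Each round corrects part of the error left by the previous rounds, which is the core idea behind boosting.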

“Machine learning is a type of AI that enables computers to learn without being explicitly programmed,” according to the press release announcing the study results. “With each exposure to new data, a machine-learning algorithm grows increasingly better at recognizing patterns over time.”

The researchers divided the dataset in a 6:2:2 ratio in order to first train the AI, then tune it by internal validation, and finally test the model. To assess the model’s performance, they evaluated its area under the curve (AUC), precision, recall, F1 score, and overall accuracy.

The training dataset included 39,303 participants, while the testing and tuning datasets consisted of 13,101 participants each.
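A 6:2:2 split of this kind can be sketched in a few lines of Python. This is an illustration only: the abstract does not say how the researchers assigned participants to each group, so the random shuffle, the fixed seed, and the function name `split_6_2_2` are assumptions.

```python
import random


def split_6_2_2(records, seed=0):
    """Shuffle records and split them into train/tune/test parts in a 6:2:2 ratio."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)      # fixed seed keeps the split reproducible
    n_holdout = len(shuffled) // 5             # 20% each for tuning and testing
    n_train = len(shuffled) - 2 * n_holdout    # the remaining ~60% for training
    train = shuffled[:n_train]
    tune = shuffled[n_train:n_train + n_holdout]
    test = shuffled[n_train + n_holdout:]
    return train, tune, test


# Ten placeholder participant IDs stand in for real records.
train, tune, test = split_6_2_2(range(10))
print(len(train), len(tune), len(test))  # 6 2 2
```

Keeping the tuning and testing groups separate from the training group means the reported performance reflects records the model has not already seen.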

During the study period, the researchers identified 4696 (7.2%) patients with new-onset DM. They found the trained model “predicted the future incidence of DM, with the AUC, precision, recall, F1 score, and overall accuracy [measuring at] 0.71 (95% CI, 0.69-0.72), 75.3% (71.6%-78.8%), 42.2% (39.3%-45.2%), 54.1% (51.2%-56.7%), and 94.9% (94.5%-95.2%), respectively.”
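The reported metrics relate to one another in a way that can be checked with simple arithmetic. The sketch below recomputes the F1 score from the reported precision and recall, then estimates how accurate a model would be if it predicted that no one develops DM. That second figure rests on two assumptions: that the 4696 new-onset cases come from the 65,505 participants in the dataset, and that the test set has a similar share of cases. The helper name `f1_score` is hypothetical.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Reported point estimates for precision and recall.
f1 = f1_score(0.753, 0.422)
print(f"F1 = {f1:.1%}")  # F1 = 54.1%

# Share of participants with new-onset DM (assumes 4696 cases among 65,505).
incidence = 4696 / 65505
print(f"incidence = {incidence:.1%}")  # incidence = 7.2%

# Accuracy of always predicting "no DM", if the test set mirrors that share.
print(f"always-negative accuracy = {1 - incidence:.1%}")  # always-negative accuracy = 92.8%
```

Because new-onset DM is relatively uncommon in this population, overall accuracy is driven largely by correctly labeling people who do not develop the disease. Precision and recall say more about how well the model picks out those who do.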

Machine learning could enable healthcare systems to precisely identify groups at high risk for developing diabetes and lead to effective intervention strategies.

In the future, the researchers plan to perform clinical trials “to assess the effectiveness of using statins to treat groups of patients identified by the machine learning model as being at high risk of developing diabetes.”