Can Big Data Find Undiagnosed Cases of Type 2 Diabetes? UCLA Team Says Yes

Deploying the algorithm in EHRs on a widespread basis could help find up to 400,000 Americans with undiagnosed diabetes, the researchers say.

Left untreated, type 2 diabetes (T2D) can cause blindness, kidney failure and infections that could lead to amputations. Early on, however, some people who have T2D might not know they have the disease. Finding people with undiagnosed diabetes (and even prediabetes) is a major priority of the CDC and the American Medical Association.

But giving every patient who might show signs of T2D a blood test is time-consuming and expensive. What if there were a faster way to zero in on those who might be at risk, based on past illnesses or other characteristics?

And what if Big Data could help sort through it all?

Researchers at the University of California, Los Angeles, launched such a project in 2012, mining thousands of electronic health records (EHRs) to identify common characteristics of people who have T2D. In the process, they created an algorithm that could be used to screen large pools of candidates for future testing.

The results of their work, led by Ariana Anderson, PhD, also uncovered some highly unexpected risk factors associated with T2D: things like a history of sexual and gender identity disorders, intestinal infections, and illnesses that include sexually transmitted diseases such as chlamydia.

Their work, published today in the Journal of Biomedical Informatics, argues that broad application of their tool to EHRs could find some 400,000 persons with T2D who are currently going without treatment. Finding them early could prevent long-term complications, including potential disability. (In 2013, the American Diabetes Association calculated that the annual medical and lost productivity costs of the disease in the United States are $245 billion.)

“With widespread implementation, these discoveries have the potential to dramatically decrease the number of undetected cases of type 2 diabetes, prevent complications from the disease and save lives,” said Anderson, an assistant professor at UCLA’s Semel Institute for Neuroscience and Human Behavior.

The study involved records from 9,948 people from all 50 states. Patients were de-identified, and researchers looked at information like vital signs, diagnoses, medications prescribed, and all kinds of reported ailments. Among the discoveries was that being diagnosed with a sexual or gender identity disorder increased the risk of T2D by 130%, nearly the same as having high blood pressure, which is a well-known risk factor for the disease.

Other conditions strongly associated with T2D included chlamydia, which increased the risk by 82%, and intestinal infections such as colitis, enteritis and gastroenteritis, which increased it by 88%. By comparison, having a high body mass index increases one’s risk of T2D by 101%.
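To put those percentages on a common footing, the short sketch below converts each reported increase into a multiple of baseline risk. It assumes the figures are relative increases over a baseline (the article does not say whether they are risk ratios or odds ratios), and the names `reported_increase_pct` and `to_multiplier` are illustrative only, not part of the researchers' tool.

```python
# Percent increases in T2D risk as reported in the article.
# Assumption: each figure is a relative increase over baseline risk.
reported_increase_pct = {
    "sexual or gender identity disorder": 130,
    "high body mass index": 101,
    "intestinal infection": 88,
    "chlamydia": 82,
}


def to_multiplier(increase_pct):
    """Convert 'increases risk by X%' into a multiple of the baseline risk."""
    return 1 + increase_pct / 100


multipliers = {
    name: to_multiplier(pct) for name, pct in reported_increase_pct.items()
}

# List the conditions from largest to smallest multiplier.
for name, m in sorted(multipliers.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {m:.2f}x baseline")
```

Read this way, a 130% increase means roughly 2.3 times the baseline risk, and an 82% increase means roughly 1.8 times.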

More research would be needed to understand why certain medical conditions correlate with T2D. Because the diagnoses linked to T2D are based on ICD codes, the researchers report that the findings are not detailed enough to tell clinicians why there is a link between these codes and diabetes.

Current practice calls for screening people for T2D based on a limited list of risk factors, such as high blood pressure, body weight/BMI, age, and smoking status. But the tool developed by the UCLA team using the full EHR was 2.5 times better at predicting whether a person had diabetes.

Anderson said deploying the tool would be a way to identify with more urgency those candidates who definitely needed a laboratory test to determine if they had T2D, even if they show no symptoms.


Anderson AE, Kerr WT, Thames A, Li T, Xiao J, Cohen MS. Electronic health record phenotyping improves detection and screening of type 2 diabetes in the general United States population: a cross-sectional, unselected retrospective study. J Biomed Inform. 2016;54:162-168. DOI:
