Machine Learning Can Help Predict Persistence of Asthma as Children Age

Larry Hanover

Machine learning models can help predict which children diagnosed with childhood asthma before age 5 will continue to experience symptoms as they get older.

Researchers have determined that machine learning models demonstrate good performance in predicting which children diagnosed with asthma before age 5 years will continue to experience symptoms as they get older.

Investigators from Children’s Hospital of Philadelphia and the Perelman School of Medicine at the University of Pennsylvania trained 5 machine learning models to differentiate between young children whose asthma is transient and those whose symptoms will persist and require treatment visits between ages 5 and 10 years, according to a study published in PLoS One.

Machine learning is a type of artificial intelligence in which computers learn patterns from data rather than following explicitly programmed rules. As a program is exposed to more data, it becomes better at recognizing those patterns.

Asthma affects approximately 6.1 million American children. However, a number of children are diagnosed with asthma at an early age yet ultimately prove not to have it on a chronic basis. Determining which children’s symptoms will resolve on their own would prevent unnecessary treatment, potential associated adverse effects, and alterations in quality of life for both children and their families, the authors said.

The authors used a retrospective data set containing the electronic health record data of 9934 children to see if they could identify which children ended up with a persistent diagnosis of asthma. All models performed significantly better than random chance, with XGBoost demonstrating the best performance.

The features that contributed most to the models' predictive accuracy were the total number of asthma-related visits, self-identification as Black, allergic rhinitis, and eczema, the authors wrote.

A lack of prior models using a large number of features to predict persistent asthma meant that direct comparison was not feasible. However, the authors wrote, the models were consistent with prior research showing that diagnosis age and prior utilization of health services were important predictors of persistent asthma and could prove useful in guiding clinicians and parents on asthma treatment in early childhood.

Early models relied on the occurrence of early childhood wheezing episodes. More recent statistical models were developed to identify preschool children with asthma-like symptoms who are at high risk of future asthma diagnosis.

None of the prior studies, however, provided models that could be individualized to predict chronic asthma. The study is believed to be the first comprehensive investigation of modern machine learning algorithms for predicting a persistent asthma diagnosis using large-scale electronic health record data.

The study cohort included children between ages 2 and 5 years who had an initial asthma diagnosis recorded during an inpatient stay, ambulatory visit, or emergency department visit from 2005 to 2016. To be considered to have persistent asthma, a child also needed at least 1 additional asthma diagnosis between ages 5 and 10 years and a prescription for an asthma-related medication at least once during or after an asthma diagnosis visit occurring after age 2 years.
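The labeling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Visit` record and the `is_persistent` function are hypothetical names, not the study's data schema or code, and the age bounds are treated as inclusive because the article does not specify how boundary cases were handled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Visit:
    """One encounter in a hypothetical, simplified record (not the study's schema)."""
    age_years: float          # child's age at the visit
    asthma_diagnosis: bool    # an asthma diagnosis was recorded at this visit
    asthma_medication: bool   # an asthma-related medication was prescribed


def is_persistent(visits: list[Visit]) -> bool:
    """Apply the two conditions stated in the article (bounds assumed inclusive)."""
    # Condition 1: at least one asthma diagnosis between ages 5 and 10 years.
    later_diagnosis = any(
        v.asthma_diagnosis and 5 <= v.age_years <= 10 for v in visits
    )

    # Condition 2: a medication prescribed during or after an asthma
    # diagnosis visit that occurred at age 2 years or older.
    diagnosis_ages = [
        v.age_years for v in visits if v.asthma_diagnosis and v.age_years >= 2
    ]
    if not diagnosis_ages:
        return False
    first_diagnosis_age = min(diagnosis_ages)
    medicated = any(
        v.asthma_medication and v.age_years >= first_diagnosis_age for v in visits
    )

    return later_diagnosis and medicated


# Made-up example histories, for illustration only.
transient_child = [Visit(3.0, True, True)]
persistent_child = [Visit(3.0, True, True), Visit(6.5, True, False)]
print(is_persistent(transient_child), is_persistent(persistent_child))
```

A child with no asthma diagnosis between ages 5 and 10 years would fall into the transient group under this sketch.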

There were 8802 children (89%) with a persistent asthma diagnosis in the data set; the remaining 1132 (11%) children were considered to have transient asthma.
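The class split reported above can be checked with simple arithmetic. The short sketch below recomputes the percentages from the article's counts. It also shows why plain accuracy is a weak yardstick on data this imbalanced: a rule that labels every child "persistent" would already be right about 89% of the time. The article does not state which metric the authors used to compare the models against chance, so the baseline here is an illustration, not a description of their method.

```python
# Counts reported in the article.
total = 9934
persistent = 8802
transient = total - persistent

persistent_share = persistent / total
transient_share = transient / total

# A trivial rule that labels every child "persistent" is correct exactly
# as often as the persistent class occurs in the data.
majority_baseline_accuracy = persistent_share

print(f"transient children: {transient}")
print(f"persistent share:   {persistent_share:.1%}")
print(f"transient share:    {transient_share:.1%}")
print(f"always-persistent baseline accuracy: {majority_baseline_accuracy:.1%}")
```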

The decision to model asthma diagnosis past the age of 5 years was guided by the National Heart, Lung, and Blood Institute Expert Panel Report asthma guidelines, which divide childhood asthma diagnosis and management recommendations into 3 age groups: 0 to 4, 5 to 11, and 12 to 17 years. The authors noted that asthma diagnosis in the youngest group may be appropriate but is considered controversial; many children will wheeze from a viral illness without having classic asthma.

Each machine learning model performed significantly better than chance, the authors said, although the naive Bayes model performed worst of the 5. XGBoost, random forest, and logistic regression performed best. K–nearest neighbor performed better than naive Bayes but not as well as the top 3 algorithms. The researchers focused on determining an accurate classification threshold, as setting it too low could increase the number of false positives whereas setting it too high could increase the number of false negatives.
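The threshold trade-off can be seen in a small example. The sketch below uses made-up scores and labels, not study data, and a hypothetical helper named `confusion_counts`. It treats each score as a predicted probability of persistent asthma and counts the errors at 3 different cutoffs.

```python
def confusion_counts(scores, labels, threshold):
    """Count false positives and false negatives at a given cutoff.

    labels: 1 = persistent asthma, 0 = transient asthma.
    A child is predicted persistent when the score is >= threshold.
    """
    false_positives = sum(
        1 for s, y in zip(scores, labels) if s >= threshold and y == 0
    )
    false_negatives = sum(
        1 for s, y in zip(scores, labels) if s < threshold and y == 1
    )
    return false_positives, false_negatives


# Made-up predicted probabilities and true labels, for illustration only.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 0, 1, 1, 0, 1, 0, 0]

for threshold in (0.25, 0.50, 0.75):
    fp, fn = confusion_counts(scores, labels, threshold)
    print(f"threshold={threshold:.2f}  false positives={fp}  false negatives={fn}")
```

On this toy data, the lowest cutoff yields 2 false positives and no false negatives, whereas the highest yields no false positives and 3 false negatives.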

Capillary blood lead testing was an important predictor of persistent asthma, as was prescription of montelukast, an asthma control medication, according to the study. Inhaled corticosteroids were less important to the model.

The researchers found that socioeconomic status and patient sex were not statistically associated with asthma persistence.

One key limitation of the study, the authors wrote, is that patients were concentrated in the northeastern part of the United States and thus the results might not be generalizable to other regions. Further research is warranted to test and improve the model’s generalizability by adding other input features.

Reference

Bose S, Kenyon CC, Masino AJ. Personalized prediction of early childhood asthma persistence: a machine learning approach. PLoS One. 2021;16(3):e0247784. doi:10.1371/journal.pone.0247784