A Medicaid managed care organization developed a machine learning model to identify opioid use disorder (OUD) risk factors and predict OUD incidence in its multistate population.
Objectives: Medicaid managed care organizations are developing comprehensive strategies to reduce the impact of opioid use disorder (OUD) among their members. The goals of this study were to develop and validate a predictive model of OUD and to predict future OUD diagnosis, resulting in proactive, person-centered outreach.
Study Design: We used machine learning methodology to select a multivariate logistic regression model and identify predictors of OUD.
Methods: Using 2016-2018 data, we used a staged approach to test and validate the predictive accuracy of our model. We identified OUD, the dependent variable, using an industry-standard definition. We included a series of patient demographic, chronic condition, social determinants of health (SDOH), opioid-related, and health utilization indicators captured in administrative data.
Results: Caucasian (odds ratio [OR], 1.65), male (OR, 1.57), and younger (aged 40-64 years compared with 18-39 years: OR, 0.75) members had greater odds of being diagnosed with an OUD. Members with an SDOH vulnerability had 26% higher odds than those without a documented issue. From a prescribing perspective, we found that having an opioid dose greater than 120 morphine milligram equivalents for 5 contiguous days increased odds of OUD by 1.87 times, and an opioid supply of 30 days or longer increased the odds of OUD by 1.56 times.
Conclusions: We built the necessary machine learning infrastructure to identify members with greater than 50% probability of developing OUD. The generated list strategically informs and guides person-centered care and interventions. Through application of these results, we strive to proactively reduce OUD-related structural barriers and prevent OUD from occurring.
Am J Manag Care. 2021;27(4):148-154. https://doi.org/10.37765/ajmc.2021.88617
Medicaid plays a vital role in providing health care services to low-income populations disproportionately affected by the opioid epidemic in the United States. More than 2 million nonelderly adults have been diagnosed with opioid use disorder (OUD), and approximately 47,600 overdose deaths occurred in 2017.1 OUD is defined as a persistent and problematic pattern of opioid use, characterized by the inability to reduce or stop consumption and/or interference with work and life responsibilities.2 Of all nonelderly adults diagnosed with an OUD, 38% rely on Medicaid for health insurance.3 This is more than twice the national proportion of nonelderly adults enrolled in Medicaid (15%).4 Medicaid beneficiaries are also at increased risk of opioid-related overdose, particularly as cheaper and more lethal alternatives to prescription drugs (such as heroin and fentanyl) become available.5,6 To combat this problem, Medicaid spending on OUD-related medications increased by 250% and spending on mental health treatment is anticipated to soar by an additional $19.49 billion between 2014 and 2020.7,8 Despite the growing investment in programs and interventions, the rate of OUD and opioid-related deaths remains high.
At AmeriHealth Caritas, a national Medicaid managed care organization (MCO), we developed an Opioid Blueprint to reduce the magnitude of the epidemic among our members. The Opioid Blueprint targets interventions at the pharmacy, provider, and member levels. We have established programs that have successfully reduced the overall number of opioid prescriptions while increasing the availability of medications for OUD. Although progress has been made, we saw an opportunity to more proactively identify and address physical, behavioral, and social risk factors of OUD.
To do so, we aimed to leverage data in new ways to proactively engage our membership and reduce adverse events. We decided to use machine learning to generate actionable insights into OUD risk factors and to prioritize our Medicaid populations for outreach. Researchers have used machine learning methodologies to discover that, in general and commercial populations, individuals with a mental health diagnosis, HIV, hepatitis C, substance use disorder, and tobacco use had greater odds of OUD and/or risk of overdose.9-12 Machine learning was also successfully used to predict a stratified probability of overdose in Medicare populations.13 This capability of machine learning allows us to strategically target members who have greater than a 50% probability of developing an OUD; this helps us, as an MCO, focus resources and proactively intervene with members before OUD diagnosis. We acknowledged that our decision to utilize a generous probability threshold could affect model sensitivity. However, we believed that the use of such a threshold is the surest approach to reduce incidence of OUD and other related adverse events.
Our goals in this paper were (1) to develop and validate a generalizable predictive model identifying the primary risk factors and predictors of OUD in our multistate adult Medicaid population and (2) to calculate probability of OUD diagnosis, creating an actionable list for holistic, person-centered outreach to avoid OUD occurrence.
Design and Sample
We used our Medicaid enrollment, medical, pharmacy, and care management administrative data from January 1, 2016, to December 31, 2018. We included data from the District of Columbia, Florida, Louisiana, Michigan, Pennsylvania, and South Carolina in a staged study design to develop and test the predictive accuracy of our model, as displayed in Figure 1. For the training model, we used complete 2016 data to predict OUD in 2017 (n = 320,040). We validated predictors through application of the training model to 2017 data, predicting OUD in 2018 (n = 374,809). Finally, we applied our finalized model to 2018 data to predict future OUD in 2019, called the scoring model (n = 589,423). We limited the study population to all continuously enrolled adults to increase likelihood of observing all OUD-related diagnoses within each cohort.
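The staged design described above can be summarized in a small data structure. This is a minimal sketch for illustration only; the `Stage` class and `outcome_is_observed` helper are hypothetical names, and only the years and cohort sizes come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    baseline_year: int  # year in which predictors are observed
    outcome_year: int   # following year in which OUD is predicted
    n_members: int      # cohort size reported in the text


# Years and cohort sizes as reported in the study design.
STAGES = [
    Stage("training", 2016, 2017, 320_040),
    Stage("validation", 2017, 2018, 374_809),
    Stage("scoring", 2018, 2019, 589_423),
]


def outcome_is_observed(stage: Stage, last_data_year: int = 2018) -> bool:
    """True when the outcome year falls inside the available data window."""
    return stage.outcome_year <= last_data_year
```

Under this framing, the training and validation stages have observed outcomes to compare against, whereas the scoring stage produces forward-looking predictions only.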
Outcome Variable: OUD
We identified OUD by applying the National Committee for Quality Assurance’s Healthcare Effectiveness Data and Information Set (HEDIS) criteria. HEDIS is a nationally recognized set of quality, access, process, and outcome metrics used to evaluate MCOs’ performance. We used the metric “Initiation and Engagement of Alcohol and Other Drug Abuse or Dependence Treatment (IET)” to focus on OUD-related treatment indicated by an International Classification of Diseases, Tenth Revision diagnosis code beginning with F11 (opioid-related disorders). Specific criteria for outcome identification included an opioid use and dependence diagnosis during any of the following events: (1) an emergency department (ED) visit; (2) an inpatient (IP) discharge; (3) a telephone visit; (4) an online assessment; (5) a detoxification visit; or (6) an outpatient, ED, or IP visit with IET stand-alone visit, IET Visits Group 1 with IET Place of Service Group 1, or IET Visits Group 2 with Place of Service Group 2. IET stand-alone visit and IET Visits groups 1 and 2 contain different revenue codes, Current Procedural Terminology codes, and Healthcare Common Procedure Coding System codes.14
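The diagnosis-code portion of this definition can be sketched as a simple prefix check. This is an illustrative assumption, not the HEDIS specification: the function names are hypothetical, and the visit-type and place-of-service criteria listed above are not modeled here.

```python
from typing import Iterable


def is_oud_diagnosis(icd10_code: str) -> bool:
    """True when an ICD-10 diagnosis code begins with F11 (opioid-related disorders)."""
    # Normalize case and drop the decimal point so "F11.20" and "f1120" match alike.
    normalized = icd10_code.strip().upper().replace(".", "")
    return normalized.startswith("F11")


def has_oud_claim(diagnosis_codes: Iterable[str]) -> bool:
    """True when any diagnosis code on a qualifying event is an F11 code."""
    return any(is_oud_diagnosis(code) for code in diagnosis_codes)
```

In practice, a check like this would be applied only to claims that already meet one of the six event criteria.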
As part of our daily operations, we utilize a curated list of patient-level characteristics to predict the likelihood of adverse outcomes and unexpected events. We capture member age, gender, race, language, and aid category from the enrollment data. We apply the Johns Hopkins ACG Care Analyzer data analytics system to medical and pharmacy claims data to create disease indicators for a multitude of physical and behavioral health conditions. To supplement the demographic and clinical information, we include an indicator for social determinants of health (SDOH) vulnerability as reported by members via care management outreach. If a member indicates any issues with housing, food, clothing, transportation, education, utilities and/or literacy, they are deemed to have an SDOH vulnerability.
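The SDOH vulnerability indicator described above reduces to an "any domain flagged" rule. The sketch below is a hypothetical illustration; the dictionary layout and function name are assumptions, and treating a missing assessment as "not vulnerable" reflects the handling of missing values noted later among the study limitations.

```python
from typing import Mapping, Optional

# Domains named in the text.
SDOH_DOMAINS = (
    "housing", "food", "clothing", "transportation",
    "education", "utilities", "literacy",
)


def sdoh_vulnerable(responses: Optional[Mapping[str, bool]]) -> bool:
    """True when the member reported an issue in at least one SDOH domain.

    A missing assessment (None or empty) is coded as False, an assumption
    that mirrors combining missing values with "none identified".
    """
    if not responses:
        return False
    return any(responses.get(domain, False) for domain in SDOH_DOMAINS)
```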
To enhance applicability to our specific outcome variable for this study, we included a series of variables related to opioid prescriptions. We identified opioid prescriptions using the Medi-Span National Drug Codes for Analgesics–Opioid. Variables included an indicator of any opioid prescription, total number of pain medication prescriptions, daily morphine milligram equivalent (MME) dosing of greater than 120 MME for 5 contiguous days, and short-acting or long-acting opioid. Long-acting opioids include naturally long-acting versions and extended-release formulations. We added predictor candidates for the number of prescribers of opioids, number of pharmacies, and an indicator of high-risk utilization; this indicator was defined as 1 or more visits to a mental health specialist, 1 or more mental health IP admissions, or 1 or more ED visits in the observation year. We also captured alcohol, opioid, and other substance use disorder in the baseline year for each cohort.
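Two of these indicators lend themselves to a short sketch: the high-dose flag (more than 120 MME on each of 5 contiguous days) and the high-risk utilization flag. The function names and the per-day list input are assumptions made for illustration; how daily MME was derived from claims is not described in the text.

```python
from typing import Sequence


def exceeds_mme_run(daily_mme: Sequence[float],
                    threshold: float = 120.0,
                    run_length: int = 5) -> bool:
    """True when daily MME is strictly above the threshold on
    `run_length` consecutive days."""
    run = 0
    for mme in daily_mme:
        # Extend the current run, or reset it when a day falls at or below threshold.
        run = run + 1 if mme > threshold else 0
        if run >= run_length:
            return True
    return False


def high_risk_utilization(mh_specialist_visits: int,
                          mh_inpatient_admissions: int,
                          ed_visits: int) -> bool:
    """True with 1 or more mental health specialist visits, mental health
    inpatient admissions, or ED visits in the observation year."""
    return (mh_specialist_visits >= 1
            or mh_inpatient_admissions >= 1
            or ed_visits >= 1)
```

Resetting the counter on any day at or below the threshold is what enforces the "contiguous" requirement.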
Due to the skewed distribution of included continuous predictor candidates, we decided to create categorical variables where possible. We reviewed the distribution of each continuous variable and collapsed the data to create evenly distributed categories.
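One common way to obtain roughly evenly distributed categories is equal-frequency (quantile) binning. The sketch below assumes that approach; the text does not state the exact rule used, and the function names and the default of 4 bins are hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles
from typing import List, Sequence


def equal_frequency_cut_points(values: Sequence[float], n_bins: int = 4) -> List[float]:
    """Return the n_bins - 1 cut points that split the data into quantile bins."""
    return quantiles(values, n=n_bins, method="inclusive")


def assign_bin(value: float, cut_points: Sequence[float]) -> int:
    """Map a value to a 0-based category index given sorted cut points."""
    return bisect_right(cut_points, value)
```

With heavily tied data, such as many members with zero prescriptions, quantile bins will not be perfectly even, which is one reason a manual review of each distribution remains useful.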
Using our staged approach, we began developing predictive algorithms with a series of machine learning approaches in our training model, including multivariate logistic regression, decision trees, random forest, neural networks, support vector machine, and other high-performing data mining methods. We compared the receiver operating characteristic (ROC) curve index, misclassification rate, and interpretability to select the best model and use it for record scoring.
In Table 1, we display the prediction performance metrics of the training and validation models. The misclassification rate was less than 0.019 for all algorithms. The ROC index remained high across the fitted algorithms (> 0.904) except the decision tree model, which was automatically generated. Due to the similar prediction performance, we decided to use the multivariate logistic regression (validation C statistic = 0.914; misclassification = 0.019) to identify predictors of OUD risk. The selected model allowed for the interpretation of each risk factor’s magnitude. At a 50% predicted probability, the multivariate logistic regression validation model yielded a sensitivity of 28.3%, a specificity of 99.6%, and a false-positive rate of 38.0% (Table 2).
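The threshold-based metrics above all derive from a 2 × 2 confusion matrix. The following sketch shows the standard formulas; the function name is hypothetical, and any counts passed to it are the reader's own (none are taken from the study's tables). Two quantities are sometimes both called a false-positive rate, so both are returned under distinct names.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard metrics from confusion-matrix counts at a fixed probability threshold."""
    total = tp + fp + fn + tn
    return {
        # Share of true cases that the model flags.
        "sensitivity": tp / (tp + fn),
        # Share of non-cases that the model leaves unflagged.
        "specificity": tn / (tn + fp),
        # 1 - specificity: share of non-cases flagged in error.
        "false_positive_rate": fp / (fp + tn),
        # 1 - positive predictive value: share of flagged members who are non-cases.
        "false_discovery_rate": fp / (tp + fp),
        # Share of all members classified incorrectly.
        "misclassification_rate": (fp + fn) / total,
    }
```

When the outcome is rare, the share of flagged members who are non-cases can be large even when specificity is very high, because non-cases vastly outnumber cases.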
After selecting the logistic regression model, we applied a backward elimination method, an approach that eliminates the least significant variables to create a high-performing, parsimonious model (eAppendix Table 1 [eAppendix available at ajmc.com]). We reviewed the remaining variables for clinical and industry relevance to OUD diagnosis. The removed predictor candidates included several chronic condition indicators, the number of opioid prescribers, and the number of pharmacies filling opioid prescriptions. The remaining variables are presented in Table 3. All analyses were conducted using SAS EG 7.1 and SAS EM 14.1 (SAS Institute). A P value less than .05 was deemed statistically significant.
Our training cohort had more women (64.0%), adults 40 years and older (48.8%), African Americans (41.7%), and English speakers (85.9%) than the scoring cohort (61.8%, 45.8%, 36.6%, and 76.4%, respectively). Disease prevalence was higher in our training cohort than our scoring cohort across all included conditions. Prior alcohol use disorder and other substance use disorder were consistent across the 3 cohorts, whereas prior OUD increased in prevalence from the training (1.7%) to scoring (2.4%) cohorts. In the training model, 6269 members developed an OUD in 2017 (2.0%), which remained constant in 2018 (2.1%). Results are displayed in Table 3.
Factors Associated With Risk of OUD
Using multivariate logistic regression (Figure 2), we found that Caucasians (odds ratio [OR], 1.65; 95% CI, 1.53-1.77), men (OR, 1.57; 95% CI, 1.48-1.67), and those younger than 40 years (aged 40-64 years compared with 18-39 years: OR, 0.75; 95% CI, 0.67-0.85; ≥ 65 years compared with 18-39 years: OR, 0.56; 95% CI, 0.42-0.75) had greater odds of being diagnosed with an OUD. Members who do not qualify for Medicaid through Supplemental Security Income (SSI) benefits had 12% higher odds of having OUD than those with SSI (95% CI, 1.05-1.20), and members with an SDOH vulnerability had 26% higher odds of having an OUD than those without a documented SDOH issue (95% CI, 1.13-1.40). Clinically, diagnoses of bipolar disorder (OR, 1.39; 95% CI, 1.27-1.51), depression (OR, 1.82; 95% CI, 1.69-1.97), seizures (OR, 1.28; 95% CI, 1.20-1.37), anxiety (OR, 1.49; 95% CI, 1.49-1.60), sickle cell disease (OR, 1.69; 95% CI, 1.14-2.48), and HIV (OR, 1.58; 95% CI, 1.35-1.85) were associated with greater odds of having an OUD.
We found that members with documented tobacco use (OR, 2.68; 95% CI, 2.51-2.86), prior substance use disorder (OR, 1.69; 95% CI, 1.54-1.85), or prior OUD (OR, 17.79; 95% CI, 16.50-19.18) had greater odds of an OUD diagnosis than members without those traits. In regard to prescribing behavior, we found that having an opioid dose greater than 120 MME for 5 contiguous days increased odds of OUD by 1.87 times (95% CI, 1.69-2.07) and that a 30-day or longer supply of opioids increased the odds by 1.56 times (95% CI, 1.43-1.71). Prescriptions for long-acting opioids and short-acting opioids also increased the odds of a member developing an OUD by 2.8 (95% CI, 2.46-3.20) and 1.1 (95% CI, 1.01-1.20) times, respectively, compared with not having an opioid prescription.
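The odds ratios reported in this section relate to the underlying logistic regression coefficients through a logarithm. The sketch below shows that relationship and the "percent higher odds" phrasing. It assumes Wald-type intervals that are symmetric on the log-odds scale with z = 1.96, which the text does not confirm; the function names are hypothetical.

```python
import math
from typing import Tuple


def log_odds_summary(odds_ratio: float, ci_low: float, ci_high: float,
                     z: float = 1.96) -> Tuple[float, float]:
    """Recover an approximate coefficient and standard error from an OR and its 95% CI.

    Assumes the interval was computed as exp(beta +/- z * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def percent_change_in_odds(odds_ratio: float) -> float:
    """Express an odds ratio as a percent change in odds (eg, 1.26 -> about 26% higher)."""
    return (odds_ratio - 1.0) * 100.0


# Values as reported for Caucasian members: OR, 1.65; 95% CI, 1.53-1.77.
beta, se = log_odds_summary(1.65, 1.53, 1.77)
```

Note that an OR describes a multiplicative change in odds, not in probability; the two diverge as the baseline probability rises.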
Using our model to score the 2018 cohort, we identified 6210 members (1.05%) with higher than a 50% predicted probability of an OUD diagnosis in 2019. All predicted probability categories and count of identified members can be found in eAppendix Table 2.
This study highlights key demographic, clinical, and prescribing predictors of OUD in an adult Medicaid population. We used administrative claims data to develop machine learning models, selecting logistic regression as our statistical approach due to its high ROC index, low misclassification rate, and ease of interpretability. Overall, we found that male members, those younger than 40 years, and those with a behavioral health diagnosis, HIV, or a history of various substance use disorders (tobacco, opioid, or other substances) had increased odds of developing an OUD. Related to care delivery, negative prescribing patterns were also associated with OUD diagnosis. These results align with previously conducted studies of commercial and general populations and other national statistics on those affected by the opioid epidemic.5,9-12,15
Specific to our Medicaid population, we found that individuals eligible for Medicaid due to income alone are more likely to have OUD than those eligible for SSI benefits (aged, disabled, and/or blind). SDOH vulnerability was also a strong predictor of OUD. Medicaid beneficiaries are disproportionately affected by OUD, and this finding indicates that those members experiencing greater social and economic hardships are at even further risk. Our results parallel those of a recent study, which showed that areas experiencing a major economic downturn observed an increase in opioid-related deaths.16 As a Medicaid MCO, our approach cannot simply focus on clinical need and utilization. Rather, we have to continue to address the structural barriers affecting quality of life and overall well-being through holistic, person-centered solutions and health and social service provider–payer collaboration.
As we begin to incorporate predictive analytics in our care management strategy, it is imperative that specific consideration be paid to how the outreach conversations are conducted. Because the identified members have not yet been diagnosed with an OUD, and due to the overall sensitive nature of the condition, the focus should not be on the management of disease. We need to focus on the modifiable risk factors as identified in this study, such as tobacco use, opioid prescription use, and SDOH vulnerability. Our goal is to provide our care managers with actionable ways to help build the member’s capacity for healthier behaviors and lifestyles. For example, by providing smoking cessation support, we can introduce skills for managing addiction and eliminate a negative habit. The proactive outreach will help us to manage risk factors that may predispose a member to an OUD. It also builds the relationship of trust and support necessary for effective care management in the future.
Although our model specificity is high, the sensitivity of the model is low as a result of the low prevalence of OUD in our population. As an MCO interested in avoiding adverse outcomes, we are strategically overincluding members with a probability of developing OUD greater than 50% on our outreach list and accurately excluding those with extremely low risk of OUD incidence. By using this conservative probability estimate, we believe we are best able to proactively engage members to prevent OUD from ever occurring. We also decided to include prior OUD as a covariate in our model in order to catch all members and address the realities of practice. The Medicaid population has high need and high demand on their time, which can make it difficult to maintain engagement in care management programs. The inclusion of individuals with a prior OUD diagnosis in our analysis provides another opportunity for engagement. We can cross-check the list of members generated through this model to those actively working with a care manager to focus efforts on nonengaged members.
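The list-building and cross-check steps described above amount to a threshold filter followed by a set difference. This is a minimal sketch under assumed inputs: the member identifiers, the dictionary of predicted probabilities, and the function name are all hypothetical.

```python
from typing import Iterable, List, Mapping


def outreach_list(predicted_probability: Mapping[str, float],
                  actively_engaged: Iterable[str],
                  threshold: float = 0.5) -> List[str]:
    """Return members above the probability threshold who are not
    already working with a care manager, in sorted order."""
    flagged = {
        member_id
        for member_id, probability in predicted_probability.items()
        if probability > threshold  # strictly greater than 50%
    }
    return sorted(flagged - set(actively_engaged))
```

Members with a prior OUD diagnosis who are flagged but absent from the engaged set would surface here as candidates for re-engagement.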
Additionally, we recognize that care is local, and program structure and prior authorization policies vary across states. We work closely with all our state partners to improve appropriate utilization, collaborate with community organizations, and introduce auxiliary services as needed. Our care managers often come from the same community as the members they are serving. This provides an invaluable asset to our organization in connecting members with local resources and programs. We intend to work with our state partners to continue to support a comprehensive and person-centered approach to OUD.
Our organizational response to the opioid epidemic provided the catalyst to examine the integration of predictive analytics into care management processes, which traditionally focus outreach on the members at highest risk. We are currently advancing our analytic capabilities and evaluating how data are informing our person-centered approach, and we will be applying the list to future member engagement strategies. This study helped to build the necessary infrastructure to create a manageable list for outreach. The approach can be replicated for other adverse events and generate actionable lists for intervention.
As we integrate the results into our workflow, we will continue to refine our machine learning models to improve predictive accuracy and to indicate which members experience significant increases in probability of OUD development. Results will inform and modify future person-centered outreach strategies.
We were limited by our use of administrative claims data, which allowed us to only measure the impact of observable characteristics and view the outcome among members receiving care. Next, we utilized data from an entire observation year to predict a member’s probability of OUD in the following year. This created variation in the window of time from observation to diagnosis of OUD, weakening the link between specific predictor variables and the outcome. Additionally, OUD prevalence in our population is low, which has resulted in an elevated false-positive rate. We believe that with more data over time, we will be able to refine the model and correct that issue.
We assumed that the disease indicators generated by the Johns Hopkins ACG Care Analyzer represent lifelong conditions. Similarly, we chose to include an indicator for SDOH collected during outreach by a care manager, resulting in a large number of missing values. We combined missing and “none identified,” which likely underreports SDOH issues among our membership.
In the development of the models, we chose to ignore interactions and focus on main effects to ease data interpretation. We removed some chronic conditions common in the Medicaid population (eg, diabetes, asthma) from our final model, which did not affect model robustness. The inclusion of line of business indicators as proxies for state-level effects only slightly increased the model performance (C = 0.917) and did not affect the ORs (eAppendix Table 3). Because of this, we chose to exclude state-level proxies. We also opted to use internally collected person-level SDOH data as opposed to state-level socioeconomic data due to our person-centered unit of analysis and our sole focus on the Medicaid population, whose unique experiences and vulnerabilities are better captured through personal outreach. Finally, we conducted some sensitivity analyses in which prior OUD diagnosis was removed (eAppendix Table 4). The model’s significant predictors and performance remained similar.
As states, health plans, and providers collaborate to reduce the impact of OUD on communities, it is critical to effectively target and deploy resources. We decided to more proactively address the physical, behavioral, and social risk factors associated with the development of OUD. Through machine learning methodologies, we generated predicted probabilities of OUD among our members, creating a concise list of members with greater than a 50% probability of developing OUD over the next year. We will use this list to inform strategic person-centered outreach and enhance our internal efficiency.
The authors would like to thank David Keleti, PhD, for his contributions to the early development of this project.
Author Affiliations: AmeriHealth Caritas Family of Companies (WG, CL, YC, JJ, PM), Philadelphia, PA.
Source of Funding: None.
Author Disclosures: The authors report no relationship or financial interest with any entity that would pose a conflict of interest with the subject matter of this article.
Authorship Information: Concept and design (WG, JJ, PM); acquisition of data (WG, YC); analysis and interpretation of data (WG, CL, YC); drafting of the manuscript (WG, CL); critical revision of the manuscript for important intellectual content (WG, CL, YC, JJ, PM); statistical analysis (WG); provision of patients or study materials (WG); administrative, technical, or logistic support (WG, JJ, PM); and supervision (WG, JJ, PM).
Address Correspondence to: Wanzhen Gao, PhD, MD, MPH, AmeriHealth Caritas Family of Companies, 200 Stevens Dr, Philadelphia, PA 19113. Email: firstname.lastname@example.org.
1. Trends & statistics: overdose death rates. National Institute on Drug Abuse. Accessed September 15, 2019. https://www.drugabuse.gov/related-topics/trends-statistics/overdose-death-rates
2. Diagnostic and Statistical Manual of Mental Disorders. 5th ed. American Psychiatric Association; 2013.
3. Orgera K, Tolbert J. The opioid epidemic and Medicaid’s role in facilitating access to treatment. Kaiser Family Foundation. May 24, 2019. Accessed December 1, 2020. https://www.kff.org/medicaid/issue-brief/the-opioid-epidemic-and-medicaids-role-in-facilitating-access-to-treatment/
4. State health facts: health insurance coverage of adults 19-64: 2018. Kaiser Family Foundation. Accessed September 15, 2019. https://www.kff.org/other/state-indicator/adults-19-64
5. Medicaid and the opioid epidemic. In: Medicaid and CHIP Payment and Access Commission. Report to Congress on Medicaid and CHIP. Medicaid and CHIP Payment and Access Commission; 2017.
6. Bohm MK, Bridwell L, Zibbell JE, Zhang K. Heroin and healthcare: patient characteristics and healthcare prior to overdose. Am J Manag Care. 2019;25(7):341-347.
7. Clemans-Cope L, Epstein M, Lynch V, Winiski E. Rapid growth in Medicaid spending and prescriptions to treat opioid use disorder and opioid overdose from 2010 to 2017. Urban Institute. February 2019. Accessed September 17, 2019. https://www.urban.org/sites/default/files/publication/99798/rapid_growth_in_medicaid_spending_and_prescriptions_to_treat_opioid_use_disorder_and_opioid_overdose_from_2010_to_2017_1.pdf
8. Projections of national expenditures for treatment of mental and substance use disorders, 2010-2020. Substance Abuse and Mental Health Services Administration. October 2014. Accessed December 1, 2020. https://store.samhsa.gov/product/Projections-of-National-Expenditures-for-Treatment-of-Mental-and-Substance-Use-Disorders-2010-2020/SMA14-4883
9. Glanz JM, Narwaney KJ, Mueller SR, et al. Prediction model for two-year risk of opioid overdose among patients prescribed chronic opioid therapy. J Gen Intern Med. 2018;33(10):1646-1653. doi:10.1007/s11606-017-4288-3
10. Ellis RJ, Wang Z, Genes N, Ma’ayan A. Predicting opioid dependence from electronic health records with machine learning. BioData Min. 2019;12:3. doi:10.1186/s13040-019-0193-0
11. Cochran BN, Flentje A, Heck NC, et al. Factors predicting development of opioid use disorders among individuals who receive an initial opioid prescription: mathematical modeling using a database of commercially-insured individuals. Drug Alcohol Depend. 2014;138:202-208. doi:10.1016/j.drugalcdep.2014.02.701
12. Dufour R, Mardekian J, Pasquale MK, Schaaf D, Andrews GA, Patel NC. Understanding predictors of opioid abuse: predictive model development and validation. Am J Pharm Benefits. 2014;6(5):208-216.
13. Lo-Ciganic WH, Huang JL, Zhang HH, et al. Evaluation of machine-learning algorithms for predicting opioid overdose risk among Medicare beneficiaries with opioid prescriptions. JAMA Netw Open. 2019;2(3):e190968. doi:10.1001/jamanetworkopen.2019.0968
14. Initiation and Engagement of Alcohol and Other Drug Abuse or Dependence Treatment (IET). National Committee for Quality Assurance. Accessed January 3, 2020. https://www.ncqa.org/hedis/measures/initiation-and-engagement-of-alcohol-and-other-drug-abuse-or-dependence-treatment/
15. Hasan MM, Noor-E-Alam M, Patel MR, Sasser A. A novel big data analytics framework to predict the risk of opioid dependency. arXiv. Preprint posted online May 31, 2020.
16. Venkataramani AS, Bair EF, O’Brien RL, Tsai AC. Association between automotive assembly plant closures and opioid overdose mortality in the United States: a difference-in-differences analysis. JAMA Intern Med. 2020;180(2):254-262. doi:10.1001/jamainternmed.2019.5686