Predicting High-Need Cases Among New Medicaid Enrollees

Self-reported health measures embedded in a Medicaid application can form the basis of a predictive model that identifies new and returning enrollees at risk of high healthcare utilization.
Published Online: October 08, 2014
Lindsey Jeanne Leininger, PhD; Donna Friedsam, MPH; Kristen Voskuil, MA; and Thomas DeLeire, PhD
Objectives
To assess the ability of a short, self-reported health needs assessment (HNA) collected at the time of Medicaid enrollment to predict subsequent utilization and costs.

Study Design
Retrospective cohort study.

Methods
We analyzed individual-level data that included self-reported HNAs, medical care encounter records, and administrative eligibility records for 34,087 childless adult Medicaid enrollees in Wisconsin, covering the period 2009-2010. High need was operationalized using the following outcome variables measured over the first year of program enrollment: having an inpatient admission; membership in the top decile of emergency department (ED) utilization; and membership in the top cost decile. We assessed the ability of the HNA to predict high-need cases using several complementary methods: the C-statistic; integrated discrimination improvement; and sensitivity, specificity, and positive predictive value resulting from multivariate logistic regression estimates.
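As an illustration of the performance measures named above, the following minimal Python sketch computes a C-statistic and threshold-based sensitivity, specificity, and positive predictive value from a vector of predicted probabilities. It is not the authors' code: the function names, the toy data, and the 0.5 cutoff are assumptions made for the example, and the predicted probabilities are taken as given rather than estimated from a logistic regression.

```python
from itertools import product


def c_statistic(y_true, y_score):
    """Share of (case, non-case) pairs in which the case has the higher
    predicted score; tied pairs count as one half."""
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    noncases = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c, n in product(cases, noncases)
    )
    return wins / (len(cases) * len(noncases))


def classification_metrics(y_true, y_score, threshold):
    """Sensitivity, specificity, and PPV when members with a score at or
    above `threshold` are flagged as predicted high-need."""
    flagged = [s >= threshold for s in y_score]
    tp = sum(1 for y, f in zip(y_true, flagged) if y == 1 and f)
    fp = sum(1 for y, f in zip(y_true, flagged) if y == 0 and f)
    fn = sum(1 for y, f in zip(y_true, flagged) if y == 1 and not f)
    tn = sum(1 for y, f in zip(y_true, flagged) if y == 0 and not f)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # PPV is undefined when no member is flagged.
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
    }


# Toy data (hypothetical): 1 = high-need case, 0 = not.
outcomes = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.90]

print(c_statistic(outcomes, scores))
print(classification_metrics(outcomes, scores, threshold=0.5))
```

The pairwise definition is quadratic in sample size; for a sample of tens of thousands, a rank-based computation would be preferable, but the pairwise form makes the interpretation of the C-statistic explicit.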

Results
Using the HNA along with sociodemographic measures met the Hosmer-Lemeshow criterion for adequate predictive performance for the high ED and high cost outcomes (C-statistics of 0.74 and 0.72, respectively). The HNA was associated with large improvements in predictive performance over sociodemographic measures alone for all 3 dependent variables (integrated discrimination improvement of 182%, 413%, and 300% for ED, cost, and inpatient variables, respectively). The HNA also led to considerable improvements in sensitivity and positive predictive value with no resulting decreases in specificity or negative predictive value.
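Because the integrated discrimination improvement (IDI) is reported here as a percentage, one plausible reading is that it is a relative IDI: the proportional gain in the discrimination slope (mean predicted probability among cases minus mean predicted probability among non-cases) when the HNA is added to the sociodemographic model. This section does not state the formula, so the sketch below is an assumption about the calculation, with hypothetical function names and toy probabilities.

```python
from statistics import mean


def discrimination_slope(y_true, y_prob):
    """Mean predicted probability among cases minus the mean among non-cases."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    noncases = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")
    return mean(cases) - mean(noncases)


def relative_idi(y_true, p_base, p_full):
    """Proportional change in discrimination slope from the base model
    (assumed: sociodemographics only) to the full model (assumed: plus HNA)."""
    base = discrimination_slope(y_true, p_base)
    full = discrimination_slope(y_true, p_full)
    if base == 0:
        raise ValueError("base discrimination slope is zero")
    return (full - base) / base


# Toy example (hypothetical numbers): the slope rises from 0.1 to 0.3,
# which corresponds to a relative IDI of roughly 200%.
outcomes = [1, 1, 0, 0]
p_sociodemographic = [0.2, 0.2, 0.1, 0.1]
p_with_hna = [0.5, 0.3, 0.1, 0.1]
print(f"{relative_idi(outcomes, p_sociodemographic, p_with_hna):.0%}")
```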

Conclusions
Collecting self-reported health measures for a Medicaid expansion population can yield data of sufficient quality for predicting high-need cases.

Am J Manag Care. 2014;20(9):e399-e407
Take-Away Points

Predictive models of new enrollees at risk for high healthcare utilization were developed using data from a self-reported health needs assessment (HNA) administered as part of the Medicaid application.

• Self-reported HNA data can be used successfully by Medicaid agencies to prospectively classify individuals by risk of high healthcare utilization.
• Self-reported HNA data have promise for building predictive models for new and returning Medicaid populations about whom the program lacks recent utilization history.
• As large numbers of individuals lacking insurance histories enter Medicaid under the Affordable Care Act, states will need to develop such models.
Medicaid programs provide care to a population with widely varying healthcare needs. Because of these variations, appreciable benefits accrue from the ability to prospectively stratify patients into clinically distinct subgroups. Related applications, including targeted case management and the establishment of risk-adjusted performance benchmarks for providers, are key tools in efforts to transform Medicaid into an outcomes-focused payer.1,2

While states differ in the extent to which they employ such techniques for their Medicaid programs,3 they all share the key constraint of lacking information on prior medical history for new enrollees, including the large expansion populations enrolled under the Affordable Care Act. Moreover, Medicaid enrollment is characterized by high levels of churn in coverage status,4,5 further complicating the challenge Medicaid agencies face in garnering recent medical histories of their members. For both new and returning program applicants, self-reported health measures collected at the time of enrollment may be the only practical means of gathering such data. To date, there is minimal evidence regarding whether states’ enrollment systems are capable of meeting the data collection task and whether the resulting data are of sufficient quality to be used for predicting high-need cases.

A recent Medicaid expansion in Wisconsin provides a unique opportunity to assess whether self-reported health measures gathered from an existing Medicaid enrollment system can provide clinically meaningful information. Wisconsin’s Medicaid program, in expanding managed care coverage to childless adults in 2009, required that applicants complete a self-reported health needs assessment (HNA) in addition to providing the sociodemographic information typically required for program enrollment.6 Our study uses administrative data from this expansion population to assess the predictive value of collecting self-reported health measures at the time of application—a novel use of Medicaid enrollment systems. To our knowledge, this is the first paper to explore the promise of using Medicaid enrollment systems data in this capacity.

Our paper tests the following 2 hypotheses:
1. HNA data considerably improve the ability to predict utilization and costs incurred over the first year of Medicaid enrollment, relative to the predictive performance of sociodemographic data typically collected by Medicaid agencies at the time of application;

2. A prediction tool comprising a combination of HNA and sociodemographic measures meets accepted thresholds of predictive ability for utilization and cost outcomes.

Assessing the predictive ability of the HNA data provides an instructive case study for other states’ Medicaid agencies, as limited empirical evidence exists regarding the predictive capacity of self-reported health measures among Medicaid members. We hypothesize that self-reported health measures are meaningfully predictive of high resource utilization among Medicaid members, in keeping with the related literature demonstrating the appreciable predictive ability of self-reported HNA instruments among populations served by Medicare and the Department of Veterans Affairs (VA).7,8

Medicaid programs nationwide have considerable experience using claims and/or encounter data for a variety of actuarial and quality measurement purposes.9 In contrast, Medicaid agencies lack experience collecting self-reported health data as part of the Medicaid application process. The potential relative benefits of this mode of data collection are large, as the marginal cost of collecting health data at enrollment is appreciably lower than fielding a population-based survey or establishing and maintaining an encounter database suitable for analytic purposes. However, there is great concern about and little evidence regarding the quality of the resulting self-reported data. Poor health status and/or poor literacy may potentially preclude enrollees from accurate reporting.10 Moreover, despite Medicaid agencies making explicit promises to the contrary, enrollees may fear that their answers could affect their eligibility for certain services.11 The presence of these and other unknown (and potentially unknowable) data quality threats demands a careful empirical examination of whether an enrollment-based data collection technique can indeed generate health-related information of sufficient caliber for programmatic purposes.


Data and Sample

Data from 2 state administrative systems were merged to construct the sample: the Client Assistance for Re-employment and Economic Support System (CARES), which stores all social program applications, and InterChange, which warehouses all claims and encounter data for Wisconsin Medicaid members. The study sample was drawn from the 48,460 enrollees who applied for the waiver program between its launch in July 2009 and the subsequent imposition of an enrollment freeze in October 2009, and who were enrolled in coverage for at least 1 year.

While the Department of Health Services (DHS) had initially intended that all waiver enrollees complete an HNA, logistical constraints precluded universal administration. As such, the analytic sample was limited to the 34,087 members who completed an HNA at the time of enrollment. These members made up 70% of the relevant population entering the program during the study period. DHS agency officials have shared with us that in some months case workers processing phone applications had to sacrifice HNA completion in favor of expediency, given the unanticipated volume of applicants (conversation with Linda McCart, director, DHS Policy and Research Section, July 2012). Members with and without HNA information have similar racial and ethnic backgrounds, but differ with respect to age and sex, with HNA respondents being older and disproportionately female (eAppendix Table). While HNA completion was not universal, the completion rate compares favorably to that achieved by a similar pilot study assessing the predictive ability of a self-reported health screener collected from a VA population,8 which had a coverage rate of roughly 40%.


Emergency department (ED) visits and inpatient utilization were chosen as the primary outcomes of interest, as both of these types of care have long been the focus of Medicaid case management efforts12 and subsets of both (eg, ambulatory sensitive ED visits and hospital readmissions) are widely recognized as potential healthcare performance indicators.13,14 Accordingly, they are also the most commonly considered utilization outcomes for predictive modeling applications in Medicaid.11 Medicaid case management programs often seek to target the highest-cost cases15; as such, we examined the incurrence of high costs as an additional outcome of interest. We operationalized the dependent variables by creating the following 3 binary indicators measured over a member’s first year of Medicaid enrollment: membership in the top decile of ED utilization, which reflects having 3 or more ED visits; having at least 1 inpatient hospitalization (similar to a top decile measure, as 9.2% of the sample experienced an inpatient event); and membership in the top cost decile, which represents costs of at least $6360.
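The three binary outcome indicators described above can be sketched as follows. The cutoffs (3 or more ED visits, at least 1 inpatient hospitalization, costs of at least $6360) come from the text; the record layout, field names, and function name are hypothetical and chosen only for illustration. The sketch applies the reported cutoffs directly rather than recomputing deciles from the sample.

```python
from dataclasses import dataclass

# Cutoffs reported in the text for the first year of enrollment.
ED_VISIT_CUTOFF = 3       # top decile of ED utilization
COST_CUTOFF = 6360.0      # top cost decile, in dollars


@dataclass
class FirstYearUse:
    """Hypothetical per-member summary of first-year utilization."""
    ed_visits: int
    inpatient_stays: int
    total_cost: float


def high_need_flags(record: FirstYearUse) -> dict:
    """Return the three 0/1 outcome indicators for one member."""
    return {
        "high_ed": int(record.ed_visits >= ED_VISIT_CUTOFF),
        "any_inpatient": int(record.inpatient_stays >= 1),
        "high_cost": int(record.total_cost >= COST_CUTOFF),
    }


# Example member (made-up values): frequent ED use, no admission,
# costs just under the cutoff.
print(high_need_flags(FirstYearUse(ed_visits=3, inpatient_stays=0, total_cost=6359.99)))
```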


We estimated the predictive ability of 7 different sets of predictors, the first of which consisted of a standard set of sociodemographic variables currently collected by Medicaid enrollment systems (see Table 1 for the complete list). Each of the remaining blocks of predictors included both the sociodemographic variables and additional variables drawn from the HNA (see eAppendix Figure for details on exact wording and progression of HNA measures). The second set of predictors included sociodemographics plus dummy variables reflecting the presence of the following conditions enumerated in the HNA: asthma; cancer; chronic obstructive pulmonary disease; depression; diabetes; emphysema; heart problems; high blood pressure; other mental health condition; and stroke. The third set included sociodemographics plus self-reported measures of behavior captured in the HNA: an indicator reflecting smoking status and an indicator reflecting problem alcohol or other drug use. The fourth set was sociodemographics plus a dummy variable reflecting high prescription drug use, measured as using 5 or more prescription drugs. Access to care indicators that reflected having a regular doctor and a regular clinic comprised, along with sociodemographics, the fifth set; sociodemographics plus a measure representing the previous year’s utilization, operationalized as having experienced an ED visit or hospitalization for one of the HNA-enumerated conditions, comprised the sixth. The seventh set of predictors was the entire vector of HNA measures (conditions + behavior + prescriptions + access to care + previous year’s utilization) plus sociodemographics.
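The nesting of the 7 predictor sets can be made concrete with a short sketch. All column names below are hypothetical; in particular, the sociodemographic entries are placeholders, since the actual variables are listed in Table 1 and are not reproduced in this section.

```python
# Placeholder sociodemographic columns (assumed; the real list is in Table 1).
SOCIODEMOGRAPHICS = ["age", "sex", "race_ethnicity"]

# HNA blocks as described in the text; column names are illustrative.
HNA_BLOCKS = {
    "conditions": [
        "asthma", "cancer", "copd", "depression", "diabetes", "emphysema",
        "heart_problems", "high_blood_pressure", "other_mental_health", "stroke",
    ],
    "behavior": ["smoker", "problem_alcohol_or_drug_use"],
    "prescriptions": ["five_plus_prescriptions"],
    "access_to_care": ["regular_doctor", "regular_clinic"],
    "prior_utilization": ["prior_year_ed_or_hospitalization"],
}


def build_predictor_sets():
    """Return the 7 predictor sets: sociodemographics alone, sociodemographics
    plus each single HNA block, and sociodemographics plus all HNA blocks."""
    sets = {"1_sociodemographics": list(SOCIODEMOGRAPHICS)}
    for number, (block, columns) in enumerate(HNA_BLOCKS.items(), start=2):
        sets[f"{number}_{block}"] = SOCIODEMOGRAPHICS + columns
    all_hna = [col for columns in HNA_BLOCKS.values() for col in columns]
    sets["7_full_hna"] = SOCIODEMOGRAPHICS + all_hna
    return sets


for name, columns in build_predictor_sets().items():
    print(name, len(columns))
```

Each set would then be passed to the same regression specification, so that differences in predictive performance can be attributed to the added HNA block.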

Statistical Analysis
