Risk Classification of Medicare HMO Enrollee Cost Levels Using a Decision-Tree Approach

Published Online: February 01, 2004
Roger T. Anderson, PhD; Rajesh Balkrishnan, PhD; and Fabian Camacho, MS

Objective: To determine whether classification tree techniques used on survey data collected at enrollment from older adults in a Medicare HMO could predict the likelihood of an individual being in a low, medium, high, or very high cost group in the subsequent year.

Methods: Data from comprehensive health risk assessment (HRA) screening of 11 744 new enrollees (age ≥65 years) continuously enrolled in a Medicare HMO for at least 24 months between 1997 and 2002 were combined with complete healthcare service utilization data for each enrollee in the postenrollment period to create cost groups. An original clinically devised algorithm and a Classification and Regression Tree (CART) that used the HRA data were compared with respect to their ability to correctly place an individual within the cost groups over 12 months of enrollment.

Results: The variables that best classified enrollees into 12-month cost groups included quality of life, age, and preenrollment health services. Classification was best for CART: a sensitivity of 39.8% and a specificity of 83% were achieved for very high cost enrollees. Compared with the clinical algorithm, CART also used only one third as many predictor variables to define risk categories.

Conclusions: Brief self-report survey data collected at enrollment can be modeled to provide moderate sensitivity in predicting very high costs in the postenrollment year. The ease of applicability of software to create empirically derived profiles of high-cost enrollees precludes the need to rely on clinically devised algorithms to identify enrollees at risk for very high costs.

(Am J Manag Care. 2004;10(part 1):89-98)

Providing timely healthcare to an aging population is of increasing concern for health policymakers in the United States. Persons who survive to age 65 years today have a life expectancy of nearly 18 more years. A large number will require care for chronic diseases such as arthritis, hypertension, diabetes, and chronic obstructive pulmonary disease. These highly prevalent conditions among noninstitutionalized older adults (age ≥65 years) account for a large portion of healthcare expenditures for this age group, and contribute to an overall higher rate of consumption of healthcare resources among this age group than among any other.1,2 In 1996, the average older adult had nearly 12 physician visits and 6.5 hospital days per year, and older adults as a group accounted for 36% of all hospital stays and 49% of all hospital days.3,4 At the same time, managed care enrollment of older adults, which grew nearly 4-fold in the decade from 1989 to 1999, has begun to reverse, with more than 2 million enrollees involuntarily disenrolled from Medicare managed care plans that withdrew from the market because of financial constraints.3 This trend brings into renewed focus the need to provide care cost-effectively by identifying high-risk older adults and providing interventions to eliminate potentially avoidable medical outcomes and the associated resource utilization.

Because the health and well-being of older adults depend on both access to and timing of components and outcomes associated with care,5,6 care management efforts increasingly rely on health risk assessments to characterize enrollees' healthcare needs and the likelihood of adverse medical outcomes (eg, hospitalization, emergency care). Traditionally, "risk" has been defined in the context of becoming a high-cost user of medical care, or the propensity of enrollees to consume healthcare resources in a particular time period (eg, on an annual basis), based on total payments or billed costs.2 Individuals deemed to be at "high risk" may be referred to appropriate disease risk management programs.7-9

Although it is congressionally mandated that all Medicare health maintenance organizations (HMOs) in the country conduct risk screening, there is little guidance on which strategy for identifying risk is most effective and beneficial to enrollees. A practical challenge is implementing a valid and reliable risk assessment process early in the enrollee's health plan tenure, when it can be most useful. Whereas claims-based strategies and clinician assessment require sufficient time in-plan to accrue patient information,10-12 self-report information supports early risk classification because it enables data collection near the time of enrollment and allows inclusion of elements such as disability, self-rated health, and behavioral health variables that are known to predict service utilization13,14 but are not routinely recorded in medical information systems. Potential disadvantages of self-report information are the possibly low sensitivity of self-reported diagnosed and treated conditions,15,16 and imprecise collection of information on disease severity.

Given the potential value of self-report screening data, a critical challenge is constructing an algorithm from the output of statistical models that maximizes the sensitivity and specificity of the cost group classification. Most classification studies in the literature do not involve cost groups, but have relied on basic regression techniques to identify predictors of service utilization.8,9,11 Although these procedures are sound and valuable tools for deriving a "best set of predictors" of the outcome and the percent variation explained, the model parameters that relate to probability do not easily lend themselves to risk categorization because they provide no information on the accuracy of prediction of cost groups in terms of sensitivity and specificity, and because they may not address complex interactions among variables. As a result, there is considerable uncertainty over the validity and precision of converting regression-weighted scores into decision rules to classify new members into actionable risk-propensity groups.
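To make the accuracy criteria concrete, the following is a minimal Python sketch of how sensitivity and specificity might be computed for one cost group treated as the target class against all others. The function name and the toy labels are hypothetical and invented for illustration; they are not study data and are not claimed to reproduce the authors' computations.

```python
def sensitivity_specificity(actual, predicted, target):
    """One-vs-rest sensitivity and specificity for a single cost group."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == target and p == target:
            tp += 1          # in the target group and classified into it
        elif a == target:
            fn += 1          # in the target group but missed
        elif p == target:
            fp += 1          # not in the target group but flagged
        else:
            tn += 1          # correctly left out of the target group
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels, invented for illustration only (not study data).
actual = ["low", "medium", "very high", "very high",
          "high", "low", "very high", "medium"]
predicted = ["low", "low", "very high", "high",
             "very high", "low", "very high", "medium"]

sens, spec = sensitivity_specificity(actual, predicted, "very high")
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example, 2 of the 3 truly very high cost cases are identified, and 4 of the 5 other cases are correctly excluded.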

This study sought to develop and test a risk classification tool for use at the time of enrollment based on its ability to correctly classify individuals into cost groups. Subjects were older adults enrolled in a Medicare managed care plan in the southeastern region of the United States.
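As background on how a classification tree forms its decision rules, the sketch below searches for the single threshold on one predictor that minimizes the weighted Gini impurity of the two resulting groups, which is the usual splitting criterion in CART. This is an assumption-laden illustration: the source does not state which splitting criterion or software settings were used, and the variable names and toy values are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`.

    Returns (None, parent_gini) when no split improves on the unsplit group.
    """
    best = (None, gini(labels))
    candidates = sorted(set(values))
    # Candidate thresholds are midpoints between adjacent distinct values.
    for lo, hi in zip(candidates, candidates[1:]):
        t = (lo + hi) / 2
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data: preenrollment visits and next-year cost group.
prior_visits = [0, 1, 1, 2, 5, 6, 7, 9]
cost_group = ["low", "low", "low", "low", "high", "high", "high", "high"]

threshold, impurity = best_threshold(prior_visits, cost_group)
print(threshold, impurity)
```

A full tree repeats this search over every predictor and then recursively within each resulting subgroup, which is how interactions among variables enter the decision rules.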


Study Population

A prospective cohort was drawn from 11 744 newly enrolled members in a Medicare HMO who completed the comprehensive health risk assessment battery on entry into the plan and for whom complete healthcare service utilization data were available for at least 2 years postenrollment. Claims data were used to follow this cohort annually for 2 years postenrollment. The health plan was a Medicare managed care plan in the southeastern region of the United States, and was the sole provider of medical care to enrollees ("lock-in" risk benefit plan). For the purposes of this analysis, we included only subjects who had not died, moved away, or switched to a fee-for-service plan during the study follow-up period.

Each enrollee was sent a mailed questionnaire that asked questions about demographics, clinical conditions, health service utilization in the year before joining the plan, lifestyle and health behaviors, depression, disability, and health-related quality of life. Since its inception in 1996, the response rate for this questionnaire has been more than 80%. Information from the survey data was then used to derive risk profiles of patients based on a clinician-derived algorithm, which was then provided to case managers for disease state management of high-risk patients.

Demographic characteristics and health status of this population are shown in Table 1. Subjects were more likely to be female (58.7%) and had an average age of 75.1 years on enrollment. About 75% rated their general health as good to excellent, while 6% reported that their health had worsened during the preenrollment year.

Table 1

Health Risk Assessment Battery

The comprehensive risk assessment tool included self-reported information on conditions known ever to have been diagnosed (heart problems, stroke, chest pain, lung problems, asthma, arthritis, diabetes, poor circulation, hypertension, cancer, bladder problems, stomach/bowel problems, other); lifestyle risk (eg, smoking status, physical activity, alcohol), as adapted from the Centers for Disease Control and Prevention Behavioral Risk Factor Surveillance System; general health (the Medical Outcomes Study 12-Item Short-Form Health Survey [SF-12])17; functional status, assessed as impairments in instrumental activities of daily living (IADLs) and activities of daily living (ADLs) as determined by the Duke Older Americans Resources and Services (OARS) screening method18; depression (Center for Epidemiologic Studies Depression Scale [CES-D], short form)19; number of falls in the last year; current medication status (number and conditions treated); and use of healthcare services during the last year. This battery of measures represents key aspects of health considered to indicate need for or use of health services.

Study Variables

The responses to the patient questionnaire formed the basis for the demographic and health status variables in the study and are shown in Table 2. The demographic variables were age, sex, and living alone (as opposed to with others, including caregivers, spouse, relatives, etc). The health status variables were indicators for sedentary status (physical activity and walking for at least 30 minutes a week), falls during the previous year, smoking status, alcohol consumption, presence of depression (as indicated by a score of .06 on the CES-D short form), number of difficulties in ADLs and IADLs, number of conditions not treated before enrollment, number of comorbid conditions, perception of health status, and the physical and mental subscore components of the SF-12 questionnaire. Healthcare utilization was determined from the number of annual hospitalizations, emergency department visits, and healthcare payments in the 2-year postenrollment period; this information was obtained from enrollee claims data collected by the HMO. Total healthcare payments were obtained from administrative claims records and consisted of all recorded healthcare service utilization and prescription payments. We utilized claims paid by the HMO to calculate the costs associated with each enrollee.
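The derivation of annual payments from claims can be sketched as follows: paid amounts are summed per enrollee for each of the 2 postenrollment years. This is a minimal illustration under stated assumptions; the record layout (`enrollee_id`, `days_since_enrollment`, `paid`), the 365-day year boundaries, and the toy amounts are all hypothetical and are not taken from the HMO's claims system.

```python
from collections import defaultdict


def annual_payments(claims):
    """Sum paid amounts per (enrollee_id, postenrollment_year).

    Year 1 covers days 0-364 after enrollment and year 2 covers days 365-729
    (an assumed convention). Claims outside the 2-year window are ignored.
    """
    totals = defaultdict(float)
    for claim in claims:
        year = claim["days_since_enrollment"] // 365 + 1
        if year in (1, 2):
            totals[(claim["enrollee_id"], year)] += claim["paid"]
    return dict(totals)


# Hypothetical claim records, invented for illustration only.
claims = [
    {"enrollee_id": "A", "days_since_enrollment": 30, "paid": 120.0},
    {"enrollee_id": "A", "days_since_enrollment": 200, "paid": 80.0},
    {"enrollee_id": "A", "days_since_enrollment": 400, "paid": 50.0},
    {"enrollee_id": "B", "days_since_enrollment": 10, "paid": 1000.0},
    {"enrollee_id": "B", "days_since_enrollment": 800, "paid": 999.0},
]

totals = annual_payments(claims)
for key in sorted(totals):
    print(key, totals[key])
```

Per-enrollee totals of this kind would then be binned into cost groups; the cutoffs that define those groups are not given in this section, so none are assumed here.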

Table 2
