Risk-Stratification Methods for Identifying Patients for Care Coordination
Lindsey R. Haas, MPH; Paul Y. Takahashi, MD; Nilay D. Shah, PhD; Robert J. Stroebel, MD; Matthew E. Bernard, MD; Dawn M. Finnie, MPA; and James M. Naessens, ScD
With rising prevalence of chronic disease, patient-centered medical homes (PCMHs) were created to deliver higher-quality, more cost-effective primary care in the United States.1 A key component of PCMH is the implementation of care coordination, especially if targeted to the right people.2 While some states have developed risk-adjusted care coordination payments, methods for effective identification of individuals likely to benefit from care coordination are not clear.3 Patients with multiple chronic conditions may see many providers, often in an uncoordinated manner, adding to their increased risk of high healthcare utilization through test duplication, medication management conflicts, and medical errors.4-6 These patients are also at increased risk for hospital admissions and emergency department (ED) visits. Primary care doctors may not routinely have the time or resources to properly manage these complex patients who have multiple chronic health problems. Thus, care coordination for the right patients could decrease unnecessary care and prevent adverse outcomes.
Most existing PCMH programs identify care coordination patients through physician referrals.7,8 However, this method may not identify all potential beneficiaries. Care coordination should be focused on patients who will benefit most, maximizing the impact on both quality and costs. Risk-stratification models can be efficient tools for both providers and health plans to screen populations and select individuals for care management programs who are at risk of future hospitalization and functional decline.9-11 Several methods using administrative data have been described to help with the identification of high-risk patients.12 However, it remains unclear which method will perform best in identifying patients in need of care coordination. To answer this important and fundamental question, we compared the performance of 6 common risk-adjustment methods in predicting hospitalizations, ED visits, 30-day readmissions, and high expenditures. This study extends the existing literature on patient selection for care coordination with results from direct comparison of the following models: Adjusted Clinical Groups (ACGs), Hierarchical Condition Categories (HCCs), Elder Risk Assessment (ERA), Chronic Comorbidity Count (CCC), Charlson Comorbidity Index, and Minnesota Health Care Home Tiering (MN Tiering). The identification of an optimal method may guide implementation of care coordination programs within PCMHs. Furthermore, identifying these predictable high-utilization cases may lead to other effective interventions.
A retrospective cohort analysis compared multiple methods to identify high-risk patients. These methods were applied to all primary care patients 18 years or older empaneled to the Employee and Community Health (ECH) practice (family medicine, primary care internal medicine, and community pediatric and adolescent medicine) at Mayo Clinic, Rochester, Minnesota, the community-focused primary care arm of a large integrated multispecialty group practice.
To be a patient of ECH, one must be a Mayo Clinic employee or live in the community, identify an ECH primary care provider, and be enrolled within his/her panel. Participants assigned to a primary care provider for all 12 months in 2009 (base year) and throughout 12 months or until death in 2010 (assessment year) based on the electronic medical record were included in the analysis. The organized development of care coordination within ECH occurred during 2011; hence, usual care (encounter driven) was provided to all patients during the study. Subjects were excluded if they refused consent for medical record review in accordance with Minnesota state law.
All information was electronically abstracted from the electronic medical record and administrative databases within Mayo Clinic’s health records system. Mayo Clinic maintains all electronic medical record information within 1 system, including inpatient and outpatient visit information.13
Demographic variables collected in the base year included age, sex, marital status, and insurance status. Age was grouped into 4 categories: 18 to 44 years, 45 to 64 years, 65 to 84 years, and 85 years or older. Insurance status was also grouped into 4 categories: public (Medicare or Medicaid), private, no insurance, and unknown. Diagnosis codes (International Classification of Diseases, Ninth Revision, Clinical Modification [ICD-9-CM]) for each patient encounter, as well as utilization and cost information, were extracted from institutional billing data for year 2009. Number of hospital days was also retrieved from 2008 to calculate ERA scores. We included all diagnosis codes from hospitalizations, ED visits, and primary and specialty care evaluation and management visits.
Adjusted Clinical Groups. The ACG is a commonly used classification system based on administrative diagnosis data and developed by Johns Hopkins to measure morbidity.14 The ACG methodology was developed to predict the utilization of medical resources using the presence or absence of specific diagnoses from both inpatient and outpatient services for a specified period of time, along with age and sex, to classify each person into 1 of 93 discrete ACG categories with similar expected utilization patterns. Adjusted Clinical Groups can be used to improve accuracy and fairness in forecasting healthcare utilization15 and have been found to predict inpatient hospitalizations as well as or better than other case-mix tools in many health systems.9
Minnesota Tiering. The MN Tiering model is based on a product of ACGs, Major Extended Diagnostic Groups.3 MN Tiering is currently being used by the State of Minnesota to determine management fees for care coordination among medical home plans. The purpose of MN Tiering is to group patients into “complexity tiers” based on the number of major condition categories from which they suffer.3 The total sum of conditions is grouped into the following 5 patient complexity levels: low (tier 0): 0 conditions; basic (tier 1): 1 to 3 conditions; intermediate (tier 2): 4 to 6 conditions; extended (tier 3): 7 to 9 conditions; and complex (tier 4): 10 or more conditions.
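The tier boundaries described above can be sketched as a simple lookup. This is a minimal illustration, assuming the count of major condition categories has already been derived for each patient; the function name `mn_tier` and the label dictionary are hypothetical and are not part of the State of Minnesota's or the ACG software's tooling.

```python
# Hypothetical helper: maps a precomputed count of major condition
# categories to the 5 complexity tiers described in the text.
TIER_LABELS = {0: "low", 1: "basic", 2: "intermediate", 3: "extended", 4: "complex"}


def mn_tier(condition_count: int) -> int:
    """Return the MN complexity tier (0 to 4) for a count of major conditions."""
    if condition_count < 0:
        raise ValueError("condition_count must be non-negative")
    if condition_count == 0:
        return 0  # low: 0 conditions
    if condition_count <= 3:
        return 1  # basic: 1 to 3 conditions
    if condition_count <= 6:
        return 2  # intermediate: 4 to 6 conditions
    if condition_count <= 9:
        return 3  # extended: 7 to 9 conditions
    return 4      # complex: 10 or more conditions


if __name__ == "__main__":
    for count in (0, 2, 5, 8, 12):
        tier = mn_tier(count)
        print(count, tier, TIER_LABELS[tier])
```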
Hierarchical Condition Categories. In 2004, the Centers for Medicare & Medicaid Services (CMS) implemented HCCs to adjust Medicare capitation payments for health expenditure risk of Medicare Advantage Plan (health maintenance organization) enrollees.16 In the HCC model, ICD-9-CM diagnosis codes and demographic data for each patient are aggregated into 70 condition categories that contribute to a single risk score.17 Several studies among Medicare patients have provided evidence that HCC scores for risk adjustment can be effective at predicting hospitalizations.18,19
Elder Risk Assessment Index. The ERA Index was developed to identify patients at risk for hospitalization and ED visits in adults 60 years or older.20 The ERA Index incorporates a weighted score of age, sex, number of hospital days in the prior 2 years, and marital status, as well as selected medical conditions (diabetes, coronary artery disease, congestive heart failure, stroke, chronic obstructive pulmonary disease, and dementia).20 The minimum score on the index is –1 and the maximum score possible is 34.
Chronic Condition Count. The CCC method was derived by Naessens and colleagues21 from a modification of the method of Hwang and colleagues,22 based on the publicly available Agency for Healthcare Research and Quality’s Clinical Classification Software. The CCC method is more comprehensive than most comorbidity counts. The total sum of chronic conditions was grouped into 6 categories: 0, 1, 2, 3, 4, and 5 or more. Comorbidity counts have been shown to be associated with high annual costs as well as persistence in high costs.21
Charlson Comorbidity Index. The Charlson Comorbidity Index, originally derived to classify comorbidities affecting 1-year mortality in cancer patients, sums weights for 17 specific conditions.23 In our study, the sum of Charlson Comorbidity Index counts was used to predict future outcomes. The performance of the Charlson Comorbidity Index in predicting poor outcomes has been assessed in various large populations, and its validity as a prognostic measure of outcomes has been consistently demonstrated.24
Hybrid Model. A care coordination patient enrollment process was developed in our primary care practice based on combining MN Tiering and the ERA score as a hybrid model. Patients eligible for enrollment in care coordination included all patients in MN tier 4 (individuals with 10 or more comorbid conditions) and those in MN tier 3 with ERA scores greater than 10. Although all of the previously mentioned risk-stratification instruments have been validated, it remains unclear how the hybrid method of combining MN Tiering and ERA score will compare with them.
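The hybrid eligibility rule can be expressed as a single Boolean condition. The sketch below assumes that each patient's MN tier and ERA score have already been computed; the function name `eligible_for_care_coordination` is hypothetical and only illustrates the rule as stated in the text.

```python
def eligible_for_care_coordination(mn_tier: int, era_score: int) -> bool:
    """Hybrid rule as described in the text (illustrative only).

    A patient qualifies if they are in MN tier 4, or in MN tier 3
    with an ERA score strictly greater than 10.
    """
    return mn_tier == 4 or (mn_tier == 3 and era_score > 10)


if __name__ == "__main__":
    # (mn_tier, era_score) pairs are made-up inputs for illustration.
    for tier, era in [(4, 2), (3, 11), (3, 10), (2, 20)]:
        print(tier, era, eligible_for_care_coordination(tier, era))
```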
Several outcome measures could be used as surrogates for patients who may benefit from care coordination. We modeled 4 binary outcomes reflecting high utilization within the 12-month prediction period of 2010: (1) any inpatient hospitalization, (2) ED visits not resulting in hospitalizations, (3) any readmission within 30 days of an initial hospitalization, and (4) being a high-cost user. A high-cost user was defined using the threshold of the top 10% of users in 2010 (>$7457). We chose to focus on the top 10% of users because they consume almost 70% of total costs, but are less influenced by catastrophic conditions and potentially more amenable to the effects of care coordination than patients at a higher threshold. These outcomes are utilization measures believed to be influenced by effective care coordination.25 ED visits resulting in a hospitalization were recorded as a single inpatient hospitalization event. Total costs reflect a provider perspective and were based on standardized inflation-adjusted cost estimates for each service or procedure provided within our healthcare system (not including outpatient pharmacy services) in 2010 constant Medicare dollars.26
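One way to derive a top-10% cost threshold and the resulting binary high-cost flag is sketched below. This is an assumption about how such a cutoff could be computed, not the study's actual procedure; the function name `high_cost_threshold` and the sample costs are hypothetical.

```python
def high_cost_threshold(costs, top_percent=10):
    """Return a cutoff such that patients with cost strictly above it
    form (at most) the top `top_percent` percent of the cohort.

    With tied costs at the cutoff, fewer patients than the nominal
    share may be flagged.
    """
    if not costs:
        raise ValueError("costs must not be empty")
    ranked = sorted(costs, reverse=True)
    k = len(ranked) * top_percent // 100   # number of patients to flag
    k = min(k, len(ranked) - 1)            # keep the index in range
    return ranked[k]                       # highest cost that is NOT flagged


if __name__ == "__main__":
    # Made-up annual costs for 20 patients (illustration only).
    costs = list(range(1, 21))
    cutoff = high_cost_threshold(costs)
    high_cost = [c > cutoff for c in costs]  # binary outcome per patient
    print(cutoff, sum(high_cost))
```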
We applied logistic regression models to compare the 6 risk-stratification methods to determine which method best predicted the dichotomous outcomes in the subsequent year. Model performance was based on explanatory power and goodness of fit. Explanatory power was assessed by using the C statistic with 95% confidence intervals to predict (1) hospitalizations, ED visits, 30-day readmissions, and high-cost users and (2) the ability of each model to identify individuals with the outcomes of interest in the highest and lowest predicted deciles. The C statistic is a measure of model discrimination and is equivalent to the area under the receiver operating characteristic curve.27 To assess goodness of fit, we compared the observed and predicted hospitalizations, ED visits, readmissions, and high-cost users in the lowest and highest deciles on the basis of predicted probabilities.28 To address calibration, we further focused our assessment on those patients at the highest end of each risk score to identify which patients need care coordination. Since the hybrid model was used clinically in 2011, we used the number of patients considered for PCMH enrollment as a target number to establish the threshold for potential care coordination (based on the highest estimated probability of hospitalizations). We then determined (1) how much overlap in identification occurred in each of the 7 methods and (2) how much of the total utilization the 6 other approaches would have identified. Analyses were conducted with SAS software, version 9.1 (SAS Institute Inc, Cary, North Carolina). The study was approved by the Mayo Clinic Institutional Review Board. No external funding was obtained.
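Because the C statistic equals the area under the receiver operating characteristic curve, it can be computed directly as the proportion of (event, non-event) patient pairs in which the patient with the event received the higher predicted probability, with ties counted as one-half. The following is a minimal pure-Python sketch of that definition; it is not the SAS procedure used in the study, and the function name `c_statistic` and the example values are illustrative assumptions.

```python
def c_statistic(y_true, y_score):
    """Concordance (C) statistic for a binary outcome.

    y_true:  iterable of 0/1 outcomes
    y_score: iterable of predicted probabilities or risk scores
    """
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event patient ranked higher
            elif e == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))


if __name__ == "__main__":
    # Made-up outcomes and predicted probabilities for 4 patients.
    print(c_statistic([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise loop is quadratic in cohort size; for a cohort as large as the one studied here, a rank-based formulation or a statistical package would be the practical choice.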
The study population included 83,187 patients who met inclusion criteria between January 1, 2009, and December 31, 2010. The mean age of the base population was 46.9 years, 54.6% were female, 63.1% had private insurance, and 21.8% had Medicare and/or Medicaid coverage (Table 1). Table 1 shows the frequency distributions of the demographic characteristics at the end of 2009, as well as the percentage of paneled patients with selected chronic diseases.
Healthcare utilization and resource use for the base and prediction years were similar (Table 1). Approximately 8% of patients had a hospitalization, and 13% of the cohort had an ED visit. Compared with the base year 2009, the mean total cost in 2010 was nearly the same, decreasing by 1%. We saw the expected concentration of healthcare services among a relatively small number of individuals (Figure 1). Overall, 32.4% of the most expensive 10% of patients in 2009 were also in the top 10% of patients in 2010. Furthermore, our outcomes of interest were correlated, though not perfectly. High cost was associated with each of the utilization measures, with Pearson correlations ranging from 0.22 for ED visits to 0.72 for hospitalizations. The correlations between utilization measures ranged from 0.12 between ED visits and readmission to 0.35 between hospitalization and readmission.
As shown in Table 2, the ACG model outperformed the other 5 models in predicting hospitalizations, with a C statistic range of 0.67 (CMS-HCC) to 0.73 (ACG). The ERA score and MN Tiering followed close behind ACG for prediction of hospitalization (C statistic = 0.71). In the models predicting ED visits, the C statistic ranged from 0.58 (CMS-HCC) to 0.67, with the ACG model having the best predictive ability. Further, the ACG model outperformed other models when predicting 30-day readmissions; the C statistic ranged from 0.74 (CMS-HCC) to 0.81 (ACG). When predicting healthcare expenditures for the top 10% high-cost users, the performance of the ACG model was good (C statistic = 0.76) and superior to that of other models. CMS-HCC had the lowest predictive ability for all 4 outcomes. It is important to point out that although ACGs had the best predictive ability, much of the variability in outcomes was unexplained by any model. For each outcome, models with higher C statistics also had higher rates of actual events among the highest deciles. For example, the top decile for ACG models had 27% with a hospitalization and 31% with at least 1 ED visit, whereas the top decile for CMS-HCC models had 25% with a hospitalization and 23% with an ED visit. There was more potential discrimination with ACG.
To further evaluate the 6 methods, we compared the patients in the top decile of predicted probability of having a hospitalization (Figure 2). The ERA method tended to overpredict for those in the top decile, whereas the CMS-HCC method underpredicted. For the ACG, MN Tiering, and CCC methods, the actual and predicted hospitalizations were nearly equivalent. Using the model based on ACGs, 26.8% of patients in the highest decile were hospitalized; using the MN Tiering model, 25.1% in the highest decile were hospitalized; and using the Charlson Comorbidity Index, only 22.9% in the highest decile were hospitalized. Similar results were seen for the other 3 outcomes (results are provided in eAppendices A, B, and C, available at www.ajmc.com).
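The top-decile comparison of actual and predicted events can be sketched as follows. This assumes each patient has a model-predicted probability and an observed 0/1 outcome; the function name `top_decile_calibration` and the example data are hypothetical, and the decile is taken simply as the highest-ranked tenth of patients.

```python
def top_decile_calibration(y_true, y_prob):
    """Return (observed event rate, mean predicted probability)
    among patients in the highest decile of predicted probability."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # Rank patients from highest to lowest predicted probability.
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: pair[0], reverse=True)
    k = max(1, len(ranked) // 10)          # size of the top decile
    top = ranked[:k]
    observed = sum(y for _, y in top) / k  # actual event rate
    predicted = sum(p for p, _ in top) / k # mean predicted probability
    return observed, predicted


if __name__ == "__main__":
    # Made-up data for 20 patients: the top decile holds 2 patients.
    probs = [0.9, 0.7] + [0.1] * 18
    outcomes = [1, 0] + [0] * 18
    obs, pred = top_decile_calibration(outcomes, probs)
    # predicted > observed would indicate overprediction in the top decile
    print(obs, pred)
```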
A total of 2347 (2.8%) patients were identified as meriting care coordination based on the hybrid clinical approach (Table 3). At least 40% of these base patients were in the selected top group of patients, irrespective of which risk method was used. Interestingly, our initial clinical implementation actually identified the patients with the highest number of hospitalizations, the highest percentage of patients with any ED visit, and the patients with highest total costs.
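The overlap comparison can be illustrated by selecting, for each risk method, the same number of top-ranked patients as the hybrid approach identified and then measuring what share of the hybrid group falls in that set. The sketch below is an assumption about one way to compute this; the function names `top_n_ids` and `overlap_share` and the patient identifiers are hypothetical.

```python
def top_n_ids(scores, n):
    """Return the IDs of the n patients with the highest scores.

    scores: dict mapping patient ID -> predicted probability or risk score
    """
    ranked = sorted(scores, key=lambda pid: scores[pid], reverse=True)
    return set(ranked[:n])


def overlap_share(reference_ids, candidate_ids):
    """Share of the reference group that also appears in the candidate group."""
    if not reference_ids:
        raise ValueError("reference_ids must not be empty")
    return len(reference_ids & candidate_ids) / len(reference_ids)


if __name__ == "__main__":
    # Made-up scores from one risk method and a made-up hybrid-selected group.
    method_scores = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1}
    hybrid_group = {"a", "c"}
    selected = top_n_ids(method_scores, n=len(hybrid_group))
    print(overlap_share(hybrid_group, selected))
```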
We assessed 6 risk instrument methods based on administrative and demographic data. We evaluated the performance of the 6 models against one another to assess the ability to predict future healthcare utilization. We concluded that the ACGs produced a more accurate prediction of future healthcare utilization relative to the other models.
All risk prediction models for hospitalization had fair predictive value, with ACG having the highest overall predictive C statistic at 0.73 and the HCC model having the lowest predictive C statistic at 0.67. In a previous large study, the ACG had an excellent predictive area under the curve of 0.80.29 An earlier study on the HCC showed an area under the curve of 0.638 for predicting hospitalizations among newly enrolled Medicare patients.16 MN Tiering, CCC, and the ERA all performed similarly; thus, use of each could be justified. These novel and unique findings indicate to both providers and health plans that any of these risk-stratification models can be used for clinical purposes. The individual risk instruments performed in a similar fashion for ED visits, readmissions, and high-cost users as well as hospitalizations, as might be expected, because these outcomes tend to be correlated. The predictive values of the risk-stratification instruments were slightly lower for ED use, but higher for 30-day readmission and high-cost users. These findings provide important information regarding use of newer risk-stratification tools like MN Tiering, ERA, and CCC. The instruments predicted not only hospitalization but also re-hospitalization and ED visits.
Although all rely on the presence of diagnoses and demographic factors, the 6 risk-screening instruments vary with respect to ease of implementation. Unfortunately, the best performing methods, ACG and MN Tiering, are also the only methods we examined that require software licensing. A form of MN Tiering based on manual classification is available.30 CMS-HCC is a software package that can be downloaded from CMS. The algorithms for the ERA Index, CCC, and Charlson Comorbidity Index have all been published and are available, but need to be programmed to be applied for local clinical use.
When we compared top-scoring patients identified by our hybrid model with patients identified by other models, there was substantial overlap, resulting in similar rates of hospitalization, percentages with ED visits, and mean total costs. Of the individuals identified as high risk by the hybrid model, 41% had a hospitalization, compared with a low of 34% of individuals identified as high risk with the Charlson Comorbidity Index. Although our findings suggest that any risk-stratification model has some value in identifying high-risk individuals, ACGs and MN Tiering performed better than Charlson Comorbidity Index or CMS-HCC scores on all 4 outcomes, whereas ERA and CCC scores performed in the middle. Because the Charlson Comorbidity Index was developed to predict 1-year mortality, it might not predict utilization and cost outcomes as well as other instruments. CMS-HCC and ERA scores focus on the Medicare population and the elderly, respectively, and may perform less well with the general adult population.
Because a variety of risk instrument methods are available, our results are important. They can help guide the choice of risk instrument best suited for identifying those patients who may benefit from care coordination or other PCMH interventions. Other risk stratification methods exist, but some proprietary methods were not available for this study.
This study has clear strengths and weaknesses. It utilized an entire population of adult primary care patients who receive their care in an integrated system with hospital and ED access. With study data restricted to provider sources, there is a risk that patients may have received care outside the Mayo Clinic system. This may have lessened the study’s predictive ability; however, those identified as high risk would still be clinically relevant. The potential for this bias to systematically alter the results is small, given that most adults in Mayo Clinic ECH panels receive all of their care at the Mayo Clinic.
Our reliance on provider data precluded the use of outpatient pharmacy data. Pharmacy is an important component of total costs of care. Additionally, pharmacy data may aid predictive models based on medical claims information. However, because we focused on comparing multiple medical claims–based identification methods, this limitation likely caused no bias. Although the Mayo Clinic Rochester medical record system is robust and provides insight into all medical conditions seen by primary care and specialty care providers,31 diagnosis codes were based on billing data, allowing the possibility for miscoding or missing information. This limitation should not systematically favor 1 risk system over another. Lastly, our study is based on analysis of a single region and the population in Olmsted County is largely white and Northern European.32 This may limit the generalizability of the findings to other populations in the United States and around the world. Our results are consistent with those of other studies, but ideally they should be verified in other settings.
Each of these risk-stratification methods has potential advantages and disadvantages. There is considerable overlap across the risk-stratification methods because each system relies on the number of comorbid illnesses that a patient has. In fact, at least 40% of our clinically identified patients (based on combination of MN Tiering and ERA) were among the highest risk patients using any of the risk-stratification methods. However, none of the models explained more than half of the variability in outcomes. This is a clear limitation, and it suggests that other factors could enable better identification of patients who need care coordination in order to reduce the need for hospitalization and other high-cost healthcare services. Future research is needed to determine whether incorporating additional factors (eg, living situation, high-risk medications, lifestyle, patient preferences) would improve prediction of high-risk individuals or would indicate that certain risks call for certain interventions.
We found good concordance among all 6 different risk screening instruments for predicting hospitalization. Use of any of the tools may provide some support for providers and health plans who undertake case management. Focusing care coordination efforts within the medical home on patients likely to benefit most requires appropriate identification of the highest risk, highest utilizing patients. Use of risk screening tools is a promising method.