Medication, Diagnostic, and Cost Information as Predictors of High-Risk Patients in Need of Care Management

The American Journal of Managed Care, January 2009
Volume 15
Issue 1

Predictive models from diagnostic or medication data identify care management candidates who are more amenable to clinical interventions than groups identified using prior cost alone.


Objective: To contrast the advantages and limitations of using medication, diagnostic, and cost data to prospectively identify candidates for care management programs.


Risk scores from prior-cost information and a set of clinically based predictive models (PMs) derived from diagnostic and medication data sources, as well as from a combination of all 3 data sources, were assigned to a national sample of commercially insured, nonelderly adults (n = 2,259,584). Clinical relevance of risk groups and statistical performance using future costs as the outcome were contrasted across the PMs.


Compared with prior cost, diagnostic- and medication-based PMs identified high-risk groups with a higher burden of clinically actionable characteristics. Statistical performance was similar and in some cases better for the clinical PMs compared with prior cost. The best classification accuracy was obtained with a comprehensive model that united diagnostic, medication, and prior-cost risk factors.


Clinically based PMs are a better choice than prior cost alone for programs that seek to identify high-risk groups of patients who are amenable to care management services.

Care management is used to improve the quality and outcomes of care, while reducing costs for individuals with complex health needs. The best screening tools for selecting candidates for these programs remain unclear.

  • Prior cost is commonly used and is a good predictor of future cost.
  • Clinically based “predictive models” are a superior choice to prior-cost information alone as a screening tool for case management.

Care management programs complement conventional healthcare to improve the quality and coordination of services, optimize health, and reduce healthcare costs for individuals with chronic disease, multimorbidity, or other health risks.1-3 These programs generally tier interventions such that patients deemed “low risk” receive usual care or low-intensity services, whereas “high-risk” individuals are given a more intensive package of services.

Segmenting populations by their need for care management requires clinically relevant and statistically valid screening tools. The screener must assign risk scores predictive of future events, identify patients across a continuum of clinically actionable health needs, and be practical to use. The most common data sources for these screening tools, commonly referred to as “predictive models” (PMs), are healthcare claims and encounter data. Healthcare costs often are used as the outcome of interest for PMs.4-6 Cost information is easy to obtain from administrative data and has reasonable predictive accuracy.4,6 However, as a simple measure of resource consumption, cost information fails the clinical relevance criterion for a high-risk screening tool that would be useful for care management.

To overcome this limitation of prior cost, PMs based on morbidity, medication use, and other health risk measures have been developed to provide a potentially rich set of clinical predictors.7-12 This manuscript contrasts a prior-cost PM with others formed using medication and diagnostic data as potential high-risk case identification screeners for care management programs. The empirical evaluation uses PMs from the Adjusted Clinical Group (ACG) risk adjustment system,13 a well-established tool for gaining insight into population morbidity and resource use.14,15 This is the first report to describe the development of the ACG predictive models. Although others have contrasted PMs formed with diagnosis versus medication codes,11 clinically based PMs versus those based on prior cost,4,6 and combined diagnostic and medication markers in the same PM,7,11,16 this article extends this early work by using an approach organized specifically to inform the development of care management programs that seek to use healthcare claims or encounter data to segment a population by health risk.


Data were obtained from the PharMetrics Patient-Centric Database (PharMetrics is a division of IMS Health Incorporated). This database represents the medical and pharmacy claims and enrollment records over the continuum of medical care for approximately 85 geographically diverse health plans (Midwest, 35%; Northeast, 21%; South, 31%; West, 13%).17,18 The data were from health plans that submit data in exchange for comparative benchmarks.

The initial sample from calendar year 2002 included 5,884,632 nonelderly patients. Individuals without pharmaceutical benefits (613,098, or 10% of the total) or with medical and pharmacy benefits for fewer than 6 months (1,461,471, or 25% of the total) were excluded. The sample was further restricted to enrollees with the same health plan for a minimum of 6 months in both 2001 and 2002, which resulted in a final sample of 2,259,584, randomly divided into a 60% development sample and 40% validation sample.
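The random 60%/40% division described above can be illustrated with a minimal sketch. The function name, the fixed seed, and the synthetic member IDs are assumptions made for illustration; the article does not describe how the split was implemented.

```python
import random

def split_development_validation(member_ids, development_fraction=0.6, seed=0):
    """Randomly partition member IDs into development and validation samples.

    The seed is fixed only so that this illustration is reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(member_ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * development_fraction)
    return shuffled[:cut], shuffled[cut:]

# Synthetic member IDs standing in for the study sample.
development, validation = split_development_validation(range(1000))
print(len(development), len(validation))  # 600 400
```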

All diagnosis codes assigned in outpatient or inpatient settings, excluding claims for laboratory and imaging studies, were included as inputs for the PMs. Claims information from outpatient retail pharmacies was used for the medication-based PMs.

We used the health plans’ allowed charges for each service (ie, the total amount the plan would pay for the medication or service, inclusive of patient cost sharing) to calculate prior-cost predictors and healthcare charge outcome variables. The total healthcare charge variable was the sum of pharmacy charges, inpatient facility and professional charges, outpatient facility and professional charges, and ancillary services.

Description of Predictive Models

Multivariable linear regression was used to assign risk scores by regressing year 2 charges on the PM risk factors using the development sample. Beta coefficients served as the risk factor weights. We included persons with 6 to 12 months of enrollment in each of 2 consecutive years, all persons included in our database had equal weight in the analysis, and no adjustments were made for nonusers. An individual’s PM risk score was the sum of all relevant risk factor weights. The mean of the sample’s PM risk scores was centered at 1.0, such that a score of 1.5 indicates future charges 50% higher than expected compared with the sample average, and 0.5 indicates future charges 50% lower than expected compared with the sample average. We estimated PM risk score standard deviations, medians, skewness, and ranges, and used the Spearman rank correlation coefficient to assess risk score correlations.
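The scoring step described above (sum the weights of a person's risk factors, then rescale so the sample mean is 1.0) can be sketched as follows. The risk factor names and dollar weights are invented for illustration and are not ACG coefficients; the sketch also assumes the regression has already been fitted.

```python
# Hypothetical risk-factor weights (regression coefficients in dollars of
# year 2 charges). These values are invented for illustration only.
WEIGHTS = {
    "age_18_34": 1000.0,
    "age_55_64": 2000.0,
    "diabetes": 3000.0,
    "heart_failure": 6000.0,
}

def raw_score(risk_factors):
    """Sum the weights of every risk factor present for one person."""
    return sum(WEIGHTS[factor] for factor in risk_factors)

def relative_scores(population):
    """Rescale raw scores so that the sample mean equals 1.0."""
    raw = [raw_score(person) for person in population]
    mean_raw = sum(raw) / len(raw)
    return [value / mean_raw for value in raw]

# Four synthetic people described by their risk factors.
population = [
    {"age_18_34"},
    {"age_18_34", "diabetes"},
    {"age_55_64"},
    {"age_55_64", "diabetes", "heart_failure"},
]
scores = relative_scores(population)
print([round(s, 2) for s in scores])  # [0.22, 0.89, 0.44, 2.44]
```

In this toy sample, the last person's score of 2.44 would be read as predicted charges about 144% above the sample average.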

We hypothesized that the clinical PMs would identify more patients with chronic conditions in the highest risk groups than a prior-cost model. To test this hypothesis, we examined the clinical characteristics (age, sex, and presence of specific chronic conditions) of patients in a top 1% risk group.

Using the same top 1% risk groups, we evaluated group-level measures of medication use, physician ambulatory visits, hospitalization, and mean pharmacy and total charges. For this set of analyses, we hypothesized that the Rx-PM risk groups would have the highest future medication use, whereas the risk group formed with the combined DxRx-PM model with the prior-cost quantiles would have the highest healthcare charges.
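Forming a top 1% risk group and describing it by the prevalence of a chronic condition can be sketched as below. The function names and the synthetic scores are assumptions for illustration; how ties at the cutoff were handled in the study is not stated, so this sketch simply takes the highest-ranked members.

```python
def top_risk_group(scores_by_member, fraction=0.01):
    """Return the member IDs holding the highest risk scores (top `fraction`)."""
    n_top = max(1, round(len(scores_by_member) * fraction))
    ranked = sorted(scores_by_member, key=scores_by_member.get, reverse=True)
    return set(ranked[:n_top])

def prevalence(group, members_with_condition):
    """Share of the group that has the condition."""
    return len(group & members_with_condition) / len(group)

# Synthetic data: 1000 members whose risk score equals their ID,
# and five members flagged with a hypothetical chronic condition.
scores_by_member = {member: float(member) for member in range(1000)}
chronic = set(range(995, 1000))

top_group = top_risk_group(scores_by_member)
print(len(top_group), prevalence(top_group, chronic))  # 10 0.5
```

Running the same two functions on the scores from each PM would give the kind of side-by-side clinical profile of high-risk groups that the comparison calls for.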

We contrasted the PMs on predictive accuracy measures in which pharmacy and total healthcare charges served as the response variables. The proportion of variation in the dependent variable explained by the predictive models was assessed with the R² statistic. To minimize the effect of extreme outliers, we capped the response variables at maximums of $20,000 and $50,000 for pharmacy and total charges, respectively. Individuals with expenditures higher than these thresholds were assigned one of these top values. Truncation of expenditures provides more robust and stable estimates than using the raw dollars, and it reflects common re-insurance practices of health plans. Classification accuracy was examined using logistic regression in which the dependent variable was assignment to the top 1% of the population in terms of 2002 (year 2) charges (yes or no). Model fit is presented as the C statistic, which ranges from 0.5 (model no better than the flip of a coin) to 1.0 (perfect true positive and true negative classification).
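The three accuracy ingredients above (capping, the proportion of variance explained, and the C statistic) can be sketched in a few lines. All data here are invented. The C statistic is computed directly from its pairwise-concordance definition rather than from a fitted logistic regression, which is a simplification of the procedure the article describes.

```python
def cap(values, maximum):
    """Truncate each charge at `maximum` (eg, 50000 for total charges)."""
    return [min(value, maximum) for value in values]

def r_squared(observed, predicted):
    """Proportion of variance in `observed` explained by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def c_statistic(scores, is_high_cost):
    """Probability that a randomly chosen high-cost member has a higher
    score than a randomly chosen other member; ties count one half."""
    positives = [s for s, flag in zip(scores, is_high_cost) if flag]
    negatives = [s for s, flag in zip(scores, is_high_cost) if not flag]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Invented year 2 total charges and model predictions for five members.
total_charges = [500.0, 1200.0, 3000.0, 8000.0, 120000.0]
capped = cap(total_charges, 50000.0)
predicted = [1000.0, 1500.0, 2500.0, 9000.0, 40000.0]
fit = r_squared(capped, predicted)

# Invented risk scores and top-group membership flags.
concordance = c_statistic([0.1, 0.4, 0.35, 0.8], [False, False, True, True])
print(capped[-1], concordance)  # 50000.0 0.75
```

The pairwise loop is quadratic in sample size, so it suits small illustrations only; a rank-based formula would be the usual choice for a sample of millions.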

We performed multisample validation using 200 samples of 225,000 cases randomly drawn with replacement from the validation sample. Findings from these bootstrapping analyses showed tight intervals between the 2.5th and 97.5th percentiles, demonstrating good stability of predictive accuracy statistics.
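A bootstrap percentile interval of the kind described above can be sketched as follows. The function name, the nearest-rank percentile rule, the seed, and the toy data are assumptions for illustration; the statistic here is a simple mean, whereas the study bootstrapped its predictive accuracy statistics.

```python
import random

def bootstrap_interval(cases, statistic, n_samples=200, sample_size=None, seed=0):
    """Return the 2.5th and 97.5th percentiles of `statistic` across
    samples drawn with replacement from `cases`."""
    rng = random.Random(seed)  # fixed seed so the illustration is reproducible
    size = sample_size or len(cases)
    estimates = sorted(
        statistic(rng.choices(cases, k=size)) for _ in range(n_samples)
    )
    # Simple nearest-rank percentiles; other interpolation rules exist.
    lower = estimates[int(0.025 * (n_samples - 1))]
    upper = estimates[int(0.975 * (n_samples - 1))]
    return lower, upper

def mean(values):
    return sum(values) / len(values)

# Toy validation sample whose true mean is 4.5.
cases = [float(i % 10) for i in range(1000)]
lower, upper = bootstrap_interval(cases, mean)
print(lower, upper)
```

A narrow gap between `lower` and `upper` is what the text means by a tight interval.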


The purpose of this study was to conduct empirical assessments of alternative PMs that identify high-risk patients to make recommendations on what type of model best suits the needs of a given care management program. Compared with prior cost alone, the clinically based PMs derived from demographic, diagnostic, and medication data identify high-risk groups with more chronic disease, greater medication usage and polypharmacy, and higher healthcare costs. Another study using Dx-PM among elders found that it identified a high-risk group that had a greater burden of chronic disease, more functional limitations, and poorer perceived health than the lower-risk counterparts.21 Clinically based PMs are likely to provide greater yield per resources expended in care management programs than prior cost alone, because they are more accurate at identifying high-cost risk groups and selecting high-risk individuals who are more amenable to clinical interventions. Although beneficial effects of care management have not been consistently demonstrated,22 preliminary evidence from an intensive nurse-based intervention for high-risk elders appears to show great promise in terms of cost reduction23 and better quality of care.24

Multimorbidity, another clinically actionable characteristic, was extremely common in the high-risk groups identified. The mean number of chronic conditions among individuals in the top 1% high-risk groups formed by the clinical PMs ranged from 3 to 4. Compared with the general population, individuals in the high-risk groups had approximately 6 times the burden of chronic disease. This level of multimorbidity suggests that the most effective care management programs will need to address service provision for multiple conditions.

The Centers for Medicare & Medicaid Services has several ongoing pilot projects evaluating the impact of care management for Medicare beneficiaries. Our analyses did not include individuals older than age 65 years, although other work using Dx-PM has focused specifically on this age group.21,23,24 The direction and relative magnitude of effects that we observed are likely to apply to older populations, which have higher burdens of chronic disease than the working population in our sample.

The PharMetrics database we used for these analyses has had limited use by researchers. Recognizing this limitation, we performed extensive evaluations of data quality (ie, completeness, valid data ranges, known group differences). These analyses supported the validity of the data source. Furthermore, the predictive accuracy statistics that we found are similar in magnitude to those reported in objective evaluations of alternative PMs.12

Over the last 2 decades there has been an increasing reliance on diagnosis information from claims submitted for hospital and physician payment. Prior research25-27 has indicated that the specificity of diagnoses found in claims is good, even if the sensitivity is moderate. In other words, not all persons with a disease are detected by diagnoses recorded in claims; however, when a diagnosis is present, the chances that the person truly has the disorder are high. More accurate capture of information about patients’ disorders from electronic health records will improve the classification accuracy of diagnostic information, which will improve the predictive performance of PMs. Additional data types from electronic health records such as disease severity, laboratory values, imaging results, social history, and genomics will usher in a new generation of PMs that will further refine the ability of care managers and health professionals to identify proactively those patients with the greatest healthcare needs.
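The distinction drawn above between sensitivity and specificity can be made concrete with a small worked example. The counts are invented and do not come from the cited validation studies. The "chances that the person truly has the disorder" when a diagnosis is recorded correspond to the positive predictive value, which tends to be high when specificity is high and the condition is not too rare.

```python
def claims_accuracy(true_pos, false_pos, false_neg, true_neg):
    """Accuracy of claims diagnoses against a reference standard
    such as the medical record."""
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        "ppv": true_pos / (true_pos + false_pos),
    }

# Invented counts for 1000 patients, 100 of whom truly have the condition.
metrics = claims_accuracy(true_pos=60, false_pos=10, false_neg=40, true_neg=890)
print({name: round(value, 3) for name, value in metrics.items()})
# {'sensitivity': 0.6, 'specificity': 0.989, 'ppv': 0.857}
```

In this toy case, claims miss 40% of true cases (moderate sensitivity), yet about 86% of recorded diagnoses are correct.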

Validation analyses have been performed with pharmacy insurance claims as well. The degree of accuracy of the types of drugs reimbursed by the insurance system is believed to be high based on comparisons to date,28,29 although actual adherence to medication regimens cannot be assessed with retail pharmacy medication claims.

Implications for Care Management Programs

2. Counsell SR, Callahan CM, Clark DO, et al. Geriatric care management for low-income seniors: a randomized controlled trial. JAMA. 2007;298(22):2623-2633.

4. Meenan RT, Goodman MJ, Fishman PA, Hornbrook MC, O’Keefe-Rosetti MC, Bachman DJ. Using risk-adjustment models to identify high-cost risk. Med Care. 2003;41(11):1301-1312.

6. Zhao Y, Ash AS, Haughton J, McMillan B. Identifying future high-cost cases through predictive modeling. Dis Manag Health Outcomes. 2003;11(6):389-397.

8. Fishman PA, Goodman MJ, Hornbrook MC, et al. Risk adjustment using automated ambulatory pharmacy data: the RxRisk model. Med Care. 2003;41(1):84-99.

10. Powers CA, Meyer CM, Roebuck MC, Vaziri B. Predictive modeling of total healthcare costs using pharmacy claims data: a comparison of alternative econometric cost modeling techniques. Med Care. 2005;43(11):1065-1072.

12. Winkelman R, Mehmud S. A comparative analysis of claims-based tools for health risk assessment. Society of Actuaries sponsored research project, April 20, 2007. http://www.soa.org/research/health/hlth-risk-assement.aspx. Accessed February 27, 2008.

14. Starfield B, Weiner JP, Mumford L, Steinwachs D. Ambulatory care groups: a categorization of diagnoses for research and management. Health Serv Res. 1991;26(1):53-74.

16. Schneeweiss S, Seeger JD, Maclure M, Wang PS, Avorn J, Glynn RJ. Performance of comorbidity scores to control for confounding in epidemiologic studies using claims data. Am J Epidemiol. 2001;154(9):854-864.

18. International Society for Pharmacoeconomics and Outcomes Research. PharMetrics Patient-Centric Database. http://www.ispor.org/DigestOfIntDB/Default.aspx?rcd=426. Accessed February 27, 2008.

20. Weiner JP. Updating and calibrating the Johns Hopkins ACG risk adjustment methods for application to Medicare risk contracting. Johns Hopkins University research contract #500-98-0002 to the Health Care Financing Administration, February 2000.

22. Luck J, Parkerton P, Hagigi F. What is the business case for improving

23. Sylvia ML, Griswold M, Dunbar L, Boyd CM, Park M, Boult C. Guided care: cost and utilization outcomes in a pilot study. Dis Manag. 2008;11(1):29-36.

25. Maclean JR, Fick DM, Hoffman WK, et al. Comparison of 2 years for clinical practice profiling in diabetic care: medical records versus claims and administrative data. Am J Manag Care. 2002;8(2):175-179.

Medicaid claims to medical records: a reliability assessment. Am J Med Qual. 1998;13(2):63-69.

28. Tamblyn R, Lavoie G, Petrella L, Monette J. The use of prescription claims databases in pharmacoepidemiological research: the accuracy and comprehensiveness of the prescription claims database in Quebec. J Clin Epidemiol. 1995;48(8):999-1009.

drug claims in the Ontario drug benefit database. Can J Clin Pharmacol. 2003;10(2):67-71.
