The American Journal of Managed Care July 2017
Predicting High-Cost Privately Insured Patients Based on Self-Reported Health and Utilization Data
Peter J. Cunningham, PhD

The results of this study show that patient-reported data on health and healthcare can be useful in predicting high-cost patients when claims data for prior years are not available.
About one-fifth of autoworkers reported little or no physical activity in a typical week, 23.2% were smokers, and 42% were classified as obese. The percentage with high costs in 2013 was higher among those who had no physical activity (compared with those who were physically active), among smokers (compared with nonsmokers), and among obese workers (compared with nonobese workers).

Most autoworkers (91.3%) had a usual source of care, 12.5% reported an inpatient stay in 2012, and 7.4% reported 2 or more visits to the ED in 2012. Autoworkers with self-reported inpatient and ED use were much more likely to have high healthcare costs in 2013 compared with those with no self-reported hospital use.

Models Predicting High-Cost Patients

Table 2 compares the performance of different models predicting high-cost patients for 2013. Model 1 includes 2012 expenditures, the comorbidity index, age, gender, and race/ethnicity. This model had a C statistic of 0.78, a pseudo R2 of 17.8%, and a discrimination slope of 0.209. Models 2 to 5 include only patient-reported information from the surveys. Model 2 includes patient-reported demographics, education, and income; model 3 adds self-reported chronic conditions; model 4 adds health status and health behavior measures; and model 5 adds self-reported inpatient and ED use. The results for the C statistic, pseudo R2, and discrimination slope are consistent in that they show that 1) adding self-reported health, health behaviors, and utilization (models 3-5) substantially improves predictions of high healthcare costs compared with the model that includes only demographics and socioeconomic status, and 2) models based on survey measures have high predictive power (model 5, C statistic = 0.73), but not quite as high as the model that includes claims-based measures and demographic characteristics (model 1, C statistic = 0.78).
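The two discrimination measures reported above can be illustrated with a minimal Python sketch. It assumes the standard definitions: the C statistic as the share of (high-cost, not-high-cost) pairs in which the high-cost person has the higher predicted risk, and the discrimination slope as the difference in mean predicted risk between the two groups. The function names and the toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence


def _split_by_outcome(y_true: Sequence[int], risk: Sequence[float]):
    """Separate predicted risks into high-cost (1) and not-high-cost (0) groups."""
    pos = [r for y, r in zip(y_true, risk) if y == 1]
    neg = [r for y, r in zip(y_true, risk) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each outcome group")
    return pos, neg


def c_statistic(y_true: Sequence[int], risk: Sequence[float]) -> float:
    """Share of concordant (high-cost, not-high-cost) pairs; ties count as half."""
    pos, neg = _split_by_outcome(y_true, risk)
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def discrimination_slope(y_true: Sequence[int], risk: Sequence[float]) -> float:
    """Mean predicted risk among high-cost cases minus mean among the rest."""
    pos, neg = _split_by_outcome(y_true, risk)
    return sum(pos) / len(pos) - sum(neg) / len(neg)


# Toy (made-up) outcomes and predicted risks for five people.
toy_outcome = [1, 1, 0, 0, 0]
toy_risk = [0.8, 0.4, 0.5, 0.2, 0.1]
print(c_statistic(toy_outcome, toy_risk))
print(discrimination_slope(toy_outcome, toy_risk))
```

The pairwise loop is quadratic in sample size, which is acceptable for a sample of a few thousand; a rank-based formula would be preferable for much larger data.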

Table 3 shows measures of sensitivity, specificity, PPV, and NPV for the 3 best-performing models (models 1, 5, and 6), computed at both the 50th and 75th percentile risk thresholds. The most noteworthy finding from these results is that measures of sensitivity (ie, the percentage of individuals who had high costs in 2013 who were accurately predicted by the models) are low, relative to similar studies of the Medicaid population.12,13 In fact, few high-cost cases for 2013 were predicted accurately based on the 75th percentile threshold. Models that include claims data (models 1 and 6) perform better on sensitivity compared with the model that includes only survey data (model 5). The model that includes both claims-based and survey variables (model 6) performed the best on sensitivity at the 75th percentile risk threshold.
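As a rough illustration of how the four measures in Table 3 relate to a risk threshold, the following Python sketch flags individuals whose predicted risk exceeds a cutoff and tabulates the results. It assumes that the threshold is a percentile of the predicted-risk distribution (nearest-rank method); if a fixed predicted probability was used instead, that value can be passed directly as the cutoff. All names and data are hypothetical.

```python
import math
from typing import Dict, Sequence


def percentile_cutoff(risk: Sequence[float], percentile: float) -> float:
    """Nearest-rank percentile of the predicted risks (0 < percentile <= 100)."""
    if not risk or not 0 < percentile <= 100:
        raise ValueError("need data and a percentile in (0, 100]")
    ordered = sorted(risk)
    rank = math.ceil(percentile / 100 * len(ordered))
    return ordered[rank - 1]


def screening_metrics(
    y_true: Sequence[int], risk: Sequence[float], cutoff: float
) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV when risk > cutoff is flagged."""
    tp = fp = tn = fn = 0
    for y, r in zip(y_true, risk):
        flagged = r > cutoff
        if flagged and y == 1:
            tp += 1
        elif flagged:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # high-cost cases correctly flagged
        "specificity": ratio(tn, tn + fp),  # non-high-cost cases correctly not flagged
        "ppv": ratio(tp, tp + fp),          # flagged cases that were high-cost
        "npv": ratio(tn, tn + fn),          # unflagged cases that were not high-cost
    }


# Toy (made-up) data for eight people.
toy_risk = [0.9, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
toy_outcome = [1, 0, 1, 0, 0, 1, 0, 0]
for pct in (50, 75):
    print(pct, screening_metrics(toy_outcome, toy_risk, percentile_cutoff(toy_risk, pct)))
```

In the toy data, raising the threshold from the 50th to the 75th percentile flags fewer people, so sensitivity falls while specificity rises, which is the trade-off the paragraph above describes.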

Importance of Individual Self-Reported Health Measures in Predicting High-Cost Patients

Table 4 shows the marginal probabilities computed from the model with survey-only variables (model 5). The probability of being a high-cost patient in 2013 was significantly higher among older (relative to younger) individuals, as well as among those with diabetes, COPD, arthritis, depression, and cancer; among those in good, fair, or poor self-reported health (compared with excellent or very good); and among those with work limitations due to health. Those with self-reported hospital stays and ED visits in 2012 were more likely to have high costs in 2013 compared with those with no hospital use.
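The article does not describe how the marginal probabilities were computed. One common approach, shown here only as a sketch under that assumption, is the average marginal effect of a binary predictor in a logistic model: for each person, the predicted probability is computed with the indicator set to 1 and to 0, and the differences are averaged. The coefficients, variable names, and data below are hypothetical and are not estimates from the study.

```python
import math
from typing import Dict, List


def predicted_probability(
    coefs: Dict[str, float], intercept: float, row: Dict[str, float]
) -> float:
    """Logistic-model probability of being high-cost for one person."""
    z = intercept + sum(beta * row[name] for name, beta in coefs.items())
    return 1.0 / (1.0 + math.exp(-z))


def average_marginal_effect(
    coefs: Dict[str, float],
    intercept: float,
    rows: List[Dict[str, float]],
    indicator: str,
) -> float:
    """Mean change in predicted probability when `indicator` goes from 0 to 1,
    holding each person's other characteristics at their observed values."""
    diffs = []
    for row in rows:
        p_with = predicted_probability(coefs, intercept, {**row, indicator: 1.0})
        p_without = predicted_probability(coefs, intercept, {**row, indicator: 0.0})
        diffs.append(p_with - p_without)
    return sum(diffs) / len(diffs)


# Hypothetical coefficients and a tiny made-up sample.
toy_coefs = {"diabetes": 0.6, "inpatient_stay_2012": 0.9}
toy_intercept = -1.5
toy_rows = [
    {"diabetes": 1.0, "inpatient_stay_2012": 0.0},
    {"diabetes": 0.0, "inpatient_stay_2012": 1.0},
    {"diabetes": 0.0, "inpatient_stay_2012": 0.0},
]
print(average_marginal_effect(toy_coefs, toy_intercept, toy_rows, "diabetes"))
```

Because the logistic function is nonlinear, the same coefficient produces a different change in probability for each person, which is why the effect is averaged over the sample.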


The results of this analysis show that self-reported information on health, health behaviors, and healthcare use commonly obtained through HNAs or HRAs is a reasonably good predictor of future healthcare costs for a privately insured population in the absence of claims or EHRs. Although the models with only self-reported measures do not perform quite as well as models that include claims-based information on spending and morbidity, the results are similar to those of studies that examined the usefulness of self-reported measures in predicting high-cost patients among Medicaid beneficiaries.12,13

Although error in patient-reported data is a longstanding concern, one advantage of such data is that they are less susceptible than claims or EHR data to “upcoding,” or the tendency of some plans and providers to code patient diagnoses aggressively so that patients appear sicker and payment is maximized. A recent study found that risk-adjustment scores based on claims data were significantly higher for enrollees in Medicare Advantage health plans (which are paid by the federal government partly on the basis of risk scores) than they would be if the enrollees were in fee-for-service plans.17 Similar risk-adjustment methods are used in the ACA’s federal and state marketplaces: higher rates are paid to plans with sicker enrollees, funded in part through lower rates paid to plans with healthier enrollees. Self-reported health information from patients or plan enrollees is less susceptible to such bias.

Nevertheless, there are no perfect predictors of which patients will be high-cost in the future. Only about half of the study sample who had high healthcare costs in 2012 also had high healthcare costs in 2013, based on the definition used in this study. This may reflect greater variability in this study's sample than in those of other studies, either due to the relatively small sample size (n = 3983) or because the 25% of the study sample with high healthcare costs was more heterogeneous than a similarly defined group for the Medicaid population. Regardless, the best measures used in this study to predict high-cost patients still have a relatively high rate of error.

The success of innovative care delivery models that focus on care management for high-cost patients depends, in part, on whether the additional resources needed for more intensive care management result in greater cost savings in the long run by preventing unnecessary or avoidable utilization. The key to this success is the efficient targeting of patients who will incur high healthcare costs unless diverted into care management programs. If such targeting includes a large number of patients who will not incur high costs even without the intervention, the effectiveness of care management practices in reducing healthcare costs may be greatly diminished. The high scores for specificity in this study (ie, the proportion of non–high-cost cases in 2013 that were accurately identified as such) suggest the models estimated in this study would be relatively successful in avoiding the delivery of costly case management or other specialized services to patients who would not benefit from them. On the other hand, the relatively low sensitivity scores suggest that a large percentage of patients who would potentially benefit from these services may not be selected to receive them and therefore would be at higher risk for incurring higher costs.


There are several limitations to this analysis that should be noted. First, the sample is limited to US autoworkers and therefore may not be generalizable to other privately insured populations. Predicting high costs for the autoworker population, which tends to be older and have a high prevalence of chronic conditions, may be quite different from predicting them for a younger and healthier population. In addition, the small sample size, compared with those in other studies that examined self-reported measures, may lead to lower precision in the predictive ability of self-reported measures than if a larger sample had been available. Also, the results may differ for specific conditions, which is important because many disease management programs are designed to improve quality of care and lower costs for specific diseases (eg, diabetes) rather than for high-risk patients in general.


Information from health needs assessments or health risk appraisals is increasingly used for a variety of purposes to improve delivery of care, but little is known as to how effective it could be in targeting privately insured patients who are likely to incur high healthcare costs. The results from this study indicate that self-reported information on health conditions, health status, and healthcare use can be useful in predicting high healthcare costs when prior-year claims or medical records are not available.


The following individuals reviewed an earlier version of this draft and provided comments. Written permission has been obtained from all individuals for including them in the acknowledgments. None of these individuals received any compensation for reviewing the manuscript: Paul Ginsburg, PhD, professor of Public Policy, University of Southern California; and Alwyn Cassil, principal, Policy Translation, LLC. In addition, Joel Smith of Mathematica Policy Research, Inc, Washington, DC, provided the programming for statistical analysis.

Author Affiliations: Department of Health Behavior and Policy, School of Medicine, Virginia Commonwealth University (PJC), Richmond, VA.

Source of Funding: National Institute for Health Care Reform.

Author Disclosures: The authors report no relationship or financial interest with any entity that would pose a conflict of interest with the subject matter of this article.

Authorship Information: Concept and design; acquisition of data; analysis and interpretation of data; drafting of the manuscript; critical revision of the manuscript for important intellectual content; statistical analysis; provision of patients or study materials; obtaining funding; administrative, technical, or logistic support; and supervision.

Address Correspondence to: Peter J. Cunningham, PhD, School of Medicine, Virginia Commonwealth University, 830 E Main St, 4th Fl, Richmond, VA 23298-0430. E-mail: 

1. Cucciare MA, O’Donohue W. Predicting future healthcare costs: how well does risk-adjustment work? J Health Organ Manag. 2006;20(2-3):150-162.

2. Ash AS, Zhao Y, Ellis RP, Schlein Kramer M. Finding future high-cost cases: comparing prior cost versus diagnosis-based methods. Health Serv Res. 2001;36(6, pt 2):194-206.

3. Levine SH, Adams J, Attaway K, et al. Predicting the financial risks of seriously ill patients. California HealthCare Foundation website. Published December 2011. Accessed March 4, 2016.

4. Cunningham PJ. Few Americans switch employer health plans for better quality, lower costs. National Institute for Health Care Reform website. Published January 2013. Accessed July 25, 2016.

5. Sung I. How is health reform impacting insurance switching patterns? The Health Care Blog website. Published July 17, 2015. Accessed July 23, 2016.

6. Lafata JE, Shay LA, Brown R, Street RL. Office-based tools and primary care visit communication, length, and preventive service delivery. Health Serv Res. 2016;51(2):728-745. doi: 10.1111/1475-6773.12348. 

7. Leininger L, Avery K. The capacity of self-reported health measures to predict high-need Medicaid enrollees. State Health Access Data Assistance Center website. Published February 2015. Accessed March 4, 2016.

8. Perrin NA, Stiefel M, Mosen DM, Bauck A, Shuster E, Dirks EM. Self-reported health and functional status information improves prediction of inpatient admissions and costs. Am J Manag Care. 2011;17(12):e472-e478.

9. Fleishman JA, Cohen JW, Manning WG, Kosinski M. Using the SF-12 health status measure to improve predictions of medical expenditures. Med Care. 2006;44(suppl 5):I54-I63.

10. Fleishman JA, Cohen JW. Using information on clinical conditions to predict high-cost patients. Health Serv Res. 2010;45(2):532-552. doi: 10.1111/j.1475-6773.2009.01080.x.

11. DeSalvo KB, Jones TM, Peabody J, et al. Health care expenditure prediction with a single item, self-rated health measure. Med Care. 2009;47(4):440-447. doi: 10.1097/MLR.0b013e318190b716.

12. Leininger LJ, Friedsam D, Voskuil K, DeLiere T. Predicting high-need cases among new Medicaid enrollees. Am J Manag Care. 2014;20(9):e399-e407.

13. Wherry LR, Burns ME, Leininger LJ. Using self-reported health measures to predict high-need cases among Medicaid-eligible adults. Health Serv Res. 2014;49(suppl 2):2147-2172. doi: 10.1111/1475-6773.12222.

14. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40(5):373-383.

15. Pencina MJ, D’Agostino RB Sr. Evaluating discrimination of risk prediction models: the C statistic. JAMA. 2015;314(10):1063-1064. doi: 10.1001/jama.2015.11082.

16. Cox DR, Snell EJ. The Analysis of Binary Data. 2nd ed. London: Chapman and Hall; 1989.

17. Geruso M, Layton T. Upcoding: evidence from Medicare on squishy risk adjustment. National Bureau of Economic Research website. Published May 2015. Accessed March 21, 2016.  