Typical health plan data provide limited information for benchmarking physician performance using even a less stringent rule for attributing patient measures to physicians.
Objective: To evaluate measurement of physician quality performance, which is increasingly used by health plans as the basis of quality improvement, network design, and financial incentives, despite concerns about data and methodological challenges.
Study Design: Evaluation of health plan administrative claims and enrollment data.
Methods: Using administrative data from 9 health plans, we analyzed results for 27 well-accepted quality measures and evaluated how many quality events (patients eligible for a measure) were available per primary care physician and how different approaches for attributing patients to physicians affect the number of quality events per physician.
Results: Fifty-seven percent of primary care physicians had at least 1 patient who was eligible for at least 1 of the selected quality measures. Most physicians had few quality events for any single measure. As an example, for a measure evaluating appropriate treatment for children with upper respiratory tract infections, physicians on average had 14 quality events when care was attributed to physicians if they saw the patient at least once in the measurement year. The mean number of quality events dropped to 9 when attribution required that the physician provide care in at least 50% of a patient’s visits. Few physicians had more than 30 quality events for any given measure.
Conclusions: Available administrative data for a single health plan may provide insufficient information for benchmarking performance for individual physicians. Efforts are needed to develop consensus on assigning measure accountability and to expand information available for each physician, including accessing electronic clinical data, exploring composite measures of performance, and aggregating data across public and private health plans.
(Am J Manag Care. 2009;15(1):67-72)
Available administrative data for a typical health plan may provide insufficient information for benchmarking performance among individual physicians.
Measurement of physician quality performance is increasingly used by health plans as the basis for quality improvement, network design, and financial incentives.1 Still, efforts to measure physician performance face a number of challenges, in particular the need for sufficient sample size to support reliable measurement and the lack of consensus on methods for attributing patient measures to clinicians.2,3
Researchers have noted that measurement and comparison of physician quality can be hampered by sample size.4 A minimum threshold of 30 patients is a common guideline for supporting comparisons for an individual measure,5 and evidence suggests that at least 35 to 45 observations are needed to make valid comparisons.6,7 One challenge in obtaining sufficient sample size relates to the measure itself. Many quality measures describe a select group of patients and, by definition, will yield a small number of patients for any physician. Other measures apply to larger proportions of patients, but the ability to capture information on a physician’s entire panel of patients is limited (as when performance measurement relies on data from a single health plan).
A related issue in quality measurement is attribution. Which physicians should be responsible for a quality measure? Given the current focus on team-based chronic disease care and the reality that most patients receive care from multiple clinicians,8 some authors argue that the most appropriate level of accountability is not the individual physician but rather a formal or informal group of physicians.9 Healthcare organizations often attribute patient quality measures based on utilization or a specific set of services, despite the challenges in identifying which physician should be held responsible for the fulfillment (or lack of fulfillment) of a quality measure.
Efforts are needed to understand how these issues may affect the meaningfulness and soundness of physician profiling efforts. In this study, we used a data set that is typical of the information used by health plans to characterize physician performance. Using 27 well-accepted measures that can be obtained from administrative data, we evaluated (1) how many quality events were available per physician and (2) how different attribution rules affect the number of quality events.
We focused on 27 measures describing acute, chronic, and preventive care activities performed by primary care physicians (PCPs). Only measures that could be obtained through administrative claims data were included. eAppendix Table 1 (available at www.ajmc.com) lists all quality measures used in this study, as well as the period used to attribute patients and quality events to physicians.
We identified physicians by the unique identifiers used by each health plan. Primary care physicians, including family physicians, general internists, and general pediatricians, were identified based on their specialty designated in health plan credentialing records.
In selecting an attribution approach, we considered the interactions between clinicians and patients in the course of delivering care, the kinds of services involved, the evidence of a physician’s involvement in the patient’s care, and the data sources available. For this study, we applied a measure-specific attribution logic based on administrative data. Measures were attributed to PCPs based on the outpatient visits they provided to patients during a prescribed time frame specific to each measure. Visits were defined using Healthcare Effectiveness Data and Information Set codes for preventive and ambulatory health services.5 To test a less stringent approach to attribution, a patient measure was attributed to a physician if the patient had 1 or more visits during the prescribed time frame. In addition to this “1-visit” rule, 2 more stringent rules were assessed: a PCP was attributed responsibility for a patient’s measure (1) if the patient completed at least 30% of his or her ambulatory visits with that physician (30% rule) and (2) if the patient completed at least 50% of his or her ambulatory visits with that physician (50% rule).
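The three attribution rules described above can be illustrated with a minimal sketch. It is not the study's implementation: the input layout (one `(patient_id, physician_id)` pair per ambulatory visit within the measure's time frame), the function name `attribute_patients`, and the sample data are assumptions made for illustration, and the share is computed against all visits present in the input.

```python
from collections import Counter, defaultdict

def attribute_patients(visits, min_share=None):
    """Return {physician_id: set of attributed patient_ids}.

    visits: iterable of (patient_id, physician_id) pairs, one per ambulatory
            visit inside the measure's time frame (hypothetical layout).
    min_share: None for the 1-visit rule, or a fraction such as 0.30 or 0.50.
    """
    # Count each patient's visits per physician.
    visits_by_patient = defaultdict(Counter)
    for patient_id, physician_id in visits:
        visits_by_patient[patient_id][physician_id] += 1

    attributed = defaultdict(set)
    for patient_id, counts in visits_by_patient.items():
        total_visits = sum(counts.values())
        for physician_id, n_visits in counts.items():
            # 1-visit rule: any visit qualifies; otherwise require the share.
            if min_share is None or n_visits / total_visits >= min_share:
                attributed[physician_id].add(patient_id)
    return dict(attributed)

# Made-up visit records for three patients and three physicians.
visits = (
    [("p1", "A")] * 3 + [("p1", "B")]
    + [("p2", "A"), ("p2", "B")] + [("p2", "C")] * 2
    + [("p3", "A")] * 2 + [("p3", "B")] * 3
)
one_visit = attribute_patients(visits)
rule_30 = attribute_patients(visits, min_share=0.30)
rule_50 = attribute_patients(visits, min_share=0.50)
```

In this toy data, no physician's attributed panel grows as the rule tightens, and a patient whose visits are spread evenly across several physicians can end up attributed to only one of them, or to none, under the stricter rules.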
A quality event occurred each time a patient was eligible for a quality measure. Therefore, a single patient could contribute multiple quality events if he or she was eligible for multiple measures (eg, preventive screening and another measure).
Overall, 57% of 170,168 PCPs represented in the study claims data could be attributed responsibility for at least 1 quality event (ie, ≥1 of their patients was eligible for ≥1 of our selected quality measures). Table 1 summarizes findings based on the 1-visit rule and describes the percentage of PCPs with more than 30 quality events for a measure. Except for preventive measures, few PCPs had more than 30 observations for any given measure. However, these high-volume providers account for a larger share of quality events overall, particularly for preventive care measures. For example, only 17% of physicians had more than 30 quality events for colorectal cancer screening, but these physicians accounted for 78% of the quality events for this indicator. Only 1% of physicians had more than 30 quality events for annual glycosylated hemoglobin testing among patients with diabetes mellitus, but they accounted for 16% of the quality events for this measure.
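The two percentages reported for each measure (the share of physicians above the 30-event threshold and the share of all quality events those physicians account for) can be computed as in the sketch below. The function name, the input layout, and the choice of denominator (physicians with at least 1 quality event for the measure) are assumptions for illustration; the study may have used a different denominator.

```python
def high_volume_summary(events_per_physician, threshold=30):
    """Return (pct_physicians_above, pct_events_from_them) for one measure.

    events_per_physician: {physician_id: number of quality events} for a
    single measure (hypothetical layout). Physicians with zero events are
    ignored, which is an assumption about the denominator.
    """
    counts = [n for n in events_per_physician.values() if n > 0]
    if not counts:
        return 0.0, 0.0
    high = [n for n in counts if n > threshold]
    pct_physicians = 100 * len(high) / len(counts)
    pct_events = 100 * sum(high) / sum(counts)
    return pct_physicians, pct_events

# Made-up counts: one high-volume physician among four.
example_counts = {"A": 60, "B": 20, "C": 10, "D": 10}
pct_physicians, pct_events = high_volume_summary(example_counts)
```

With these made-up counts, one quarter of the physicians exceed the threshold yet contribute well over half of the quality events, the same pattern the text describes for colorectal cancer screening.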
Table 2 summarizes how moving from a less stringent rule to a more stringent rule for attribution affects the number of patients available for characterizing physician performance. For example, using the 1-visit rule for the measure assessing appropriate care for upper respiratory tract infections in children, physicians on average had 14 eligible patients for that measure in the measurement year. The mean number of quality events dropped to 11 when care was attributed using the 30% rule and to 9 when care was attributed using the 50% rule. Relative to the 1-visit rule, the 50% rule reduced by about half the number of quality events per physician for a measure. Adopting a more stringent rule for a measure also reduced the number of PCPs with at least 1 quality event for that measure (data not shown).
Limitations of this study included the number of measures studied, the reliance on administrative data only, and the lack of direct information about the physician’s relationship with the patient. These findings are based on only 27 quality measures from administrative data. However, all are well tested and nationally endorsed, and most are included in health plans’ and employers’ physician performance measurement programs for PCPs. Administrative data were used because most physician-level measurement efforts around the country rely on these data. However, administrative data limit the type of clinical actions that can be profiled.13 These limitations demonstrate the issues that health plans often face in developing meaningful provider profiles. Finally, the study describes findings for PCPs only. Within the context of this study, we found similar challenges in achieving sample sizes that provided more than 30 quality events for specialist physicians.
Implications
As our results demonstrate, several practical steps are needed to ensure that physician profiles based on administrative data have sufficient information for reliable estimates. First, pooling administrative data within communities across all health plans, government purchasers, and other entities is critical to construct a more complete database representing most or all care rendered by that community’s physicians. Regional and national quality initiatives promoted by the Centers for Medicare & Medicaid Services14 and by the Robert Wood Johnson Foundation15 are examples of such data pooling.
Second, composite measures should be considered. Care should be taken in selecting and weighting the individual measures for inclusion in a composite. Furthermore, the use of composites creates additional challenges in interpreting quality results and specific actions for improving care. However, composites constructed around a particular condition or patient care activity may provide insights into quality performance and increase the number of quality events available for comparing providers.
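To illustrate how a composite can increase the number of quality events available per physician, the sketch below pools events across the measures grouped into one composite. It shows only the counting step under simple assumptions, not a recommended weighting scheme: the function name, the input layout, and the measure labels are hypothetical, and every event is counted equally.

```python
from collections import defaultdict

def composite_event_counts(events, composite_measures):
    """Return {physician_id: pooled number of quality events}.

    events: iterable of (physician_id, measure_id) pairs, one per quality
            event (hypothetical layout).
    composite_measures: set of measure_ids grouped into the composite.
    """
    totals = defaultdict(int)
    for physician_id, measure_id in events:
        if measure_id in composite_measures:
            totals[physician_id] += 1
    return dict(totals)

# Hypothetical labels for measures grouped around diabetes care.
diabetes_composite = {"hba1c_test", "ldl_test"}
events = (
    [("A", "hba1c_test")] * 20
    + [("A", "ldl_test")] * 15
    + [("A", "colorectal_screening")] * 5
)
pooled = composite_event_counts(events, diabetes_composite)
```

Here physician A falls below 30 events on each diabetes measure taken alone but exceeds 30 once the two are pooled. Any real composite would also need the selection and weighting decisions discussed above.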
Third, efforts are needed to encourage physician practices, health plans, and other entities to make readily available more clinically detailed data for quality measurement construction. Initially, the more widespread availability of electronic data that is already present in some settings (eg, laboratory results) will allow for a larger number of potentially more meaningful quality measures to be constructed without the expense of medical record review, and efforts are critically needed to improve the capabilities of electronic medical records to report quality measures.16 Efforts to augment routinely available administrative data by including additional codes (eg, Current Procedural Terminology II) that capture critical information about outcomes or results of patient care may be promising but have not yet been tested for accuracy or reliability in widespread applications, to our knowledge.
2. Landon BE, Normand SL, Blumenthal D, Daley J. Physician clinical performance assessment: prospects and barriers. JAMA. 2003;290(9):1183-1189.
4. Hofer TP, Hayward RA, Greenfield S, Wagner EH, Kaplan SH, Manning WG. The unreliability of individual physician “report cards” for assessing the costs and quality of care of a chronic disease. JAMA. 1999;281(22):2098-2105.
6. Safran DG, Karp M, Coltin K, et al. Measuring patients’ experiences with individual primary care physicians: results of a statewide demonstration project. J Gen Intern Med. 2007;21(1):13-21.
8. Pham HH, Schrag D, O’Malley AS, Wu B, Bach PB. Care patterns in Medicare and their implications for pay for performance. N Engl J Med. 2007;356(11):1130-1139.
10. Ingenix.com Web site. Impact Pro. 2008. http://www.ingenix.com/Products/Employers/HealthandProductivity/EvidenceBased
11. Scholle SH, Roski J, Adams J, et al. Reliability of individual and composite measures for profiling physician performance. Am J Manag Care. 2008;14(12):829-838.
13. Paulson LG, Scholle SH, Powers A. A comparison of administrative-only versus administrative plus chart review data. Am J Manag Care. 2007;13(10):553-558.
15. RWJF Web site. The Robert Wood Johnson Foundation: health and health care improvement. http://www.rwjf.org. Accessed October 29, 2008.
17. Casalino LP, Elster A, Eisenberg A, Lewis E, Montgomery J, Ramos D. Will pay-for-performance and quality reporting affect health care disparities? Health Aff (Millwood). 2007;26(3):w405-w414.
19. Epstein AM. Pay for performance at the tipping point. N Engl J Med. 2007;356(5):515-517.