Physicians recognized as high quality by Bridges to Excellence performed better than their peers on claims-based quality measures and, in some cases, on resource use measures.
Objective: To examine whether physicians who sought and received Bridges to Excellence (BTE) recognition performed better than similar physicians on a standardized set of population-based performance measures.
Study Design: Cross-sectional comparison of performance data.
Methods: Using a claims dataset of all commercially insured members from 6 health plans in Massachusetts, we examined population-based measures of quality and resource use for physicians recognized by the BTE programs Physician Office Link and Diabetes Care Link, compared with nonrecognized physicians in the same specialties. Differences in performance were tested using generalized linear models.
Results: Physician Office Link–recognized physicians performed significantly better than their nonrecognized peers on measures of cervical cancer screening, mammography, and glycosylated hemoglobin testing. Diabetes Care Link–recognized physicians performed significantly better on all 4 diabetes process measures of quality, with the largest differences observed in microalbumin screening (17.7%). Patients of Physician Office Link–recognized physicians had a significantly greater percentage of their resource use accounted for by evaluation and management services (3.4%), and a smaller percentage accounted for by facility (-1.6%), inpatient ancillary (-0.1%), and nonmanagement outpatient services (-1.0%). After adjustment for patient age and sex, and case mix, Physician Office Link–recognized physicians had significantly fewer episodes per patient (0.13) and lower resource use per episode ($130), but findings were mixed for Diabetes Care Link–recognized physicians.
Conclusions: Our findings suggest that the BTE approach to ascertaining physician quality identifies physicians who perform better on claims-based quality measures and primary care physicians who use a less resource-intensive practice style.
(Am J Manag Care. 2008;14(10):670-677)
Bridges to Excellence (BTE) rewards physicians based on the results of site surveys and retrospective review of clinical data. Physicians recognized as high quality performed better than their peers on claims-based measures of quality and, in some cases, resource use measures.
During the past several years, both public and private payers have adopted pay for performance in a wide variety of contexts with the hope of prompting more evidence-based and higher value patterns of care.1-3 When it was launched in 2002 by a group of large employers collaborating with several health plans and provider organizations, Bridges to Excellence (BTE) was one of the first multistakeholder pay-for-performance programs.4,5 Its programs have relied on certification of performance based on self-reported (subject to audit) medical record and practice systems data. Physicians are invited by BTE to seek certification and begin receiving payments for all patients covered by the program’s sponsors. Incentives for physicians who meet or exceed the performance criteria were initially determined from actuarial analyses that estimated savings to employers from better quality of care6 and are paid by participating employers and health plans directly to physicians, according to a specified bonus schedule that can vary between implementation regions.
Now operated through an independent not-for-profit entity with a board composed of employer, health plan, and physician representatives, BTE’s model is in operation nationwide through licenses to health plans, employer coalitions, and state agencies.7 Bridges to Excellence was first implemented in Massachusetts in 2003 with 2 major physician reward components: the Physician Office Link (POL) and the Diabetes Care Link (DCL). Attainment of the quality standards set by BTE is associated with both financial rewards and public recognition.
In 2003, approximately 20 physicians in Massachusetts were recognized by BTE under what was then its only program: DCL. By the end of 2006 there were more than 1000 BTE-recognized physicians in Massachusetts across all BTE programs and total payments to these physicians reached $2.4 million. (In 2004 the Cardiac Care Link was introduced. Because that program is newer, we do not examine it in our analysis.) Nationally, 8500 physicians have been recognized and a total of $10 million paid out. Although the BTE model continues to expand, comparative studies of BTE-recognized physicians and their practice patterns are lacking. In this article, we examine the quality, care delivery patterns, and resource use of BTE-recognized physicians compared with nonrecognized physicians in Massachusetts.
On quality measures alone, this comparison is of interest because of BTE’s unusual reliance on voluntary certification through a site survey and chart review that yields data regarding quality measures similar to those used in the Healthcare Effectiveness Data and Information Set (HEDIS). For the POL in particular, which is intended to reward overall population health management, substantial weight is placed on the structural measures of quality derived from the site survey. Thus, it is not clear that physicians with BTE certification would necessarily get better-than-average scores on the quality measures alone. Moreover, there is no comparative aspect to BTE recognition—the standards are set nationally and fixed (ie, they do not reflect relative rankings within a market), so it is unknown whether the physicians seeking certification are those whose performance is above average or simply those with larger numbers of eligible patients (and thus with more to gain). Finally, because BTE certifies physicians for 3 years based on a single retrospective review of 25 sampled charts for the quality measures, we also considered it important to ask whether a contemporaneous comparison between recognized and nonrecognized physicians would show better performance for BTE physicians. Even if recognized physicians performed better than average with the selected patients, these differences might not have persisted.
We examined population-based measures of the quality of care and resource use for Massachusetts physicians participating in BTE’s first 2 rewards programs, POL and DCL. Using an administrative dataset comprised of claims from all commercially insured members of 6 Massachusetts plans, we compared the performance of recognized physicians with the performance of nonrecognized physicians.
Bridges to Excellence Program Design
The POL program focuses on promoting the office practice’s use of systems to enhance the quality of patient care. Today, this assessment is commonly being referred to as a core characteristic of a Medical Home.8 Practices that can demonstrate specific processes and systems of care can earn up to $50 for each patient covered by a participating employer. (The rewards were designed to last 3 years, the period of time during which the recognition is valid.) To obtain the rewards available through the POL program, eligible physicians must pass the National Committee for Quality Assurance (NCQA) Physician Practice Connections assessment program or meet a comparable standard set by the Massachusetts Quality Improvement Organization: MassPRO. The standards for the POL, which are described in Table 1,9 include maintenance of patient registries for the purpose of identifying and following up with at-risk patients, provision of educational resources to patients, and use of electronic systems to maintain patient records, provide decision support, enter orders for prescriptions and lab tests, and provide patient reminders.
The DCL program rewards physicians with $80 per year for each diabetic patient covered by a participating employer. To obtain the rewards, eligible physicians must demonstrate that they provide high levels of diabetes care by passing NCQA’s Diabetes Physician Recognition Program, which was developed in collaboration with the American Diabetes Association.10 To qualify for the 3-year recognition, physicians submit medical record data on glycosylated hemoglobin (A1C), blood pressure, and lipid testing, as well as data on eye, foot, and nephropathy exams, for a random sample of 25 diabetes patients. (The 25 charts are selected retrospectively for patients who have been in the practice’s care for at least 1 year. A date is selected at random from a set window of time, and the next 25 charts for patients with a diagnosis for diabetes are pulled for analysis. Both the sampling method and the actual charts are reviewed and subject to NCQA audit.) The NCQA assesses the data submitted by the physician and attributes points based on the level of disease management demonstrated by the physician. All information submitted for POL and DCL certification is subject to audit.
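The retrospective chart selection described above can be illustrated with a minimal sketch. It is not NCQA's actual procedure: the record layout (`patient_id`, `visit_date`, `has_diabetes`, `first_seen`), the 365-day reading of "at least 1 year," and the choice to walk forward in visit-date order from the random start date are all assumptions made for illustration.

```python
import random
from datetime import date, timedelta

def sample_charts(visits, window_start, window_end, n_charts=25, seed=None):
    """Pick a random start date in the window, then take the next n_charts
    distinct diabetic patients who have been in the practice's care for at
    least 1 year (assumed here to mean 365 days before the visit).

    `visits` is a list of dicts with hypothetical keys: patient_id,
    visit_date, has_diabetes, first_seen.
    """
    rng = random.Random(seed)
    span_days = (window_end - window_start).days
    start = window_start + timedelta(days=rng.randint(0, span_days))

    selected, seen = [], set()
    for visit in sorted(visits, key=lambda v: v["visit_date"]):
        if visit["visit_date"] < start:
            continue  # before the randomly chosen start date
        if not visit["has_diabetes"]:
            continue  # only patients with a diabetes diagnosis
        if (visit["visit_date"] - visit["first_seen"]).days < 365:
            continue  # not yet in the practice's care for 1 year
        if visit["patient_id"] in seen:
            continue  # one chart per patient
        seen.add(visit["patient_id"])
        selected.append(visit["patient_id"])
        if len(selected) == n_charts:
            break
    return start, selected
```

A practice with fewer than 25 qualifying patients after the start date would return a shorter list under this sketch; how the actual program handles that case is not stated in the text.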
We obtained data from BTE on the identities of physicians recognized for the POL and DCL programs between 2003 and 2006. Administrative data, including claims and enrollment files, were obtained from the Massachusetts Group Insurance Commission (GIC), which has created a database that includes all claims from all commercially insured members (not just those for GIC members) from the 6 health plans that serve the GIC (Harvard Pilgrim Healthcare, Tufts Health Plan, Unicare, Fallon, Neighborhood Health Plan, and Health New England). The GIC’s data cover approximately 50% of privately insured residents and more than 7000 physicians in Massachusetts including all but 2 BTE-recognized physicians. Physician-level measures of quality and resource use per episode were computed by Mercer Human Resources Consulting using profiling software licensed from Resolution Health, Inc, and Ingenix, respectively.
Quality and Cost Measures
To ascertain differences between recognized and nonrecognized physicians, we examined claims-based quality measures and average resource use per standardized episode for several categories of services and the total. (Episodes were created using Symmetry’s Episode Treatment Groupers, a unit of Ingenix. These episodes are defined using proprietary algorithms that identify in claims data clinical events that trigger, break, and end an episode. Similarly, algorithms based on clinical logic are used to group all claims related to that episode. For a more complete description, see reference 11.) For the POL cohort, we examined a set of measures typically used for examining primary care quality, including cervical cancer screening, mammography, A1C testing, cholesterol screening for individuals with coronary heart disease, and cholesterol screening for individuals with hypertension. For the DCL cohort, we examined 4 widely used claims-based measures of diabetes care quality. These were A1C testing, cholesterol testing, microalbumin testing, and diabetic retinal exams. Measure specifications were adapted by Resolution Health, Inc, from HEDIS and other national evidence-based guidelines to conform to the provider profiling context. The most recent 18 months of outpatient claims and pharmacy data were used for evaluation and calculation of each quality score (eg, A1C testing rate).
Eligible patients were attributed to multiple physicians based on the following steps. In the first step of the analysis, a member’s claims were evaluated to determine which physicians had the greatest number of claims. In the second step, the rules (eg, diabetic patients should have A1C testing at least annually) that are applicable to an individual member are then assigned to all the physicians identified in step 1 who could be responsible for giving that type of care based on their specialty (eg, internal medicine or endocrinology could be responsible for diabetes care, but dermatology could not). The quality score for a physician was then calculated by determining what percentage of the patients for whom the physician was deemed “responsible” through this process met the criteria for the rule (eg, annual A1C testing).
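The attribution and scoring steps above can be sketched as follows. This is a simplified illustration, not the Resolution Health software: the `top_n` cutoff for "greatest number of claims," the specialty table, and the input layout are assumptions introduced here.

```python
from collections import Counter

# Hypothetical eligibility table: which specialties can be held responsible
# for a given rule. The article's example is diabetes care.
ELIGIBLE_SPECIALTIES = {
    "a1c_testing": {"internal medicine", "family practice", "endocrinology"},
}

def attribute_physicians(claims, top_n=2):
    """Step 1: return the physicians with the most claims for one member.

    `claims` is a list of physician IDs, one entry per claim. `top_n` is an
    assumed cutoff; the article does not say how many physicians are kept.
    """
    counts = Counter(claims)
    return [physician for physician, _ in counts.most_common(top_n)]

def quality_scores(members, specialties, rule, top_n=2):
    """Step 2 and scoring: assign the rule to specialty-eligible physicians,
    then compute the percentage of each physician's attributed patients
    who met the rule.

    `members` maps member ID -> {"claims": [...], "met_rule": bool}.
    `specialties` maps physician ID -> specialty name.
    """
    eligible = ELIGIBLE_SPECIALTIES[rule]
    met, total = Counter(), Counter()
    for record in members.values():
        for physician in attribute_physicians(record["claims"], top_n):
            if specialties.get(physician) in eligible:
                total[physician] += 1
                met[physician] += int(record["met_rule"])
    return {p: 100.0 * met[p] / total[p] for p in total}

members = {
    "m1": {"claims": ["dr_a", "dr_a", "dr_derm"], "met_rule": True},
    "m2": {"claims": ["dr_a", "dr_b", "dr_b"], "met_rule": False},
    "m3": {"claims": ["dr_b"], "met_rule": True},
}
specialties = {
    "dr_a": "internal medicine",
    "dr_b": "endocrinology",
    "dr_derm": "dermatology",
}
scores = quality_scores(members, specialties, "a1c_testing")
```

In this toy example the dermatologist receives no score even though the member had a claim with that physician, because dermatology is not in the assumed eligibility set for the rule.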
The second set of measures we examined includes standardized measures of spending that we refer to as resource use (we used Medicare rates rather than the actual transaction prices, which are unique to each payer-provider pair). We decomposed resource use in 2 ways. First, we use the Episode Treatment Grouper software from Ingenix to tally resource use in 6 categories: evaluation and management, surgery (claims submitted by physicians for surgical interventions), facility (room and board only), inpatient ancillary (inpatient claims for services other than room and board), outpatient (claims for services provided in the outpatient setting other than evaluation and management), and prescription drugs.11 Eligible patients were attributed to physicians who were responsible for the majority of the clinician fees in the episode. We compared average percentages of total patient resource use in each category for recognized and nonrecognized physicians. Then, we examined the number of episodes per patient and the total resource use per episode.
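As a rough illustration of the episode attribution and category decomposition just described, the sketch below assigns each episode to the physician with more than half of its clinician fees and then computes each physician's percentage of resource use by category. Reading "majority" as "more than half," and the episode record layout, are assumptions; the Episode Treatment Grouper logic itself is proprietary and is not reproduced.

```python
from collections import defaultdict

CATEGORIES = (
    "evaluation and management", "surgery", "facility",
    "inpatient ancillary", "outpatient", "prescription drugs",
)

def responsible_physician(clinician_fees):
    """Return the physician with more than half of the episode's clinician
    fees, or None if no physician holds a majority."""
    total = sum(clinician_fees.values())
    if total <= 0:
        return None
    physician, fees = max(clinician_fees.items(), key=lambda item: item[1])
    return physician if fees > total / 2 else None

def category_shares(episodes):
    """Return physician -> {category: percent of total resource use}.

    Each episode is a dict with hypothetical keys "clinician_fees"
    (physician -> fees) and "resource_use" (category -> standardized amount).
    """
    totals = defaultdict(lambda: defaultdict(float))
    for episode in episodes:
        physician = responsible_physician(episode["clinician_fees"])
        if physician is None:
            continue  # episode cannot be attributed
        for category, amount in episode["resource_use"].items():
            totals[physician][category] += amount

    shares = {}
    for physician, by_category in totals.items():
        grand_total = sum(by_category.values())
        if grand_total > 0:
            shares[physician] = {
                c: 100.0 * by_category.get(c, 0.0) / grand_total
                for c in CATEGORIES
            }
    return shares

episodes = [
    {"clinician_fees": {"dr_a": 80.0, "dr_b": 20.0},
     "resource_use": {"evaluation and management": 100.0, "facility": 300.0}},
    {"clinician_fees": {"dr_a": 50.0, "dr_b": 50.0},
     "resource_use": {"surgery": 500.0}},
]
shares = category_shares(episodes)
```

The second toy episode is split evenly between two physicians, so under the "more than half" reading it is attributed to neither.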
We compared average performance for physicians recognized by the 2 BTE programs with the performance of similar physicians who were not recognized. All physicians who were eligible for BTE rewards in Massachusetts were included in the comparison groups. The initial list was refined by excluding physicians who did not have at least 200 episodes and did not have enough data to receive a quality score. Additionally, comparison groups were selected from the same specialties as those of the recognized physicians: internal medicine and family practice for the POL and those 2 specialties plus endocrinology for the DCL. For the POL analysis, the comparison group was further restricted by excluding physicians who were outliers in their resource use (ie, whose average resource use per episode of care was more than 6 standard deviations above the mean resource use for all POL physicians). Analyses were conducted using the physician as the unit of analysis. The statistical significance of differences in average performance for each measure of resource use and quality was determined by using generalized linear models in which the explanatory variables were specialty, mean patient age, the percentage of patients who were female, and a case mix indicator, which is a measure of the resource intensiveness of the Episode Treatment Groupers of a physician’s entire patient population compared with Episode Treatment Groupers for other physicians in the same specialty. We report predicted values from these regressions for both sets of physicians and used a bootstrapping (ie, resampling) method12 to compute 95% confidence intervals (CIs) for the differences.
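Two steps of this comparison lend themselves to a short sketch: the 6-standard-deviation outlier exclusion and a resampling-based confidence interval. The code below is a deliberate simplification, not the authors' method. It bootstraps an unadjusted difference in means with a percentile interval, whereas the article bootstraps differences in predicted values from generalized linear models adjusted for specialty, age, sex, and case mix. Function names and the example values are hypothetical.

```python
import random
import statistics

def exclude_high_outliers(values, reference, n_sd=6.0):
    """Drop values more than n_sd standard deviations above the mean of
    the reference group (for example, resource use per episode)."""
    threshold = statistics.mean(reference) + n_sd * statistics.stdev(reference)
    return [v for v in values if v <= threshold]

def bootstrap_diff_ci(recognized, nonrecognized, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(recognized) - mean(nonrecognized).

    Physicians are resampled with replacement within each group. This is an
    unadjusted comparison, unlike the regression-adjusted one in the article.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        r = rng.choices(recognized, k=len(recognized))
        n = rng.choices(nonrecognized, k=len(nonrecognized))
        diffs.append(statistics.mean(r) - statistics.mean(n))
    diffs.sort()
    lower_idx = max(0, round((alpha / 2) * n_boot))
    upper_idx = min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)
    return diffs[lower_idx], diffs[upper_idx]

# Hypothetical physician-level quality scores (percent of patients screened).
recognized = [80.0, 82.0, 85.0, 88.0, 90.0, 91.0]
nonrecognized = [70.0, 72.0, 75.0, 78.0, 79.0, 81.0]
lower, upper = bootstrap_diff_ci(recognized, nonrecognized)
```

An interval that excludes zero would correspond to a statistically significant difference at the 5% level under this simplified approach.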
We identified 405 physicians who were recognized by BTE’s POL program between 2003 and 2006 and 3916 nonrecognized physicians in the GIC’s data (Table 2). Physicians recognized through the POL were more likely to be specialists in internal medicine than nonrecognized physicians and were less likely to be family practitioners. Patient sex was similar between POL-recognized and nonrecognized physicians, but patients were younger and case mix was substantially higher for nonrecognized physicians. Between 2003 and 2006, BTE’s DCL program recognized 91 Massachusetts physicians; GIC’s data showed 1204 nonrecognized physicians treating diabetic patients during the same time period. Physicians recognized through the DCL were much more likely to be specialists in endocrinology than nonrecognized physicians and much less likely to be family practitioners. Patient age and sex were similar between DCL-recognized and nonrecognized physicians, but case mix was substantially lower for nonrecognized physicians.
Comparison of Quality Measures
On all process measures of quality, recognized physicians performed better than nonrecognized physicians (Table 3). Differences were statistically significant for cervical cancer screening, mammography, and A1C testing for the comparisons of POL-recognized physicians with nonrecognized physicians. The DCL-recognized physicians performed statistically significantly better on all 4 diabetes process measures of quality, with the largest differences observed in microalbumin screening (17.7%; 95% CI = 14.0%, 21.4%). This second finding seems to affirm that the NCQA’s Diabetes Physician Recognition Program concurs with claims-based measures of high-quality care for patients with diabetes.
Comparison of Practice Patterns and Resource Use Measures
Differences in practice patterns between physicians recognized by the POL program and the comparison group shared a common pattern but had mixed degrees of statistical significance (Table 4). In particular, patients of POL-recognized physicians had a significantly greater percentage of their resource use accounted for by evaluation and management services (3.4%; 95% CI = 2.4%, 4.4%) and a smaller percentage accounted for by facility (−1.6%; 95% CI = −3.4%, −0.02%), inpatient ancillary services (−0.1%; 95% CI = −0.1%, −0.03%), and nonmanagement outpatient services (−1.0%; 95% CI = −1.7%, −0.3%). Patients of POL-recognized physicians also had a lower percentage of their total resource use accounted for by prescription drugs (−1.6%; 95% CI = −2.8%, −0.3%). The distribution of expenditures for DCL-recognized physicians compared with nonrecognized physicians treating diabetic patients was more similar, with smaller but significant differences only in the share of standardized resource use accounted for by management (2.0%; 95% CI = 0.5%, 3.6%) and surgery (−0.3%; 95% CI = −0.5%, −0.1%).
After adjustment for specialty, patient age and sex, and case mix, POL-recognized physicians had significantly fewer episodes per patient (0.13; 95% CI = 0.13, 0.15) and lower resource use per episode ($130; 95% CI = $119, $140) (Table 5). For diabetes care, however, DCL-recognized physicians had more episodes per patient (0.17; 95% CI = 0.14, 0.20 for primary care physicians; and 0.08; 95% CI = 0.03, 0.13 for endocrinologists). Resource use per episode was lower for DCL-recognized primary care physicians ($26; 95% CI = $5, $47) and higher, but not significantly so, for DCL-recognized endocrinologists ($137; 95% CI = −$45, $320).
Bridges to Excellence is a national pay-for-performance program that relies on chart review and structural survey data to assess the quality of care in physician practices.5,13 Although analyses conducted during the design phase of BTE suggested that recognized physicians would not only deliver higher quality care but also would do so at lower total cost, ours is the first formal analysis comparing the practice patterns of recognized physicians with those of their peers using a standardized methodology that relied on a large claims database. Our findings suggest that recognition by BTE is associated with systematically better performance on claims-based measures of quality. Moreover, recognized physicians appear to use a different style of practice than their peers, emphasizing patient management over ancillary services and procedures. Patterns of relative resource use for recognized versus nonrecognized physicians were more varied. Possibly because of better management, primary care physicians recognized by the POL had nearly 20% lower resource use per episode with approximately 5% fewer episodes per patient. The DCL-recognized primary care physicians had slightly lower resource use per episode for their patients, but DCL-recognized endocrinologists did not. Both primary care physicians and endocrinologists recognized by the DCL, however, had more episodes per patient than nonrecognized physicians in their respective specialties.
Bridges to Excellence offers a potential complement to claims-data profiling of physicians for identifying and rewarding high-quality care. The advantages of this measurement approach derive primarily from its ability to capture clinical information and the associated greater credibility with physicians. Our analysis inadvertently highlighted another advantage compared with claims-based profiling methods as well. Even though the GIC’s data cover 6 major payers, for some quality measures a segment of physicians could not be profiled because of an insufficient number of patients in the denominator (even using a relatively low cut-off of 10 patients per measure). By design, however, the BTE chart review approach can be applied to virtually all practices because it considers the universe of a practice’s patients. That said, making “all-payer” data truly include all payers (including Medicare and Medicaid) and standardizing physician identifiers to maximize linkages among payers would reduce this advantage and dramatically increase the usefulness of claims data for profiling physicians.
The primary limitations of the BTE model for profiling physicians are twofold. First, because assessments are voluntary, BTE may only get the attention of already high-performing physicians and do little to encourage improvement among the least prepared. Second, the collection and auditing of both the clinical and site survey data are expensive, particularly for small practices, relative to analysis of existing administrative data.
The association between POL recognition and lower resource use is consistent with findings from the actuarial model underpinning BTE. Each BTE program had been originally subjected to an actuarial analysis to determine ex ante whether cost savings would be realized by recognized physicians, and it is from these analyses that financial reward targets also were established (eg, $80 yearly bonus per diabetic patient). Although we are unable to say at this point whether the relationship between recognition status and lower resource use is a causal one (such that more recognition will increase cost-efficiency), the knowledge of this association is important for payers and consumers who are seeking ways to increase the value obtained in the healthcare system. In particular, moving patients to these physicians might increase value, although this result may be limited to the extent that unmeasured heterogeneity of patients treated by recognized versus nonrecognized physicians is affecting the results. If nonrecognized physicians responded to such a loss of market share by adopting similar practice patterns and improving the quality and efficiency of their practice, the value of care delivered throughout the healthcare system would improve.
Our findings should be viewed in light of several limitations. First, our analysis is cross-sectional in nature, because time-series data were not available. Thus, we cannot make any claims about causal relationships between the BTE recognition program and quality or between quality and resource use. Second, because chart-review data were available only for recognized physicians, we did not have access to clinical data in our analysis for risk adjustment. Although our comparisons were adjusted for patient age and sex, and an indicator of case mix, residual patient heterogeneity correlated with recognition status may be confounding our findings. This is a particular issue for the resource use comparisons. Finally, we chose to examine Massachusetts physicians because of the availability of a large multipayer database, but our results may not generalize to markets where BTE might be launched later. Factors such as overall performance on quality measures, the organization of practice, and the intensity of use, all of which vary regionally, may be important mediators of the relationships we describe.
Although the evidence of impact is mixed,14-17 pay-for-performance programs like BTE have engaged payers and providers in an important dialogue about how to improve the quality and cost-efficiency of care. Our analysis demonstrated a positive relationship between BTE recognition and population-based measures of quality. It remains unknown, however, whether the program motivated physicians to reorganize their practices or adopt new treatment patterns. Although providing a means for high-performing physicians to identify themselves to payers and patients may be a worthwhile goal in itself, it also is critical to understand whether BTE and similar programs create a business case for poor performers to change. Diffusion of BTE’s program in markets around the country should create opportunities to address this question.
Author Affiliations: From the Department of Health Policy and Management, Harvard School of Public Health (MBR, ADS), Boston, MA; Bridges to Excellence (FSD), Newtown, CT; Mercer Health Benefits (MF), Cleveland, OH; Mercer Health Benefits (RDR), Norwalk, CT; and Health Data Management Solutions (SY), Shaker Heights, OH.
Funding Source: This work was supported by general institutional funds.
Author Disclosure: The authors (MBR, ADS, MF, RDR, SY) report no relationship or financial interest with any entity that would pose a conflict of interest with the subject matter of this article. Mr de Brantes is the CEO of Bridges to Excellence, the organization whose pay-for-performance programs are discussed in this study.
Authorship Information: Concept and design (MBR, FSD, MF); acquisition of data (FSD, MF); analysis and interpretation of data (MBR, ADS, MF, RDR, SY); drafting of the manuscript (MBR, FSD, ADS, RDR); critical revision of the manuscript for important intellectual content (MBR, FSD, RDR); statistical analysis (MBR, ADS); administrative, technical, or logistic support (MBR); and supervision (MBR).
Address correspondence to: Meredith B. Rosenthal, PhD, Department of Health Policy and Management, Harvard School of Public Health, 677 Huntington Ave, Boston, MA 02115. E-mail: email@example.com.
1. Kuhmerker K, Hartman T. Pay-for-Performance in State Medicaid Programs: A Survey of State Medicaid Directors and Programs. New York: The Commonwealth Fund; 2007.
2. Trude S, Au M, Christianson JB. Health plan pay-for-performance strategies. Am J Manag Care. 2006;12(9):537-542.
3. Rosenthal MB, Landon BE, Normand SL, Frank RG, Epstein AM. Paying for pay-for-performance in commercial HMOs. N Engl J Med. 2006;355(18):1895-1902.
4. Epstein AM, Lee TH, Hamel MB. Paying physicians for high-quality care. N Engl J Med. 2004;350(4):406-410.
5. Rosenthal MB, Fernandopulle R, Song HR, Landon B. Paying for quality: providers’ incentives for quality improvement. Health Aff. 2004;23(2):127-141.
6. Bridges to Excellence. BTE research and analysis. http://www.bridgestoexcellence.org/Content/ContentDisplay.aspx?ContentID=81. Accessed March 31, 2008.
7. Bridges to Excellence Web site. http://www.bridgestoexcellence.org. Accessed November 30, 2007.
8. Robert Graham Center. The Patient Centered Medical Home. History, Seven Core Features, Evidence and Transformational Change. November 2007. http://www.adfammed.org/documents/grahmcentermedicalhome.pdf. Accessed March 31, 2008.
9. National Committee for Quality Assurance. Physician Practice Connections. http://www.ncqa.org/tabid/141/Default.aspx. Accessed May 8, 2008.
10. National Committee for Quality Assurance. Diabetes Physician Recognition Program. http://www.ncqa.org/tabid/139/Default.aspx. Accessed March 31, 2008.
11. Ingenix Inc. Symmetry Episode Treatment Groups. A Condition Classification and Episode Building System. White Paper. 2006. http://www.ingenix.com/content/attachments/ETG%206.0%20White%20Paper_01-17-07.pdf. Accessed May 7, 2008.
12. Hall P, Horowitz JL. Bootstrap critical values for tests based on generalized-method-of-moments estimators. Econometrica. 1996;64(4):891-916.
13. The Leapfrog Group Web site. http://www.leapfroggroup.org/home. Accessed November 30, 2007.
14. Petersen LA, Woodard LD, Urech T, Daw C, Sookanan S. Does pay-for-performance improve the quality of health care? Ann Intern Med. 2006;145(4):265-272.
15. Lindenauer PK, Remus D, Roman S, et al. Public reporting and pay for performance in hospital quality improvement. N Engl J Med. 2007;356(5):488-496.
16. Epstein AM. Paying for performance at the tipping point. N Engl J Med. 2007;356(5):515-517.
17. Young GJ, Meterko M, Beckman H, et al. Effects of paying physicians based on their relative performance for quality. J Gen Intern Med. 2007;22(6):872-876.