Some hospitals were able to outperform others in a commercial insurer episode-based incentive program, but there was little evidence of global reductions in episode spending.
Objectives: To evaluate hospital performance and behaviors in the first 2 years of a statewide commercial insurance episode-based incentive pay-for-performance (P4P) program.
Study Design: Retrospective cohort study of price- and risk-standardized episode-of-care spending from the Michigan Value Collaborative claims data registry.
Methods: Changes in hospital-level episode spending between baseline and performance years were estimated for program years (PYs) 2018 and 2019. The distribution of P4P points earned, and the hospital characteristics associated with those points, were described for both PYs. A difference-in-differences (DID) analysis estimated changes in patient-level episode spending associated with program implementation.
Results: Hospital-level episode spending for all conditions declined significantly from the baseline year to the performance year in PY 2018 (–$671; 95% CI, –$1113 to –$230) but was not significantly different for PY 2019 ($177; 95% CI, –$412 to $767). Hospitals earned a mean (SD) total of 6.3 (3.1) of 10 points in PY 2018 and 4.5 (2.9) of 10 points in PY 2019, with few significant differences in P4P points across hospital characteristics. The highest-scoring hospitals were more likely to have changes in case mix index and decreases in spending across the entire episode of care compared with the lowest-scoring hospitals. DID analysis revealed no significant changes in patient-level episode spending associated with program implementation.
Conclusions: There was little evidence for overall reductions in spending associated with the program, but the performance of the hospitals that achieved greatest savings and incentives provides insights into the ongoing design of hospital P4P metrics.
Am J Manag Care. 2023;29(8):e250-e256. https://doi.org/10.37765/ajmc.2023.89412
How do hospitals perform in a statewide commercial payer episode-based incentive program?
Limited incentives for hospitals to manage care after discharge have led to large, unwarranted variations in spending and quality around hospitalization.1,2 Medicare has sought to contain excess spending through bundled payment programs, making hospitals financially accountable for complete episodes of care, including index hospitalization and downstream health care utilization. Early evaluations of these programs suggest that hospitals can achieve savings, primarily in postacute care reductions.3-6 There is considerable concern, however, that hospitals will achieve savings through gaming, such as avoiding high-cost admissions, modifying payer mix, or upcoding comorbidities.
Following this lead, commercial insurers are increasingly introducing bundled payments, but without a similar degree of independent evaluation.7,8 Despite marked differences in spending around hospitalization between Medicare and commercial insurance, little is known about hospitals’ approaches to episode-based reimbursement in commercial markets.9-11 As a component of its pay-for-performance (P4P) programs in hospital contracts, Blue Cross Blue Shield of Michigan (BCBSM) introduced a metric focused on episode-of-care spending for select surgical and medical admissions. Similar to Medicare’s Hospital Value-Based Purchasing Program, hospitals earned points by either (1) reducing hospital mean episode spending in the performance year compared with a baseline year (ie, improvement) or (2) having lower episode spending compared with peer hospitals in the performance year (ie, achievement).12 Hospital peer groups were created to facilitate peer group comparisons and were determined according to bed size, case mix index, and teaching status.
The objective of this study was to evaluate hospital performance and behaviors in the first 2 years of this episode-spending incentive program, as well as the effectiveness of the program at reducing health care utilization overall. Hospital performance in the program was evaluated through changes in hospital-level 30-day episode spending, program points earned, and potential sources of savings for high-performing hospitals compared with low-performing hospitals. The overall effectiveness of participating in the program on patient-level 30-day episode spending was evaluated using a patient-level difference-in-differences (DID) approach.
The BCBSM P4P Program
The BCBSM Hospital P4P program provides short-term acute care hospitals with financial incentives totaling $200 million per year based on weighted performance in 4 domains: collaborative quality initiative participation and performance (40% of the program), all-cause readmissions (30%), participation in a health information exchange (20%), and performance in episode-of-care spending measures (10%). The episode-of-care spending component of the P4P program is run by the Michigan Value Collaborative (MVC), one of the BCBSM-funded collaborative quality initiatives.13 The MVC is a statewide collaborative quality initiative that includes 100 acute care hospitals and 40 physician organizations in Michigan.
Participation in the episode spending component required hospitals to choose 2 condition-based episode spending measures from a possible 7 conditions: congestive heart failure (CHF), pneumonia, acute myocardial infarction (AMI), coronary artery bypass grafting (CABG), joint arthroplasty, spine surgery, and colectomy. Condition-based episodes of care were eligible for inclusion in the measure if the beneficiaries were insured by commercial BCBSM preferred provider organization or Medicare fee-for-service (FFS) plans. In 2016, hospitals selected 2 condition-based episodes for participation in the first 2 years of the P4P program. Program year (PY) 2018 compared episode spending in 2017 (performance year) vs 2015 (baseline year) and PY 2019 compared spending in 2018 vs 2016. Hospitals were provided with information on their baseline episode spending for all eligible condition-based episodes, including their current mean episode spending, the collaborative-wide mean episode spending, and how their hospital ranked relative to other hospitals in the collaborative. The year between the baseline and performance years was designed to allow complete data availability for baseline years before the beginning of each performance year and provide time for hospitals to implement improvement initiatives before and during the performance period.
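The timing described above can be sketched as a small helper. This is an illustrative sketch only: the name `program_year_windows` is hypothetical, and it assumes that the offsets stated for PY 2018 and PY 2019 (performance year 1 year before the program year, baseline year 2 years before the performance year) are the general rule.

```python
from typing import NamedTuple


class ProgramYear(NamedTuple):
    program_year: int
    baseline_year: int
    performance_year: int


def program_year_windows(program_year: int) -> ProgramYear:
    """Map a program year to its baseline and performance years.

    Assumption: the offsets described in the text for PY 2018 and PY 2019
    generalize to other program years.
    """
    performance_year = program_year - 1      # eg, PY 2018 -> 2017
    baseline_year = performance_year - 2     # eg, 2017 -> 2015 (1 gap year)
    return ProgramYear(program_year, baseline_year, performance_year)


for py in (2018, 2019):
    print(program_year_windows(py))
```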
Hospitals earned 0 to 5 points for each selected condition-based episode (eg, CHF, CABG), for a total of 10 points. The number of points earned was determined by comparing hospital mean 30-day episode spending during the performance year against its baseline year. Once the number of points a hospital earned was determined, scores were provided to BCBSM for incorporation into overall hospital P4P program scoring, and financial incentive payments were subsequently disbursed by BCBSM. Further description of the P4P program can be found in the eAppendix (available at ajmc.com).
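The general shape of this scoring can be illustrated with a short sketch. It is not the program's algorithm, which is described in the eAppendix. The thresholds here are hypothetical (1 improvement point per 1% reduction; achievement points scaled by the share of peer hospitals with higher spending), as is the choice to award the larger of the improvement and achievement scores, which is borrowed from Medicare's Hospital Value-Based Purchasing Program. All function names and dollar values are invented for illustration.

```python
def improvement_points(baseline, performance, pct_per_point=1.0):
    """Hypothetical rule: 1 point per `pct_per_point` percent reduction, capped at 5."""
    if baseline <= 0 or pct_per_point <= 0:
        raise ValueError("baseline and pct_per_point must be positive")
    pct_reduction = 100.0 * (baseline - performance) / baseline
    return max(0, min(5, int(pct_reduction // pct_per_point)))


def achievement_points(performance, peer_spending):
    """Hypothetical rule: scale the share of peers with higher spending to 0-5."""
    if not peer_spending:
        raise ValueError("peer group must not be empty")
    n_higher = sum(1 for s in peer_spending if s > performance)
    return (5 * n_higher) // len(peer_spending)


def condition_points(baseline, performance, peer_spending):
    """Assumption: a hospital receives the larger of the 2 scores for a condition."""
    return max(improvement_points(baseline, performance),
               achievement_points(performance, peer_spending))


def total_points(per_condition):
    """Sum the scores for the 2 selected conditions (0 to 10 points)."""
    if len(per_condition) != 2 or any(not 0 <= p <= 5 for p in per_condition):
        raise ValueError("expected 2 condition scores, each between 0 and 5")
    return sum(per_condition)


# Invented example: 1 hospital, 2 selected conditions.
chf = condition_points(20000, 19400, [21000, 20500, 19000, 22000])
joint = condition_points(25000, 25500, [26000, 27000, 25800, 26500, 25000])
print(chf, joint, total_points([chf, joint]))
```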
Data and Study Population
Data from the MVC administrative claims data registry, which contains facility and professional claims data for all Michigan residents insured by BCBSM and Medicare FFS, were used for this study. Administrative claims are arranged into episodes of care for specific hospitalizations and include all services provided during the index hospitalization triggering the episode of care and within 30 days of discharge. The MVC registry includes condition-based episodes of care for medical and surgical admissions, defined according to International Classification of Diseases, Ninth Revision and Tenth Revision diagnostic and procedural codes from facility claims and Current Procedural Terminology codes from professional claims.14 Details on the construction and definitions of MVC registry episodes of care can be found in the eAppendix. The study population included all condition-based 30-day episodes of care for CHF, pneumonia, AMI, CABG, joint arthroplasty, spine surgery, and colectomy. Episodes of care for these conditions were included in the analysis if the admission dates were between January 1, 2015, and December 31, 2018.
The primary outcome in this study was price-standardized, risk-adjusted 30-day episode spending. Hospital-level mean episode spending was used to evaluate hospital performance in the P4P program. Patient-level episode spending was used in the DID approach to evaluate the overall effectiveness of the program in reducing spending. Price standardization was achieved using previously established methods and the average Medicare fee schedule from all years.15,16 Price standardization was used to avoid differences in health care utilization among hospitals being attributed to differences in contractual amounts between payers and providers. Dollar amounts are regularly inflation adjusted to the most recent year in the MVC registry. Details on the price-standardization and risk-adjustment methods can be found in the eAppendix. The secondary outcome in this study was the P4P score earned by the hospital during the 2 PYs. A detailed description of the algorithms to determine points in the P4P program can be found in the eAppendix.
Patient-level covariates were drawn from administrative claims data and included age, gender, insurance type, total price-standardized spending in the 6 months prior to admission, and comorbidities defined according to Hierarchical Condition Categories (HCCs), including neurological, eye, diabetes, kidney, cerebrovascular, lung, vascular, cancer, and heart. The CHF and AMI models did not include additional adjustment for heart-related HCC comorbid conditions due to collinearity. Hospital characteristics were abstracted from American Hospital Association Annual Survey data sets (2015-2018), including teaching status, bed size, and core-based statistical area category. Hospital teaching status was categorized as nonteaching, minor teaching (certified residency program), and major teaching (certified residency program and a member of the Council of Teaching Hospitals and Health Systems). Hospital bed size was categorized as less than 100 beds, 100 to 299 beds, or 300 or more beds. The hospital core-based statistical area was used to categorize hospitals as rural, metropolitan, and micropolitan. Hospital fixed effects were also included in the model.
Hospital episode spending performance. The first analysis sought to evaluate hospital performance in the P4P program. Each of the 74 hospitals that participated in the P4P program during this period selected 2 condition-based episodes that would cover both PYs for a total sample of 148 selections. Hospital-level mean 30-day episode spending was estimated for the baseline and performance years, as was the absolute difference between baseline and performance years for each selected episode in PY 2018 and PY 2019. Generalized linear models were used to estimate the change (and 95% CI) in hospital-level price-standardized, risk-adjusted 30-day episode spending for all 148 selections, and the models were then stratified by each of the individual condition-based episodes. Changes in specific components of the overall episode of care for PY 2018 and PY 2019 were also examined, including payments for the index hospitalization, professional services, postacute care services, and readmissions.
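As a rough illustration of the quantity being estimated, the sketch below computes the mean paired change in hospital-level spending with a normal-approximation 95% CI. It is a simplified stand-in for the generalized linear models used in the study and applies no risk adjustment. The function name and the dollar values are hypothetical.

```python
import math
import statistics


def mean_change_with_ci(baseline, performance, z=1.96):
    """Mean paired change (performance - baseline) with an approximate 95% CI.

    Each position in the 2 sequences is 1 hospital-condition selection.
    Returns (mean_change, lower, upper).
    """
    if len(baseline) != len(performance) or len(baseline) < 2:
        raise ValueError("need 2 equal-length sequences with at least 2 items")
    diffs = [p - b for b, p in zip(baseline, performance)]
    mean = statistics.fmean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean, mean - z * se, mean + z * se


# Invented hospital-level mean episode spending (dollars).
baseline = [20000, 25000, 18000]
performance = [19900, 24800, 17700]
change, lower, upper = mean_change_with_ci(baseline, performance)
print(f"{change:.0f} (95% CI, {lower:.0f} to {upper:.0f})")
```

A CI that excludes 0 corresponds to a significant change at α = 0.05 under this approximation.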
Hospital P4P points. Univariate analyses described the distribution of P4P points earned by hospitals in PY 2018 and PY 2019. Bivariate analyses compared P4P points earned by hospital characteristics, including hospital teaching status, bed size category, and core-based statistical area location using analysis of variance.
Potential sources of success. Potential sources of success in the program were evaluated in several ways. First, hospitals may have been more likely to succeed if they had high baseline spending and thus more opportunity to reduce spending, or to regress to the mean.17 Hospitals were therefore categorized as having high baseline spending if they had mean episode spending higher than the median during the baseline period. Second, hospitals could have avoided high-spending episodes by targeting the payer or comorbidity mix of their patients.18 Binary variables were created to indicate whether a hospital decreased from baseline to performance year in the following areas: (1) the proportion of episodes attributed to Medicare beneficiaries, (2) the case mix index according to the mixture of diagnosis-related group hospitalization weights, and (3) the mean number of comorbidities per episode as defined by HCCs. Third, hospitals could have focused on reducing specific components of episode spending, including the index hospitalization, postdischarge care, professional services, or readmissions.3,4 For each component, hospitals were categorized as having reduced mean spending from baseline to performance year vs no reduction. Cochran-Armitage tests of trend compared potential sources of success according to performance categories of 0 to 1 point earned, 2 to 3 points earned, and 4 to 5 points earned.
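For readers unfamiliar with the trend test, a minimal sketch of the Cochran-Armitage statistic follows. It uses the standard normal approximation with equally spaced scores for the 3 ordered performance categories. The function name and counts are invented, and statistical packages may apply slightly different variance conventions.

```python
import math


def cochran_armitage_trend(successes, totals, scores=None):
    """Cochran-Armitage test for trend in proportions across ordered groups.

    successes[i] of totals[i] selections in ordered group i have the binary
    characteristic (eg, a decrease in case mix index).
    Returns (z, two_sided_p) from the normal approximation.
    """
    k = len(totals)
    if scores is None:
        scores = list(range(k))  # equally spaced scores 0, 1, 2, ...
    if len(successes) != k or len(scores) != k:
        raise ValueError("successes, totals, and scores must have equal length")
    n_total = sum(totals)
    p_bar = sum(successes) / n_total  # pooled proportion under the null
    t_stat = sum(s * (r - n * p_bar) for s, r, n in zip(scores, successes, totals))
    s1 = sum(n * s for n, s in zip(totals, scores))
    s2 = sum(n * s * s for n, s in zip(totals, scores))
    variance = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n_total)
    if variance <= 0:
        raise ValueError("trend variance is zero; the test is undefined")
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return z, p_value


# Invented counts by category: 0 to 1, 2 to 3, and 4 to 5 points earned.
z, p = cochran_armitage_trend(successes=[2, 5, 9], totals=[10, 10, 10])
print(f"z = {z:.2f}, P = {p:.4f}")
```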
Effectiveness of the P4P program. A DID approach was used to evaluate the effect of the BCBSM incentive program on patient-level 30-day price-standardized total episode payments. The treatment group included patient-level episodes for conditions selected by their admitting hospital, and the control group included patient-level episodes for conditions not selected by their admitting hospital. Generalized linear models were created with DID terms for time (pre- vs postimplementation time periods), P4P condition-based episode selection (treatment vs control group), and an interaction between time and P4P selection (the DID estimator). Risk adjustment was done using an inverse probability of treatment weighting (IPTW) method using patient-level covariates and hospital fixed effects. From these models, the DID estimator of the incentive program was estimated for each of the 7 condition-based episodes, and 95% CIs and P values for these estimators were calculated. Parallel trends in spending before implementation of the program were tested based on an interaction term between time in months and P4P selection (ie, treatment vs control group) limited to the preimplementation period using a generalized linear model for each condition. Models were adjusted for patient and hospital factors and included hospital fixed effects. Sensitivity analyses included using individual covariate adjustment instead of IPTW and separating the postimplementation period into 2 individual periods by year (ie, preimplementation vs 2017 and 2018). Statistical significance was set at α = 0.05. Analyses were conducted using SAS version 9.4 (SAS Institute) and Stata version 15.0 (StataCorp).
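The logic of the DID estimator can be sketched from the 4 group-by-period means. In a model with only the time, selection, and interaction terms, the interaction coefficient equals this contrast. The sketch below is a simplified illustration rather than the study's model: it omits covariates and hospital fixed effects, accepts optional weights in the spirit of IPTW, and uses hypothetical function names and invented spending values.

```python
def weighted_mean(values, weights=None):
    """Weighted mean; equal weights are used if none are supplied."""
    if weights is None:
        weights = [1.0] * len(values)
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be nonempty and equal length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def did_estimate(cells, weights=None):
    """Difference-in-differences from 4 (group, period) cells.

    cells maps ('treated' | 'control', 'pre' | 'post') to a list of
    patient-level episode spending; weights optionally maps the same keys
    to per-episode weights (eg, inverse probability of treatment weights).
    """
    weights = weights or {}
    m = {key: weighted_mean(vals, weights.get(key)) for key, vals in cells.items()}
    treated_change = m[("treated", "post")] - m[("treated", "pre")]
    control_change = m[("control", "post")] - m[("control", "pre")]
    return treated_change - control_change


# Invented patient-level 30-day episode spending (dollars).
cells = {
    ("treated", "pre"): [20000, 22000],
    ("treated", "post"): [19000, 21000],
    ("control", "pre"): [18000, 20000],
    ("control", "post"): [17500, 19500],
}
print(did_estimate(cells))
```

A negative estimate indicates that spending fell more (or rose less) for selected conditions than for nonselected conditions. The estimate has a causal interpretation only if the parallel trends assumption is plausible.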
Hospital Episode Spending Performance
There were 74 hospitals participating in the BCBSM P4P initiative for both PY 2018 and PY 2019, selecting a total of 148 condition-based episodes for the program. The most frequently chosen condition-based episode was CHF (n = 51), followed by joint arthroplasty (n = 44), pneumonia (n = 18), AMI (n = 13), CABG (n = 9), spine surgery (n = 7), and colectomy (n = 6). Histograms illustrating the distribution of changes in hospital-level mean overall episode spending can be found in Figure 1. Hospital-level mean episode spending declined significantly from baseline to performance years in PY 2018 (–$671; 95% CI, –$1113 to –$230) but was not significantly different for PY 2019. Stratified by condition-based episode, in PY 2018, there were significant decreases in mean total episode spending for AMI (–$1854; 95% CI, –$3335 to –$373), colectomy (–$4443; 95% CI, –$8542 to –$345), and joint arthroplasty (–$1214; 95% CI, –$1825 to –$603), but there were no significant changes for CHF, pneumonia, CABG, or spine surgery. In PY 2019, there was a significant increase in total episode spending for CHF as compared with 2017 ($858; 95% CI, $388-$1328) but no significant changes for the other conditions. Baseline, performance, and changes in episode spending for PY 2018 and PY 2019 stratified by condition-based episode can be found in eAppendix Table 1. Changes in component episode spending for PY 2018 and PY 2019 can be found in Figure 2. Declines in spending for postdischarge care occurred across most conditions in PY 2018. In PY 2019, most condition-based episodes saw increases in spending for index admissions.
Hospital P4P Points
Hospitals earned a mean (SD) total of 6.3 (3.1) points in PY 2018 and 4.5 (2.9) points in PY 2019 out of 10 points possible (Table 1). The distribution of overall P4P points earned for both PYs can be found in eAppendix Figures 1 and 2. There were significant differences in the number of points earned across conditions in both PY 2018 and PY 2019. There was no significant association between hospital characteristics and P4P points earned in PY 2018. However, in PY 2019, rural hospitals earned significantly fewer points than either metropolitan or micropolitan hospitals (P = .014).
Potential Sources of Success
Table 2 evaluates the hypothesized explanations for differential performance in P4P scoring. High baseline spending was not associated with higher P4P scoring in PY 2018 but was significantly more common among high-scoring hospitals in PY 2019. In both PY 2018 and PY 2019, there were significant differences in case mix index changes, associated with greater likelihood of achieving high P4P scores. The highest-scoring hospitals were more likely to exhibit significant decreases in spending for all components of the episode (index admission, professional services, postdischarge care, and readmissions).
Effectiveness of the Program
The results of the patient-level DID analysis are shown in Table 3. No nonparallel trends in total episode spending were evident during the preintervention period in any of the service lines, as none of the interaction terms between time and treatment group were statistically significant (eAppendix Table 2). There were no significant overall mean differences in the trends in 30-day episode spending from baseline to performance years, comparing condition-based episodes that were selected for P4P against those not selected. Using individual covariate adjustment instead of IPTW yielded similar results (eAppendix Table 3). When the postimplementation period was split into 2 separate time periods, significant DID effects were found for joint arthroplasty and CHF for the 2017 time period, and for AMI for the 2018 time period, but no other significant effects were found (eAppendix Tables 4 and 5).
After introduction of a statewide, commercial episode-based spending incentive, there were significant reductions in overall spending in the first year of the program, but not the second. However, even the savings that were observed did not seem uniquely attributable to the incentives, as there was no observable association between hospitals’ spending changes and the condition-based episodes they selected for participation.5,19 As previously reported in studies of pilot federal bundled payment programs, most hospitals reduced their spending on postacute care after hospitalization in this period. However, savings in the index admission, professional billing, and readmissions more often differentiated those that earned the highest achievement points in this P4P metric.
Hospital responses to commercial insurers’ incentives may differ from those of federal bundled payment programs for a variety of reasons. With younger, healthier patients, differences in regulatory requirements, contracting, market pressures, and care networks, commercial insurers’ episode spending patterns are very different from those of Medicare.11 There may be far wider discretion to control postacute care choices and incentives, and the burden of spending on hospital care may be concentrated in different admitting conditions. Hospitals might achieve even greater savings in commercial programs, as they can narrow postacute provider networks, steer patients to preferred sites, or align member incentives with spending goals. Hospitals involved in both commercial and federal programs may realize synergies or spillovers between their strategies for each, or they might incur unintended consequences through incentivizing low-acuity admissions and avoidance of patients with greater disease burden or socioeconomic stress. Although early findings from federal episode-based payment programs have suggested opportunities for savings, especially in postacute care,5,19 it was not known how these experiences would translate to this commercial P4P metric in Michigan.
The findings in this study offer several novel insights for emerging commercial payment incentives. First, savings in this program were inconsistent with reports from federal programs, including the Medicare Bundled Payments for Care Improvement (BPCI) initiative.5,19 For many hospitals, commercial programs are smaller in magnitude than those that affect reimbursements for Medicare, and perhaps this incentive did not command the same level of attention and effort toward achievement. Second, the significant overall savings seen in the first year of the program did not persist in the second. Although this could be due to regression to the mean, it is also possible that the baseline year for second-year performance comparisons already reflected some value-focused practice changes, masking savings that persisted through the performance period. In fact, hospitals that scored highest in the second year were more likely to have had high baseline spending in the year before the program, which has also been observed in Medicare’s BPCI initiative.17 It may also be that improvements became harder to make as the program progressed, although evidence supporting a “ceiling effect” in P4P programs has been mixed.20-22 Third, although many hospitals in bundled payment programs have focused on reducing potentially discretionary postacute care, the highest-scoring hospitals were more likely to have achieved savings not only in postacute care but also across the rest of the episode. Fourth, this program was mandatory, but hospitals did get the opportunity to select the 2 condition-based episodes on which they would be measured.
In joint replacement bundles, there has been no clear difference in the spending outcomes of mandatory and voluntary participation,19 but wide variation in participants’ performance may be obscuring important lessons about the behavior of hospitals and providers in these programs.23 The lack of significant overall effect seen in the DID analyses suggests that hospitals were not able to succeed based on episode selection alone. Fifth, more savings have been identified in bundled payment programs for surgical admissions than for medical admissions in Medicare.24 However, such a pattern was not observed in this study, as there was no consistent difference by the type of episode.
Although the overall independent effect of this P4P program on episode spending appears minimal, there may be much to be learned from those hospitals that achieved significant savings. It has been observed that the overall average effect of P4P programs may obscure significant variation in performance within individual actors, and that not all institutions respond to these incentives strategically.23 In evaluating what have been commonly proposed mechanisms for success in episode-based incentives, however, no specific evidence supported gaming approaches or regression to the mean as the sources for apparent savings.25-28 Highest point-earning hospitals were more likely to have lower mean case mix index in the performance years, suggesting some decrease in patient complexity and risk. Ongoing explanatory qualitative studies are being conducted to fully understand the approaches that hospitals undertook and the strategies most associated with success among participating institutions.
Several limitations should be considered when interpreting these findings. First, changes in hospital-level spending are limited to pre- vs postimplementation comparisons without a concurrent, external control group. Thus, external or time-varying effects such as contemporaneous payment reform efforts may confound these findings. Patient-level DID analysis in hospitals that selected a condition-based episode compared with a concurrent control group sought to mitigate this limitation. Second, this study sought to define sources of success but was limited to observations from administrative data. Subsequent quantitative or qualitative studies among high- and low-performing hospitals may better identify and explain success in the P4P program. Third, spending data in the study represent price-standardized dollars instead of actual dollars, which limits the study’s ability to understand the actual realized costs or savings associated with the program. Further, this study did not have access to paid amounts from BCBSM, as those are contractually negotiated between hospitals and BCBSM. However, price-standardized dollars allow for direct comparisons of changes in health care utilization independent of concurrent changes in negotiated prices between providers and BCBSM. Finally, although the findings from this statewide P4P program, involving Michigan’s largest commercial insurer, offer insights for commercial incentives in other regions and with other payers, it is unknown how generalizable the experience may be to other states.
In an early experience of a commercial payer episode-based incentive program in Michigan, there was little evidence for systematic reductions in overall spending. However, the hospitals that achieved the greatest savings and incentives in the program provide insights into the ongoing design of hospital P4P metrics.
Author Affiliations: Michigan Value Collaborative (MPT, MLYK, CAP, JMY, JDS, HN, ECN, SER), Ann Arbor, MI; Department of Cardiac Surgery (MPT) and Department of Surgery (AHC-N, MLYK, CLA, JMY, JDS, HN, SER), Michigan Medicine, Ann Arbor, MI; Department of Health Management and Policy, School of Public Health (ECN) and Department of Economics (ECN), University of Michigan, Ann Arbor, MI.
Source of Funding: Donaghue Foundation Greater Value Portfolio, National Institute on Aging (K08-AG047252), and Blue Cross Blue Shield of Michigan (BCBSM) Value Partnerships Program.
Author Disclosures: Dr Thompson received partial salary support for his role as codirector of the Michigan Value Collaborative, which is funded by BCBSM. Ms Cain-Nielsen received salary support from BCBSM/Blue Care Network and the Michigan Department of Health and Human Services through grant funding of the Michigan Trauma Quality Improvement Program. Mr Syrjamaki, Ms Yaser, and Ms Yost Karslake are current employees of BCBSM but were not when working on this manuscript. The remaining authors report no relationship or financial interest with any entity that would pose a conflict of interest with the subject matter of this article.
Authorship Information: Concept and design (MPT, JDS, HN, ECN, SER); acquisition of data (MPT, MLYK, JMY, JDS); analysis and interpretation of data (MPT, AHC-N, MLYK, CAP, JMY, JDS, ECN, SER); drafting of the manuscript (MPT, SER); critical revision of the manuscript for important intellectual content (MPT, AHC-N, CAP, JMY, JDS, HN, ECN, SER); statistical analysis (MPT, AHC-N, MLYK, CAP, JDS, ECN); obtaining funding (SER); administrative, technical, or logistic support (MPT, JDS, HN); and supervision (JDS, SER).
Address Correspondence to: Michael P. Thompson, PhD, Michigan Medicine, 5331K Frankel Cardiovascular Center, 1500 E Medical Center Dr, SPC 5864, Ann Arbor, MI 48109. Email: email@example.com.
1. Buntin MB, Colla CH, Escarce JJ. Effects of payment changes on trends in post-acute care. Health Serv Res. 2009;44(4):1188-1210. doi:10.1111/j.1475-6773.2009.00968.x
2. Ackerly DC, Grabowski DC. Post-acute care reform—beyond the ACA. N Engl J Med. 2014;370(8):689-691. doi:10.1056/NEJMp1315350
3. McWilliams J, Gilstrap LG, Stevenson DG, Chernew ME, Huskamp HA, Grabowski DC. Changes in postacute care in the Medicare Shared Savings Program. JAMA Intern Med. 2017;177(4):518-526. doi:10.1001/jamainternmed.2016.9115
4. Dummit LA, Kahvecioglu D, Marrufo G, et al. Association between hospital participation in a Medicare bundled payment initiative and payments and quality outcomes for lower extremity joint replacement episodes. JAMA. 2016;316(12):1267-1278. doi:10.1001/jama.2016.12717
5. Finkelstein A, Ji Y, Mahoney N, Skinner J. Mandatory Medicare bundled payment program for lower extremity joint replacement and discharge to institutional postacute care: interim analysis of the first year of a 5-year randomized trial. JAMA. 2018;320(9):892-900. doi:10.1001/jama.2018.12346
6. Norton EC, Li J, Das A, Chen LM. Moneyball in Medicare. J Health Econ. 2018;61:259-273. doi:10.1016/j.jhealeco.2017.07.006
7. Dyrda L. Blue Cross Blue Shield of Michigan launches total joint bundled payment program with 64 surgeons: 4 key points. Becker’s Spine Review. March 22, 2018. Accessed December 1, 2021. https://www.beckersspine.com/orthopedic-spine-practices-improving-profits/item/40417-blue-cross-blue-shield-of-michigan-launches-total-joint-bundled-payment-program-with-64-surgeons-4-key-points.html
8. Butcher L. Bundled payment for bundles of joy. Managed Care. February 2018. Accessed December 1, 2021. https://lsc-pagepro.mydigitalpublication.com/publication/?i=470773&article_id=2994311&view=articleBrowser
9. Tyler DA, McHugh JP, Shield RR, Winblad U, Gadbois EA, Mor V. Challenges and consequences of reduced skilled nursing facility lengths of stay. Health Serv Res. 2018;53(6):4848-4862. doi:10.1111/1475-6773.12987
10. Huckfeldt PJ, Escarce JJ, Rabideau B, Karaca-Mandic P, Sood N. Less intense postacute care, better outcomes for enrollees in Medicare Advantage than those in fee-for-service. Health Aff (Millwood). 2017;36(1):91-100. doi:10.1377/hlthaff.2016.1027
11. Regenbogen SE, Cain-Nielsen AH, Syrjamaki JD, Chen LM, Norton EC. Spending on postacute care after hospitalization in commercial insurance and Medicare around age sixty-five. Health Aff (Millwood). 2019;38(9):1505-1513. doi:10.1377/hlthaff.2018.05445
12. The Hospital Value-Based Purchasing (VBP) program. CMS. Updated December 1, 2021. Accessed February 1, 2022. https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/Value-Based-Programs/HVBP/Hospital-Value-Based-Purchasing
13. Collaborative quality initiatives. Value Partnerships. Accessed December 1, 2021. https://www.valuepartnerships.com/programs/collaborative-quality-initiatives/
14. Ellimoottil C, Syrjamaki JD, Voit B, Guduguntla V, Miller DC, Dupree JM. Validation of a claims-based algorithm to characterize episodes of care. Am J Manag Care. 2017;23(11):e382-e386.
15. Gottlieb DJ, Zhou W, Song Y, Andrews KG, Skinner JS, Sutherland JM. Prices don’t drive regional Medicare spending variations. Health Aff (Millwood). 2010;29(3):537-543. doi:10.1377/hlthaff.2009.0609
16. Birkmeyer JD, Gust C, Baser O, Dimick JB, Sutherland JM, Skinner JS. Medicare payments for common inpatient procedures: implications for episode-based payment bundling. Health Serv Res. 2010;45(6, pt 1):1783-1795. doi:10.1111/j.1475-6773.2010.01150.x
17. Berlin NL, Gulseren B, Nuliyalu U, Ryan AM. Target prices influence hospital participation and shared savings in Medicare bundled payment program. Health Aff (Millwood). 2020;39(9):1479-1485. doi:10.1377/hlthaff.2020.00104
18. Navathe AS, Liao JM, Dykstra SE, et al. Association of hospital participation in a Medicare bundled payment program with volume and case mix of lower extremity joint replacement episodes. JAMA. 2018;320(9):901-910. doi:10.1001/jama.2018.12345
19. Liao JM, Gupta A, Zhao Y, et al. Association between hospital voluntary participation, mandatory participation, or nonparticipation in bundled payments and Medicare episodic spending for hip and knee replacements. JAMA. 2021;326(5):438-440. doi:10.1001/jama.2021.10046
20. Lindenauer PK, Remus D, Roman S, et al. Public reporting and pay for performance in hospital quality improvement. N Engl J Med. 2007;356(5):486-496. doi:10.1056/NEJMsa064964
21. Ryan AM, Burgess JF Jr, Pesko MF, Borden WB, Dimick JB. The early effects of Medicare’s mandatory hospital pay-for-performance program. Health Serv Res. 2015;50(1):81-97. doi:10.1111/1475-6773.12206
22. Sutton M, Nikolova S, Boaden R, Lester H, McDonald R, Roland M. Reduced mortality with hospital pay for performance in England. N Engl J Med. 2012;367(19):1821-1828. doi:10.1056/NEJMsa1114951
23. Markovitz AA, Ryan AM. Pay-for-performance: disappointing results or masked heterogeneity? Med Care Res Rev. 2017;74(1):3-78. doi:10.1177/1077558715619282
24. Joynt Maddox KE, Orav EJ, Zheng J, Epstein AM. Evaluation of Medicare’s bundled payments initiative for medical conditions. N Engl J Med. 2018;379(3):260-269. doi:10.1056/NEJMsa1801569
25. Li Y, Ying M, Cai X, Thirukumaran CP. Association of mandatory bundled payments for joint replacement with postacute care outcomes among Medicare and Medicaid dual eligible patients. Med Care. 2021;59(2):101-110. doi:10.1097/MLR.0000000000001473
26. Ryan AM. Will value-based purchasing increase disparities in care? N Engl J Med. 2013;369(26):2472-2474. doi:10.1056/NEJMp1312654
27. Barnett ML, Mehrotra A, Grabowski DC. Postacute care—the piggy bank for savings in alternative payment models? N Engl J Med. 2019;381(4):302-303. doi:10.1056/NEJMp1901896
28. Jha AK. Value-based purchasing: time for reboot or time to move on? JAMA. 2017;317(11):1107-1108. doi:10.1001/jama.2017.1170