Value-Based Payment in Implementing Evidence-Based Care: The Mental Health Integration Program in Washington State
Yuhua Bao, PhD; Thomas G. McGuire, PhD; Ya-Fen Chan, PhD; Ashley A. Eggman, MS; Andrew M. Ryan, PhD; Martha L. Bruce, PhD, MPH; Harold Alan Pincus, MD; Erin Hafer, MPH; and Jürgen Unützer, MD, MPH, MA
Despite extraordinary increases in medical knowledge, healthcare in the United States frequently falls short of evidence-based standards.1 Implementation fidelity is defined as the degree to which interventions or programs are implemented as intended by the program developers2; and poor implementation fidelity is one explanation of why the promise of evidence-based medicine remains unfulfilled.3,4 Substantial variation among providers exists in the intensity of implementation and degree of fidelity to the evidence base,5-8 and current financial incentives in the US healthcare system contribute to a poor “business case” to adopt evidence-based practices in an effective manner.9 Targeted financial incentives have the potential to improve fidelity and improve implementation effectiveness.
Value-based payment (VBP), a form of pay for performance, incentivizes quality and outcomes of care by tying payments to providers to predefined quality or efficiency targets. VBP has been widely adopted by private and public payers, with recent examples including Medicare’s launch of the hospital VBP program in 201310 and the HHS secretary's announcement of measurable goals and a timeline to move the healthcare system at large toward quality-based payment.11 Most existing VBP programs provide standalone financial incentives without a support system to help providers redesign care processes. VBP programs that are embedded in evidence-based care and designed to improve implementation effectiveness are rare12-15 and have not been extensively studied.
In this study, we assessed whether a VBP component could improve the effectiveness of Collaborative Care Model (CCM) implementation. The CCM is a team-based approach to treating depression and other common behavioral health conditions in primary care,16-18 in which the team includes a primary care physician, a care manager, and a consulting psychiatrist. The key principles of the model include systematic follow-up on patients by the care manager, measurement-based care that uses symptom-rating scales to track clinical improvements or identify patients not improving,19 and “stepped care,”20 in which treatment is systematically adjusted or intensified by the primary care team (with input from the consulting psychiatrist) for patients not improving.21 CCM implementation has gained momentum: in its proposed rules for the 2016 Physician Fee Schedule, CMS stated an intention to modify current payment to cover the CCM.22
We conducted our assessment in the context of the Mental Health Integration Program (MHIP) of Washington state.23 The MHIP is an ongoing, publicly funded implementation of CCM in a diverse network of community health clinics across Washington. Started in 2008 in the 2 most urban counties in the state, it is now statewide and has served over 35,000 individuals. The VBP component of the MHIP payment started in 2009 in response to substantial variation in quality of care and patient outcomes seen in the first year. It adopted some best practice design features of VBP24-26: close mapping of quality measures with key elements of the evidence-based model, substantial incentive payment size (25% of total payments to providers), and a dynamic set of quality measures and targets that are adjusted over time. The MHIP thus offers a unique opportunity to assess VBP's role in improving implementation effectiveness of evidence-based care in community settings.
We hypothesized that MHIP VBP improved fidelity to the CCM, as measured by key process-of-care elements of the model, both those directly incentivized and those not explicitly incentivized by the VBP. We further hypothesized that MHIP VBP improved patient depression outcomes. Provider organizations with a larger patient panel have more at stake under a VBP scheme and greater resources to invest in quality improvement in response to VBP.27 Providers with a lower level of performance at baseline have more room and motivation to improve.28-30 We thus hypothesized that the effect of VBP on implementation fidelity was greater among clinics with a larger MHIP patient caseload and among clinics with a lower level of fidelity prior to VBP.
METHODS
Study Period, Population, and Data
This study focused on phase 1 of the MHIP VBP. Phase 1 (covering year 2009) used 4 process-of-care targets mapping closely with the principles of the CCM (Table 1). Provider organizations would initially receive 75% of their total payment for the CCM (ie, with 25% holdback) and receive 5% for achieving each of the 4 targets in a calendar quarter, with an additional 5% being awarded for participation. Adjustments to the VBP scheme were made in subsequent phases by raising benchmarks for existing targets, eliminating targets that had been achieved by most provider organizations, and/or adding new targets to address emerging gaps in quality, thus providing incentives for continuous improvement.
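The phase 1 payment arithmetic described above can be sketched as follows. This is a minimal illustration, not the program's actual payment logic: the function name and arguments are hypothetical, and treating the participation award as a simple yes/no condition is an assumption.

```python
def phase1_payment_percent(targets_met: int, participated: bool = True) -> int:
    """Percent of the total CCM payment earned in one calendar quarter.

    Illustrative sketch: 75% is paid initially (25% holdback), 5% is
    released for each of the 4 targets achieved, and 5% for participation.
    """
    if not 0 <= targets_met <= 4:
        raise ValueError("targets_met must be between 0 and 4")
    base_percent = 75          # initial payment (25% holdback)
    per_target_percent = 5     # released for each target achieved
    participation_percent = 5  # released for participation (assumed yes/no)
    earned = base_percent + per_target_percent * targets_met
    if participated:
        earned += participation_percent
    return earned
```

Under these assumptions, an organization that participates and meets all 4 targets recovers the full holdback, whereas one that participates but meets 2 targets receives 90% of its total payment.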
Our study population comprised patients 18 years or older who initiated care in the MHIP between January 1, 2008, and June 30, 2009, in 1 of the 35 community health clinics that started MHIP implementation in 2008; these clinics therefore had experience with MHIP both before and after the launch of phase 1 VBP. We restricted patient enrollment in the MHIP to June 30, 2009, to ensure that the first 6 months of care for every patient enrolled in 2009 were under the influence of phase 1 (not phase 2) of VBP. All 35 clinics were located in King and Pierce counties, the 2 most populous counties in Washington state, and were affiliated with 7 community health centers. These community health centers were the parent organizations of the clinics and were Federally Qualified Health Centers. The populations they served were primarily patients covered by Medicaid or other state-funded programs and patients who were uninsured. Clinical social workers, psychologists, licensed mental health counselors, and other clinicians on staff at the clinics served as the CCM care managers in the MHIP.
Patient inclusion criteria were a baseline Patient Health Questionnaire (PHQ-9; possible range, 0-27)31 score of 10 or greater, indicating clinically significant depression, and at least 1 follow-up contact with the MHIP care manager within 24 weeks of the initial contact in order to allow at least 1 chance to assess depression outcomes. Patients whose last contact with MHIP occurred within 1 week of the first contact were further excluded, as they likely were determined ineligible for MHIP. The vast majority of the MHIP patients during our study period were enrollees in Washington state’s Disability Lifeline Program. These patients were temporarily disabled because of a physical or mental health condition and expected to be unemployed for 90 days or more. King County extended eligibility to additional patient populations, including low-income mothers and their children, low-income older adults, uninsured individuals, veterans, and veterans’ family members.
Our data were from the Web-based registry32 used by all MHIP participating clinics to systematically document care management activities and clinical outcomes and to assist with population management.
Three dichotomous measures at the patient-month level captured fidelity to major domains of the CCM (Table 1). “At least 1 follow-up contact with the care manager” reflects the principle of systematic follow-up; “at least 1 psychiatric consultation” reflects the principle of stepped care—the idea that treatment should be systematically changed or intensified for patients not responding to initial treatment. An important mechanism by which stepped care is operationalized in CCM is through consultation with a mental health specialist (usually a psychiatrist) for potential treatment changes. These 2 measures were closely related to the 2 quality targets in phase 1 of MHIP VBP (Table 1). “At least 1 PHQ-9 assessment” reflects the principle of measurement-based care, whereby treatment teams use symptom rating scales to systematically track clinical improvements or lack thereof. This measure was not explicitly incentivized in MHIP VBP. Data on current medications (also documented in the MHIP registry) are not available for research at this point. We therefore had no fidelity measure mapping the fourth VBP measure, documentation of current psychiatric medication in registry for 75% of cases (Table 1). Each fidelity measure was assessed for each 4-week interval starting from the patient’s initial contact with the MHIP care manager, up to 24 weeks or until the patient’s last contact with the care manager, whichever occurred first.
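The construction of the patient-month fidelity measures can be sketched as follows. This is a hedged illustration only: the function and variable names are hypothetical, the half-open interval boundaries are an assumption, and the event dates passed in are assumed to exclude the initial contact itself.

```python
from datetime import date, timedelta

INTERVAL = timedelta(weeks=4)  # one "patient-month"
MAX_INTERVALS = 6              # follow-up is capped at 24 weeks


def monthly_indicators(first_contact, last_contact, event_dates):
    """Return one 0/1 flag per 4-week interval: 1 if any event falls in it.

    event_dates holds the dates of one type of event (follow-up contacts,
    psychiatric consultations, or PHQ-9 assessments), excluding the
    initial contact. Intervals are [start, start + 4 weeks), an assumption.
    """
    flags = []
    for k in range(MAX_INTERVALS):
        start = first_contact + k * INTERVAL
        end = start + INTERVAL
        if start > last_contact:
            break  # the episode ended before this interval began
        flags.append(int(any(start <= d < end for d in event_dates)))
    return flags
```

For a patient first seen on November 3, 2008, with follow-up contacts on November 17, 2008, and January 20, 2009 (the last contact), the function yields three patient-months with indicators 1, 0, and 1.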
We tested hypotheses about 2 potential modifiers of the effects of VBP on fidelity: size of the MHIP patient panel, measured by the cumulative number of patients treated at the clinic prior to phase 1 VBP (ie, in 2008); and clinic-level fidelity at baseline, measured by the average count of follow-up contacts, psychiatric consultations, or PHQ-9 assessments over the first month of care for all patients treated at the clinic in 2008.
A clinically significant improvement in depression was defined as achieving a follow-up PHQ-9 score under 10 or achieving at least a 50% reduction in the PHQ-9 score31 within 24 weeks of initial care manager contact.
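The outcome definition can be expressed as a simple rule. The sketch below is illustrative, with a hypothetical function name; it assumes that the 50% reduction is measured relative to the baseline PHQ-9 score and that the follow-up score was recorded within the 24-week window.

```python
def clinically_improved(baseline_phq9: int, followup_phq9: int) -> bool:
    """True if a follow-up PHQ-9 score meets the improvement definition.

    Improvement = follow-up score under 10, or a reduction of at least
    50% from the baseline score (assumed reference point).
    """
    for score in (baseline_phq9, followup_phq9):
        if not 0 <= score <= 27:
            raise ValueError("PHQ-9 scores range from 0 to 27")
    below_threshold = followup_phq9 < 10
    at_least_halved = 2 * followup_phq9 <= baseline_phq9
    return below_threshold or at_least_halved
```

For example, a patient whose score falls from 22 to 11 meets the definition through the 50% reduction even though the follow-up score is still 10 or above.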
The rolling enrollment of patients in the MHIP throughout our study period created a natural experiment in the sense that exposure of patient episodes of care to VBP resembled random assignment. We compared the fidelity outcomes for patient-months exposed to VBP with patient-months not exposed to VBP. We estimated a multilevel linear probability model for each fidelity outcome with a random intercept at the patient level to account for clustering of months of the same patient. The key independent variable was a dichotomous indicator of VBP exposure, defined as 1 if the index patient-month started after January 1, 2009 (when phase 1 VBP took effect), and 0 otherwise. Because care management activities were more intensive in the early months of a patient’s treatment episode,33 our adjusted analysis controlled for a set of dummy variables indicating whether the index month was the patient’s first through sixth month in MHIP. We conducted several sensitivity analyses. We restricted the sample to Disability Lifeline Program enrollees, who accounted for 85% of the entire sample. We also estimated the linear probability model with a fixed effect for each patient, first with the entire sample (ie, regardless of whether a patient had exposure to VBP), then restricting to patients who contributed months both pre- and post-VBP during the first 24 weeks of care. The latter approach allowed us to examine the effects of VBP on treatment fidelity within the same patient, but was conducted with a much smaller sample size (about one-third of the original sample) and was thus subject to lower precision in estimates. We also estimated the logistic version of each model and compared implications (eg, marginal effect of VBP) based on both sets of analyses.
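The exposure coding can be illustrated with a short sketch that assigns each patient-month its position in the episode and its VBP indicator. The names are hypothetical, and treating a month that starts exactly on January 1, 2009, as exposed is an assumption.

```python
from datetime import date, timedelta

VBP_START = date(2009, 1, 1)  # phase 1 VBP took effect


def patient_month_rows(first_contact, n_months):
    """Build one row per 4-week patient-month with its VBP exposure flag."""
    rows = []
    for month in range(1, n_months + 1):
        month_start = first_contact + timedelta(weeks=4 * (month - 1))
        rows.append({
            "month": month,  # 1 through 6; source of the month dummies
            "start": month_start,
            # Assumption: a month starting on the effective date counts as exposed.
            "vbp": int(month_start >= VBP_START),
        })
    return rows
```

A patient first seen on November 10, 2008, contributes unexposed months starting November 10 and December 8 and an exposed month starting January 5, 2009, so the same patient can appear on both sides of the comparison.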
Hypotheses regarding the modifying effects were tested by adding interaction terms between the VBP indicator and the hypothesized modifier to the models. We included an additional interaction between VBP and the quadratic term of the modifier (eg, cumulative caseload squared) to allow for nonlinear effects. The VBP incentives were directed at the community health centers (also known as “implementation sites” in MHIP)—7 in total in our study sample; each site had multiple clinics. We conducted sensitivity analysis where hypothesized modifiers (caseload and baseline fidelity) were measured at the site level.
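One way to write the interaction model is given below. This is a sketch of a plausible specification rather than the exact estimated equation; the notation is introduced here for illustration. $Y_{ijt}$ is the fidelity indicator for patient $i$ at clinic $j$ in month $t$, $M_j$ is the hypothesized modifier (caseload or baseline fidelity), $\mathbf{x}_i$ collects patient covariates, $\alpha_j$ is a clinic fixed effect, and $u_i$ is the patient random intercept. The main effect of $M_j$ is assumed to be absorbed by $\alpha_j$ because the modifier does not vary within a clinic.

```latex
\begin{aligned}
E\left[Y_{ijt} \mid \cdot\right] &= \beta_0 + \beta_1\,\mathrm{VBP}_{it}
  + \beta_2\,\bigl(\mathrm{VBP}_{it} \times M_j\bigr)
  + \beta_3\,\bigl(\mathrm{VBP}_{it} \times M_j^{2}\bigr) \\
&\quad + \sum_{m=2}^{6} \gamma_m\,\mathbf{1}[t = m]
  + \mathbf{x}_i^{\top}\boldsymbol{\delta} + \alpha_j + u_i, \\[4pt]
\text{marginal effect of VBP at } M_j &= \beta_1 + \beta_2\,M_j + \beta_3\,M_j^{2}.
\end{aligned}
```

Setting $\beta_2 = \beta_3 = 0$ recovers a model with a single VBP effect. With the interaction terms, the marginal effect is a quadratic function of the modifier, which allows the effect to rise or fall nonlinearly across clinics.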
To assess the association between VBP and depression outcomes, we estimated an extended Cox proportional hazards model of time to improvement in depression, censored at 24 weeks after the initial assessment/contact or at the patient’s last contact with the MHIP care manager, whichever occurred first. The key independent variable was an indicator of VBP exposure (ie, 1 after January 1, 2009, and 0 otherwise). This indicator varied over time (switched from 0 to 1) for patients who enrolled in MHIP in 2008 but whose observation period ended in 2009. We conducted tests of the proportional hazards assumption based on the Schoenfeld residuals.34
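A time-varying exposure of this kind is commonly handled by splitting each patient's follow-up into intervals at the date the exposure switches (a counting-process layout). The sketch below illustrates that data preparation step only. The names are hypothetical, and it assumes that the end date passed in has already been set to the earliest of improvement, 24 weeks, or last contact.

```python
from datetime import date

VBP_START = date(2009, 1, 1)  # exposure switches from 0 to 1 on this date


def split_episode(first_contact, end_date, improved):
    """Split one episode into (start_day, stop_day, vbp, event) rows.

    Days are counted from first contact. The event flag is attached
    only to the final interval of the episode.
    """
    total_days = (end_date - first_contact).days
    switch_day = (VBP_START - first_contact).days
    if switch_day <= 0:
        # Enrolled after VBP took effect: exposed throughout.
        return [(0, total_days, 1, int(improved))]
    if switch_day >= total_days:
        # Episode ended before VBP took effect: never exposed.
        return [(0, total_days, 0, int(improved))]
    # Episode spans the start of VBP: one unexposed and one exposed interval.
    return [(0, switch_day, 0, 0), (switch_day, total_days, 1, int(improved))]
```

A patient enrolled on November 1, 2008, who improved on February 1, 2009, contributes 61 unexposed days without an event and 31 exposed days ending in an event.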
Adjusted analyses controlled for baseline patient age and gender, baseline PHQ-9 scores and comorbid behavioral health conditions, MHIP eligibility categories, and clinic fixed effects (to control for between-clinic differences in quality of care). To control for possible clinic learning over time, we also included the number of months the clinic had been participating in MHIP at the time the index patient was enrolled in MHIP and its quadratic and cubic terms.
RESULTS
Table 2 presents descriptive statistics of baseline patient characteristics for the entire sample (n = 1806) and by whether a patient had at least 1 month of exposure to VBP within 24 weeks of their first contact with MHIP care managers. Patients with no exposure to VBP were more likely to be enrollees in the Disability Lifeline program than patients with at least 1 month of exposure (96.2% vs 79.7%). Partly because of this difference, patients with no exposure were more likely to be aged between 40 and 59 years and more likely to have a PHQ-9 score of 20 or higher (indicating severe depression symptoms) compared with patients with at least 1 month of exposure. Prevalence of comorbid behavioral conditions was comparable between the 2 cohorts, except that rates of anxiety and bipolar disorders were slightly lower among the no-exposure group than the exposed group.
For the fidelity outcomes, results of multivariate analyses are presented in the form of predicted probabilities with and without exposure to VBP and the marginal effect of VBP (Table 3). (The mixed-effects linear probability models and their logistic counterparts generated very similar results; results reported hereafter and in Table 3 are based on the linear probability models.) Based on analyses conducted with the entire sample, the effect of VBP on the probability of at least 1 follow-up contact, psychiatric consultation, and PHQ-9 assessment in a month was an increase of 0.05 (95% confidence interval [CI], 0.00-0.10; P <.05), 0.04 (95% CI, 0.00-0.07; P <.05), and 0.07 (95% CI, 0.02-0.11; P <.05), respectively. These increases amounted to about 9%, 30%, and 15%, respectively, of the predicted probability of each fidelity outcome had there been no exposure to VBP. Analysis restricted to Disability Lifeline enrollees (85% of the unrestricted sample) generated very similar results, with slightly smaller marginal effects of VBP for follow-up contacts and PHQ-9 assessments, but a slightly greater marginal effect for psychiatric consultation (Table 3).
Sensitivity analysis of the fidelity outcomes by controlling for patient fixed effects produced similar or slightly greater point estimates of the marginal effects of VBP (eAppendix Table [eAppendices available at ajmc.com]). However, fixed-effects analysis, restricted to patients who received care both before and after VBP, had a much-reduced sample size (359 patients compared with 1806 in the unrestricted fixed-effects analysis). Marginal effects of VBP were not statistically significant for follow-up contacts or psychiatric consultation, but remained strong for PHQ assessments (marginal effect of VBP: 0.09; 95% CI, 0.02-0.16).
Our analysis indicated that both the size of the MHIP caseload at a clinic and the level of fidelity prior to VBP modified the effect of the VBP. As shown in eAppendix Figure A, for follow-up contacts and PHQ assessments, the marginal effect of VBP increased with the number of patients treated at the clinic prior to VBP. For follow-up contacts, the marginal effect of VBP did not achieve statistical significance until the number of patients treated at clinic in 2008 was 100 or more (top 25% of clinics); and for PHQ assessments, not until at least 140 (top 10% of clinics). Caseload did not seem to modify the VBP effect for psychiatric consultation. On the other hand, for each fidelity measure, the marginal effect of VBP decreased with the level of fidelity at the clinic prior to the start of VBP (eAppendix Figure B). For example, the effect of VBP on follow-up contacts was significantly greater than 0 only among clinics whose first-month follow-up contacts in 2008 averaged below 0.8 (accounting for 75% of all clinics). Sensitivity analysis defining modifiers at the implementation site level produced largely consistent findings. One exception was that, for psychiatric consultation, the VBP effect seemed to decrease with the size of MHIP caseload at the site level, whereas there was no clear modifying effect of caseload defined at the clinic level.
Consistent with results for the fidelity outcomes, exposure to VBP was associated with an adjusted hazard ratio (HR) of 1.45 (95% CI, 1.04-2.03) for achieving clinically significant improvement in depression, indicating that exposure to VBP was associated with a shorter time to improvement. This result held when we restricted the sample to Disability Lifeline patients (adjusted HR, 1.47; 95% CI, 1.03-2.12).
With data from a statewide implementation of the CCM in community health clinics, we found that a VBP program embedded in community-based implementation improved fidelity to several key process-of-care elements of the evidence-based model, both those directly incentivized and those not explicitly incentivized by the VBP. Consistent with our hypotheses, we also found stronger responses to VBP among provider organizations that cared for a larger number of patients and among organizations with a lower level of initial fidelity. Finally, we found that VBP was associated with better patient outcomes, as indicated by a shorter time to clinically significant improvement in depressive symptoms.
Our VBP effect findings contrast with the limited evidence supporting the effectiveness of existing VBP programs.15,25,35-37 Several reasons may underlie the differences. First, the MHIP paired financial incentives with chronic care quality improvement and capacity-building efforts. An expert team at the University of Washington provided training to care managers from all participating community health clinics and made archived training materials available online, and consulting psychiatrists were arranged in a contractual relationship to work with all participating clinics. Implementation of a clinical tracking system—a crucial tool to enable population health management and case tracking—was a precondition for receiving funding for MHIP and was achieved at all clinics. Meanwhile, existing VBP contracts typically provide no support system for quality improvement. Second, the MHIP VBP targeted several key elements of a single evidence-based care model, focusing improvement efforts and sending strong signals and clear directions to provider organizations on what to improve. Existing VBP programs typically contain a large number of quality targets that may not be clinically meaningful, dissipating incentives and failing to engage clinicians.38
Monthly PHQ assessments—the measure not explicitly incentivized under MHIP VBP—improved by 15% in response to VBP. Because PHQ assessment was conducted at follow-up contacts with the care manager, incentivizing systematic follow-up may have had the “spillover effects” of incentivizing these assessments. A direct implication is that, for fidelity/process-of-care measures that complement one another, designers of VBP programs may consider recognizing some, but not all of them, as VBP targets. Keeping the target set parsimonious (and thus not diluting incentives) may not need to come at the price of forsaking important quality goals.
Clinics with a larger patient caseload responded more strongly to VBP on 2 of the 3 fidelity measures considered, suggesting that smaller clinics may perceive insufficient incentives because of the limited scope of their VBP payment25 and/or may lack the resources to make systematic changes to care in response to VBP. To ensure that provider organizations of all sizes (and their patients) benefit from VBP, implementation initiatives may consider pooling resources, for example, by establishing learning collaboratives and providing coaching and consultation to organizations in need. Consistent with our hypothesis and findings of previous studies,25,28-30 lower baseline fidelity was associated with greater improvement in fidelity in response to VBP, thus reducing the variation in fidelity/quality among provider organizations implementing the evidence-based model. Although this is a desirable outcome, it also reveals that, with a single performance target for all providers, high performers may not be adequately motivated to improve further even though there is still room for improvement. One option would be to have 2 sets of thresholds to be applied to provider organizations with different initial levels of performance.39 This option, however, adds to the complexity of the VBP and may be perceived as unfair or unacceptable, especially by high-performing provider organizations.
We assessed the effects of VBP in a natural experiment, not a randomized trial. Although we controlled for important differences in patient baseline characteristics, differences among clinics that do not change over time (with clinic fixed effects), and proxies for provider learning over time, these controls were not perfect. However, a series of sensitivity analyses demonstrated the robustness of our findings. The fidelity and patient outcomes we examined were subject to data availability and usability; we were not able to examine antidepressant management as an important fidelity outcome at this point, but intend to do so in the future when data become available. We used data from the implementation of a specific care management model in a single state, which potentially limited the generalizability of our findings to other evidence-based care approaches or other geographic areas. This limitation, however, is mitigated by the fact that the CCM is highly consistent with the Chronic Care Model40,41 and that the MHIP involved a large and diverse set of provider organizations.
Our study provided strong evidence that a VBP component that adopts best practices of VBP design and is embedded in an implementation initiative is effective in improving fidelity to key elements of the evidence-based model, both those directly incentivized and those not directly incentivized by the VBP, and, in turn, in improving patient outcomes.