Relationships Between Provider-Led Health Plans and Quality, Utilization, and Satisfaction

Natasha Parekh, MD, MS; Inmaculada Hernandez, PharmD, PhD; Thomas R. Radomski, MD, MS; and William H. Shrank, MD, MSHS

As healthcare providers accept increasing financial risk in alternative payment models, more provider organizations are expected to operate their own health insurance plans, known as provider-led health plans (PLHPs). Over the past 2 decades, PLHPs have become increasingly popular in the United States, with more than 100 plans covering more than 15 million individuals.1,2 Often referred to as vertical integration, the integration of healthcare providers and payers offers potential advantages to the patient, provider, and system. By inherently aligning payer–provider incentives and managing healthcare across a continuum of services, PLHPs may be particularly advantageous in population health management and, therefore, may have superior outcomes with lower premiums compared with non-PLHPs.1-8

Evidence on the impact of PLHPs on outcomes remains limited and inconsistent.1-14 For instance, critics of PLHPs argue that they are not consistently associated with higher-quality healthcare and can lead to increased costs due to greater market power and administrative costs.10-14 Furthermore, it remains unknown how PLHP characteristics, including size, region, and nonprofit status, may affect outcomes. For example, nonprofit plans may perform better than for-profit plans,15 and larger plans may perform better than smaller plans through increased experience.

The objectives of this study were therefore to (1) determine the association between PLHP status and healthcare quality, utilization, and patient satisfaction and (2) determine whether these associations differed by plan size, nonprofit status, and region.


We conducted an observational study of Medicare Advantage (MA) contracts using December 2016 MA enrollment data from CMS. We focused on MA due to its large population and available outcome data that allow for standardized comparisons.16 We identified all MA contracts offered in 2016 with more than 20,000 enrollees to increase generalizability. For each contract, we obtained information on 3 quality outcomes, 4 utilization outcomes, and 1 patient satisfaction outcome. The quality outcomes were the 2017 MA Star Rating System (5-star maximum); 2016 Healthcare Effectiveness Data and Information Set (HEDIS) effectiveness aggregate score, defined as the average of 55 HEDIS Effectiveness of Care measures (100% maximum); and 2016 HEDIS access aggregate score, defined as the average of 2 HEDIS Access of Care measures (100% maximum). The utilization outcomes were 2016 HEDIS measures and included procedure rates, defined as average procedure rates per 1000 members for 13 selected procedures; discharge rates, defined as risk-adjusted discharges per 1000 members; inpatient days, defined as inpatient days per 1000 member-months; and risk-adjusted readmission probability. The patient satisfaction outcome was the 2016-2017 National Committee for Quality Assurance consumer satisfaction ratings, which are based on 2016-2017 Consumer Assessment of Healthcare Providers and Systems (CAHPS) surveys (5-point maximum) (see eAppendix Table 1 for outcome details [eAppendix available at]).
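As a minimal illustration of how an aggregate score of this kind can be formed, the sketch below averages a contract's reported measure rates. The function name, the sample rates, and the choice to skip unreported measures are assumptions made for illustration; the handling of missing measures in the study itself is described in the eAppendix and may differ.

```python
from statistics import mean
from typing import Optional, Sequence


def aggregate_score(measure_rates: Sequence[Optional[float]]) -> Optional[float]:
    """Average the reported measure rates (in percent) for one contract.

    Assumption: unreported measures (None) are skipped rather than imputed.
    Returns None when the contract reported no measures at all.
    """
    reported = [rate for rate in measure_rates if rate is not None]
    if not reported:
        return None
    return mean(reported)


# Hypothetical contract reporting two access measures, in percent.
access_score = aggregate_score([92.0, 88.0])
print(access_score)
```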

We categorized each MA contract as belonging to a PLHP or non-PLHP based on a publicly available list from the Robert Wood Johnson Foundation,10 which we enhanced to include additional PLHPs based on lists from McKinsey and Avalere (eAppendix Table 2).1,2,10,17 When there was uncertainty about PLHP status, we conducted an internet search to verify. We obtained region and profit status from the December 2016 MA enrollment list and patient risk from 2015 CMS plan payment data.18,19

To compare how outcomes differed between PLHP and non-PLHP contracts, we constructed multivariable linear regression models using generalized estimating equations with exchangeable correlation matrices to account for correlation between contracts within health plans. For example, Aetna’s health plan offered 25 MA contracts in our data set. The model controlled for accessible covariates identified as meaningful from existing literature,1,6,15 including MA region, contract profit status, average MA patient risk score, and the following covariates, all of which were derived from Area Health Resources Files20 and weighted for county contract enrollment: percent urban residence, percent black/African American, mean per capita income, college education among population 25 years or older, percent poverty among population 65 years or older, population 65 years or older per 1000 population, hospital beds per 1000 population, and active physicians per 1000 population. Each contract was analytically weighted by enrollee number.
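The enrollment weighting of county-level covariates described above can be sketched as a weighted mean. This is an illustrative sketch only: the function name, the county labels, and the values are hypothetical, and skipping counties that lack a covariate value is an assumption rather than the study's documented rule.

```python
from typing import Mapping


def enrollment_weighted_mean(
    county_values: Mapping[str, float],
    county_enrollment: Mapping[str, int],
) -> float:
    """Weight each county's covariate value by the contract's enrollment there.

    Assumption: counties without a covariate value are left out of both the
    numerator and the denominator.
    """
    weighted_sum = 0.0
    total_enrollees = 0
    for county, enrollees in county_enrollment.items():
        if county not in county_values:
            continue
        weighted_sum += county_values[county] * enrollees
        total_enrollees += enrollees
    if total_enrollees == 0:
        raise ValueError("no enrollees in counties with a covariate value")
    return weighted_sum / total_enrollees


# Hypothetical contract: percent urban residence in two counties.
percent_urban = {"county_a": 80.0, "county_b": 40.0}
enrollment = {"county_a": 3000, "county_b": 1000}
print(enrollment_weighted_mean(percent_urban, enrollment))
```

A contract with most of its enrollees in the more urban county is pulled toward that county's value, which is the intent of weighting by county contract enrollment.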

We conducted subgroup analyses to evaluate how the association between PLHP contracts and outcomes differed by PLHP size, profit status (for-profit vs nonprofit), and MA region. To assess outcome differences by size, we compared outcomes of the 6 PLHPs with at least 100,000 enrollees (Kaiser Permanente, UPMC, Healthfirst, Spectrum, Innovacare, and Tufts) with those of the remaining PLHPs. To assess PLHP effects stratified by region, we mapped our model results for each MA region, differentiating areas where PLHPs performed significantly better than non-PLHPs, worse than non-PLHPs, or where there was no difference. Subgroup analyses were based on the multivariable model above, except for regional analyses. Regional analyses only adjusted for profit status and patient risk score because the inclusion of additional covariates prevented the model from producing estimates for many regions. Finally, to explore whether our findings were driven by Kaiser Permanente, a notably high-quality plan, we ran our base-case models after excluding Kaiser Permanente contracts.
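The three-way regional classification (better, worse, or no difference) can be sketched as a rule on the confidence interval of the PLHP coefficient. The sketch assumes that a difference counts as significant when the 95% CI excludes zero and that lower values are the favorable direction for utilization outcomes; both are assumptions made for illustration, and the function name and sample intervals are hypothetical.

```python
def classify_difference(ci_low: float, ci_high: float, higher_is_better: bool) -> str:
    """Classify a PLHP vs non-PLHP coefficient from its confidence interval.

    Assumption: the difference is significant only when the interval
    excludes zero. `higher_is_better` is True for quality and satisfaction
    outcomes and False for utilization outcomes such as procedure rates.
    """
    if ci_low > 0:
        plhp_higher = True
    elif ci_high < 0:
        plhp_higher = False
    else:
        return "no difference"
    return "better" if plhp_higher == higher_is_better else "worse"


# Hypothetical regional estimates.
print(classify_difference(0.10, 0.50, higher_is_better=True))    # quality outcome
print(classify_difference(-0.60, -0.05, higher_is_better=False)) # utilization outcome
print(classify_difference(-0.20, 0.30, higher_is_better=True))   # interval spans zero
```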

Analyses were performed using Stata 14 (StataCorp LP; College Station, Texas) and SAS 9.4 (SAS Institute Inc; Cary, North Carolina). Further information on data sources, variable definitions, and missing data is available in eAppendix Table 1 and eAppendix Table 3.

Our study population included 64 contracts offered by 31 PLHPs (representing 3,197,284 enrollees) and 311 contracts offered by 55 non-PLHPs (representing 13,881,210 enrollees) (Table 1). Unadjusted mean star ratings, effectiveness, access, and patient satisfaction were higher among PLHPs compared with non-PLHPs, whereas procedure rates, inpatient discharges, and inpatient days were lower.

In adjusted models, PLHPs were associated with higher star ratings (β = 0.41; 95% CI, 0.15-0.67), effectiveness (β = 3.11; 95% CI, 1.43-4.80), and patient satisfaction (β = 0.57; 95% CI, 0.30-0.84) compared with non-PLHPs. Procedure rates were lower for PLHPs than non-PLHPs (β = –0.47; 95% CI, –0.79 to –0.16). There were no significant differences in access, inpatient discharges, inpatient days, or readmission probability (Table 1).
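For readers who want to relate a coefficient to its interval, the sketch below backs out an approximate standard error, z statistic, and two-sided P value from a reported β and 95% CI. It assumes the intervals are symmetric, normal-based (Wald) intervals, which is an assumption and not something the text states; the function name is hypothetical.

```python
from statistics import NormalDist


def wald_summary(beta: float, ci_low: float, ci_high: float, level: float = 0.95):
    """Approximate SE, z statistic, and two-sided P value from a CI.

    Assumption: the interval is beta +/- z_crit * SE with a normal
    reference distribution.
    """
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(0.5 + level / 2)
    se = (ci_high - ci_low) / (2 * z_crit)
    z = beta / se
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    return se, z, p_value


# Star ratings coefficient reported above: beta = 0.41; 95% CI, 0.15-0.67.
se, z, p_value = wald_summary(0.41, 0.15, 0.67)
print(f"SE = {se:.3f}, z = {z:.2f}, P = {p_value:.4f}")
```

Under these assumptions, an interval that excludes zero corresponds to a P value below .05, which matches how the significant associations are reported.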

Table 2 illustrates the results from subgroup analyses, which demonstrated that larger PLHPs had significantly higher star ratings (β = 0.57; 95% CI, 0.28-0.86), effectiveness (β = 4.63; 95% CI, 2.45-6.80), and patient satisfaction (β = 0.78; 95% CI, 0.54-1.02) compared with smaller PLHPs. After excluding Kaiser Permanente, PLHPs still had significantly higher star ratings (β = 0.22; 95% CI, 0.01-0.43), effectiveness (β = 1.22; 95% CI, 0.47-1.97), and patient satisfaction (β = 0.43; 95% CI, 0.12-0.73). Compared with nonprofit PLHPs, for-profit PLHPs had significantly lower star ratings (β = –1.07; 95% CI, –1.62 to –0.52), effectiveness (β = –5.59; 95% CI, –8.22 to –2.96), access (β = –7.66; 95% CI, –12.27 to –3.06), and patient satisfaction (β = –1.38; 95% CI, –2.70 to –0.06).

The eAppendix Figure reflects the significance and direction of differences between PLHPs and non-PLHPs for each outcome by MA region. PLHPs performed significantly better than non-PLHPs in the following number of regions and outcomes, respectively: 6 of 16 regions for effectiveness, 5 of 16 for procedure frequency, 4 of 16 for inpatient days, 3 of 16 for star ratings and inpatient discharges, 3 of 12 for patient satisfaction, and 2 of 16 for access and readmission probability. PLHPs performed significantly worse than non-PLHPs in the following number of regions and outcomes, respectively: 2 of 16 for access, inpatient days, and readmission probability; 1 of 16 for selected procedure frequency; 1 of 12 for patient satisfaction; and 0 of 16 for star ratings, effectiveness, and inpatient discharges. Although regions in which PLHPs performed better than non-PLHPs generally varied by outcome, PLHPs performed consistently better in most outcomes in regions that included Texas, Illinois, and Wisconsin (eAppendix Table 4).


The results of our analyses show that MA contracts offered by PLHPs are associated with greater quality and patient satisfaction and decreased procedures. We further found that the effects of PLHPs vary by size, nonprofit status, and region, with larger and nonprofit PLHPs performing better than their smaller and for-profit counterparts, respectively.

Our results on healthcare quality and patient satisfaction are consistent with findings by Johnson et al and Lyon et al, who found that PLHP MA contracts were associated with superior performance in quality and satisfaction measures.1,6 Prior literature on the impact of vertical integration on utilization has shown mixed results, with some studies’ results suggesting decreased utilization3,5,7,8 and others’ suggesting increased utilization.10,12,13 Although few studies have investigated access, Lyon et al demonstrated decreased provider access among PLHPs.6 Comparison with prior studies is nevertheless challenging, as some failed to differentiate between vertical and other forms of integration (eg, horizontal integration through provider consolidation), did not distinguish between separate outcome domains, and lacked standardized, timely outcomes. Our results are an important and distinctive contribution to existing literature because we evaluated the association between PLHPs and separate but standardized outcomes that reflect quality, access, utilization, and patient satisfaction.

PLHPs’ higher quality and satisfaction performance could be due to multiple factors. PLHPs may leverage the strengths and resources of insurers and providers to achieve common goals of delivering high-quality patient care. Potential resources include enhanced coordination between insurers and providers, use of unified electronic health records, integration of initiatives focused on high-value care, and streamlined interactions between patients and providers. Importantly, the populations served by PLHPs were lower risk than those served by non-PLHPs and showed nonsignificant trends toward being wealthier, being more educated, and including fewer minorities. Although we adjusted for these differences in multivariable models, it is possible that the superior quality outcomes we observed among PLHPs reflect residual demographic differences. Nevertheless, we observed no differences in access, inpatient days, discharges, or readmissions, thus identifying a need for PLHPs to streamline and optimize utilization.

We uniquely assessed how the effect of PLHPs differs with plan characteristics, identifying important differences in outcomes based on size, nonprofit status, and region. This suggests that not all PLHPs are alike, with the heterogeneity potentially being caused by multiple factors. First, plans could have differences in organizational commitment to their populations. For example, commitment could be stronger in regions that emphasize population health and among larger plans that bear increased risk. Second, the complex relationship between plan enrollment and quality is important to consider: On one hand, larger plans seem more established as PLHPs and, therefore, may have more experience with the model, resulting in higher quality and efficiency; on the other hand, large plan size could result from the plan itself being high quality, as higher-quality plans tend to have higher enrollment.18 Third, although we adjusted for several demographic-based covariates, it is possible that the effect of PLHPs varies by their populations’ needs. For example, populations in Texas, Wisconsin, and Illinois may benefit more from vertical integration due to varying clinical needs and demographics. Fourth, our findings that nonprofit PLHPs performed better than for-profit PLHPs are consistent with previous literature15 and suggest that population health approaches differ by profit status, potentially due to differences in underlying incentives. Fifth, there is heterogeneity in the extent of integration employed by PLHPs. For example, some PLHPs restrict providers and enrollees to their own PLHP systems, whereas in others, providers and enrollees are permitted to see non-PLHP patients and be seen by non-PLHP providers, respectively. As a result, PLHPs with less mutual exclusivity in respective provider and payer markets may have less alignment of payer–provider incentives compared with more restrictive plans. Finally, PLHPs may take advantage of varying aspects of integration in their initiatives.2 For instance, some PLHPs might have a more unified electronic health record than others, whereas other PLHPs might have more collaborative payer–provider care management programs.

Although prior studies identified characteristics associated with health plan and accountable care organization success,15,21,22 future research could investigate factors and approaches associated with successful PLHPs.


First, because this is a cross-sectional study, we can make no inferences regarding causality. It is possible that high-performing providers practice in PLHPs or that health-seeking patients enroll in such plans. Second, some plans did not have data available for certain outcomes. However, the likelihood of missing data did not significantly vary between PLHPs and non-PLHPs and is therefore unlikely to bias results (eAppendix Table 3). Third, although we adjusted for differences between PLHPs and non-PLHPs, our findings may still be subject to residual confounding due to unobserved effects. Fourth, we did not adjust for market competition, which can impact pricing, networks, and population health approaches. Finally, our outcomes are not independent because star ratings include components of HEDIS and CAHPS in their calculations. Nevertheless, assessing PLHP effects on these outcomes is valuable because star ratings represent important composite measures, whereas HEDIS and CAHPS scores inform which domains may drive star ratings differences.


As alternative payment models grow, momentum is building to integrate provision and payment of care through PLHPs. Our analysis of 2016 MA plans demonstrates the potential of such organizations to deliver high-quality care, although opportunities remain in optimizing utilization.