Tracking Spending Among Commercially Insured Beneficiaries Using a Distributed Data Model

August 21, 2014

The authors demonstrate the utility of distributed data models for reporting of local trends and variation in utilization, pricing, and spending for commercially insured beneficiaries.


To explore the feasibility of using a distributed data model for ongoing reporting of local healthcare spending, specifically to investigate the contribution of utilization and pricing to geographic variation and trends in reimbursements for commercially insured beneficiaries younger than 65 years.

Study Design

Retrospective descriptive analysis.


Commercial claims were obtained for beneficiaries in 5 states for the years 2008 to 2010 using a distributed data model. Claims were aggregated to the hospital service area (HSA) level and healthcare utilization was quantified using a novel, National Quality Forum-endorsed measure that is independent of price and allows for the calculation of resource use across all services in standardized units. We examined trends in utilization, prices, and reimbursements over time. To examine geographic variation, we mapped resource use by HSA in the 3 states from which we had data from multiple insurers. We calculated the correlation between commercial and Medicare reimbursements and utilization. Medicare claims were obtained from the Dartmouth Atlas.


We found that much of the recent growth in reimbursements for the commercially insured from 2008 to 2010 was due to increases in prices, particularly for outpatient services. As in the Medicare population, resource use by this population varied by HSA. While overall resource use patterns in the commercially insured did not mirror those among Medicare beneficiaries, we observed a strong correlation in inpatient hospital use.


This research demonstrates the feasibility and value of public reporting of standardized area-level utilization and price data using a distributed data model to understand variation and trends in reimbursements.

Am J Manag Care. 2014;20(8):650-657

This study explored the use of a distributed data model and a novel method for calculating utilization to track local trends in spending for commercially insured beneficiaries.

  • Recent growth in spending for commercially insured beneficiaries was due principally to increases in prices, rather than increases in utilization.
  • Commercial utilization and spending varied across local areas and were not highly correlated with Medicare utilization and spending.
  • A distributed model may allow for nationwide reporting of spending and utilization to track the local effects of healthcare reform efforts.

Timely, local data are important to policy-makers, providers, patients, payers, and employers working to slow the growth of healthcare spending, which is a major focus of federal, state, and local healthcare reform initiatives. Community-based multistakeholder coalitions have formed across the country in an effort to influence their local healthcare markets and reduce costs. More than 40 percent of people in the United States live in a community with a multistakeholder coalition aimed at improving health and healthcare, including collaboratives focused on improving the exchange of health information, accelerating engagement by key local opinion leaders and stakeholders, or promoting quality improvement.1 All of these entities, however, lack the local data needed to determine if their efforts are making a difference.

The factors contributing to rising healthcare spending differ across communities and depend on local context; understanding the drivers of local spending growth is complicated by the variety of inputs. Provider culture and supply, various market segments (outpatient, inpatient, long-term care), payer mix, regulation, and the competitiveness of hospital and physician markets all affect pricing, utilization, and ultimately, the total cost of care.

Research has shown that the relative contribution of these factors varies across markets and that drivers of commercial spending are not necessarily the same as drivers of Medicare spending. Chernew et al found that commercial spending was not correlated with Medicare spending across hospital referral regions.2 Examining the commercial and Medicare populations for El Paso and McAllen, Texas, Franzini et al found that per capita Medicare spending was 86% greater in McAllen than in El Paso, but commercial spending was 7% lower.3 A recent Institute of Medicine report found that regional variation in spending in the commercial insurance market is due in large part to differences in price markups by providers. Differences in utilization, however, still explained over 30% of the regional variation in spending.4

These diverse results indicate that rising healthcare spending has multiple causes and highlight the need for more comprehensive and detailed analysis of commercial data. The 2013 report issued by the Institute of Medicine, Variation in Health Care Spending: Target Decision Making, Not Geography, was unprecedented in that it combined both Medicare and commercial data from a variety of proprietary sources for analysis. The report underscored the importance of available data on both commercial and Medicare utilization and prices at a local level. These data could be useful to engage local stakeholders (such as employers), identify areas of potential waste, point regulators to localities where monopolistic pricing may be occurring, and evaluate the impact of local and national reform initiatives.4

There is substantial uncertainty about the best approach to make local data publicly available. Feasible options might include a single federal database,5 combining state all-payer data (currently available in Colorado, Kansas, Maine, Maryland, Massachusetts, Minnesota, New Hampshire, Tennessee, Utah, and Vermont),6 using private data aggregators, or relying on a distributed data model.7 The Health Care Cost Institute has combined data from 4 large commercial insurers and has published broad reports on trends in employer-sponsored insurance, but has not produced any data at a local level.8,9 There are some community-level efforts under way to track healthcare spending and utilization locally, most notably in California, but they are relatively rare.10 In spite of the Institute of Medicine’s call for making more and better data available on both Medicare and commercial populations, it is not clear how this might be done.

In this paper we explore the feasibility and value of 2 potential approaches for making commercial spending and utilization data available at a local level. First, we work with an all-payer data set for New Hampshire, Maine, and Vermont, aggregated by Onpoint Health Data, a private, nonprofit organization. Second, we use a distributed data model to aggregate data from a single payer in 2 states. Our methods include a standardized approach to measuring utilization, and our partners submitted utilization and spending reports that were stripped of protected health information. These reports allowed us to aggregate data at a hospital service area (HSA) level and adjust for demographic characteristics. The findings demonstrate the feasibility of each of these approaches and the resulting data highlight the potential utility of tracking local healthcare spending.


To test the distributed data model, we used data from 3 distinct sources. We obtained data on utilization and reimbursements for commercial beneficiaries in Maine, New Hampshire, and Vermont from an all-payer data set (2008-2010), along with data on beneficiaries of Blue Cross Blue Shield of Michigan (2009-2010) and Texas (2008-2010) plans. The latter 2 states were chosen because of existing research relationships with relevant data providers. Each of the participating analytic teams applied standard software to measure utilization and submitted reports detailing summary information for beneficiaries in each age and gender category for each HSA. HSAs were defined in previous work by identifying the ZIP codes where the highest proportion of Medicare beneficiaries received their care from a single hospital.11 Data were then aggregated by the study authors and compared with corresponding HSA data from the Medicare program.

Study Populations

Each participating organization applied a standardized approach for defining the commercial population to be included in the analysis. Beneficiaries were required to be enrolled for a minimum of 9 months during the year, unless the member was born during the calendar year or was older than 65 years by the end of the year.

Data Management

Participating health plans and data providers used software developed by HealthPartners, a nonprofit healthcare organization and health plan, to measure a utilization amount associated with each insurance claim. The HealthPartners algorithm quantifies utilization using Total Care Relative Resource Values (TCRRVs), which are the basis for the National Quality Forum (NQF)-endorsed Total Resource Use measure.12 TCRRVs, which are expressed in dollars, measure the utilization and intensity of the services delivered to manage a patient’s healthcare needs. We chose to use the HealthPartners measure because it is independent of price and allows for the weighting and calculation of resource use across all medical services in standardized units. We did not want to obscure the effect of regional price differences on overall reimbursement amounts. In addition, we chose to use the measure because it makes resource use amounts equivalent for services offered across multiple settings.

TCRRVs are unique in that they are relative within and across components of care (inpatient, outpatient, professional, and pharmacy), which allows for the isolation of resource use not only by component, but also on a total, per capita basis. The TCRRVs are made relative within the components of care using the CMS weighting system, including Medicare Severity Diagnosis Related Groups for inpatient care, Ambulatory Payment Classifications for outpatient care, and Relative Value Units for professional office care, while the TCRRVs for pharmaceutical expenditures are made relative by using the median average wholesale price per day for each National Drug Code. TCRRVs are calculated by multiplying the CMS weight by national average paid amount and the relevant service count. While the Health Care Cost Institute also uses CMS weights to measure the intensity of care, its methodology does not involve reporting (1) a resource use amount based on both utilization and intensity, or (2) a measure of total resource use across service lines.
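The multiplication described above can be illustrated with a minimal Python sketch. It is not the HealthPartners software: the class name, field names, and the example weights and dollar amounts are hypothetical, and the only rule taken from the text is that resource use equals the CMS weight times the national average paid amount times the service count.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ClaimLine:
    """One service on a claim (all field names are hypothetical)."""
    cms_weight: float         # relative weight, eg, from MS-DRG, APC, or RVU files
    national_avg_paid: float  # national average paid amount per unit of weight, in dollars
    service_count: int        # number of times the service was delivered


def tcrrv(line: ClaimLine) -> float:
    """Resource use in standardized dollars: weight x national average paid x count."""
    return line.cms_weight * line.national_avg_paid * line.service_count


def total_resource_use(lines: Iterable[ClaimLine]) -> float:
    """Sum resource use across all services, regardless of component of care."""
    return sum(tcrrv(line) for line in lines)


# Illustrative values only; they are not taken from the study.
example = [ClaimLine(1.5, 6000.0, 1), ClaimLine(0.5, 40.0, 3)]
print(total_resource_use(example))
```

Because the national average paid amount is the same everywhere, two areas delivering the same services receive the same resource use total even if local payment rates differ.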

The TCRRV algorithm was applied at the claim level by each data contributor and aggregated at the HSA level for transmission to study authors. (TCRRV weights are updated annually to correspond to updates in the CMS weight files. We used the 2011 TCRRV national weights.) Resource use was capped at $100,000 for each beneficiary in a given year and all components of care were reduced proportionally for both reimbursements and resource use. Very large resource use amounts greatly skew the mean in populations where the majority of beneficiaries have very low utilization and often represent extreme, unavoidable events. Capping reduced average resource use in 2008-2010 by an average of 11% across all HSAs (9.3% and 13.6% for HSAs in the lowest and highest quintile of resource use, respectively).
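The capping step can be sketched as follows. This is an illustration under stated assumptions rather than the study code: it assumes that the scaling factor derived from capped resource use is also applied to reimbursements, and the function and component names are hypothetical.

```python
CAP = 100_000.0  # annual per-beneficiary cap on resource use, as stated in the text


def cap_components(resource_use, reimbursement, cap=CAP):
    """Scale every component of care by the same factor when total resource use exceeds the cap.

    resource_use and reimbursement are dicts keyed by component of care for one
    beneficiary-year. Applying the same factor to reimbursements is an assumption.
    """
    total = sum(resource_use.values())
    if total <= cap:
        # Under the cap: return copies unchanged.
        return dict(resource_use), dict(reimbursement)
    factor = cap / total
    capped_use = {k: v * factor for k, v in resource_use.items()}
    capped_paid = {k: v * factor for k, v in reimbursement.items()}
    return capped_use, capped_paid


# Illustrative beneficiary with $200,000 of resource use, so the factor is 0.5.
use, paid = cap_components(
    {"inpatient": 150_000.0, "outpatient": 50_000.0},
    {"inpatient": 90_000.0, "outpatient": 30_000.0},
)
print(use, paid)
```

Scaling all components by one factor preserves each component's share of the beneficiary's total, which keeps the component-level breakdown interpretable after capping.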

Both reimbursement and resource use (ie, utilization) amounts for each component of care (inpatient, outpatient, professional, and pharmacy) were transmitted to the study authors, as were denominator data (age in groups, gender, HSA of residence, number of people, and number of person-years). The HealthPartners software performs quality control checks at the claim level by comparing calculated resource use with the reimbursement amount. If the calculated resource use for a particular claim was outside of normal limits, we imputed resource use using the reimbursement amount multiplied by the ratio between resource use and reimbursement amount for all normal claims from that state, year, and component of care. The combination of numerator and denominator data allowed for adjustment and rate calculations. The transmitted data also included the prevalence of prescription drug and mental health carve-outs for each HSA. The Blue Cross Blue Shield of Michigan data did not include HSA-specific information on carve-out levels, but did include the overall state values by year. We used this information and exploratory models from the other states to impute prescription drug reimbursement and utilization values for Michigan HSAs. The imputation assumed a constant carve-out rate across the state each year.
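The claim-level imputation rule can be sketched in Python. The text does not state the "normal limits," so the `low` and `high` thresholds below are hypothetical placeholders, as are the function and field names. The sketch also assumes a pooled ratio (total resource use over total reimbursement among normal claims) and expects the input to hold claims from a single state, year, and component of care.

```python
def impute_out_of_range(claims, low=0.2, high=5.0):
    """Replace out-of-range resource use with reimbursement x pooled ratio.

    claims: list of dicts with 'resource_use' and 'reimbursement', all from one
    state, year, and component of care. low/high are hypothetical limits on the
    resource-use-to-reimbursement ratio; the text does not give the real ones.
    """
    def is_normal(claim):
        paid = claim["reimbursement"]
        return paid > 0 and low <= claim["resource_use"] / paid <= high

    normal = [c for c in claims if is_normal(c)]
    paid_normal = sum(c["reimbursement"] for c in normal)
    if paid_normal == 0:
        # No normal claims to learn a ratio from: leave the data untouched.
        return [dict(c) for c in claims]

    pooled_ratio = sum(c["resource_use"] for c in normal) / paid_normal
    result = []
    for claim in claims:
        fixed = dict(claim)
        if not is_normal(claim):
            fixed["resource_use"] = claim["reimbursement"] * pooled_ratio
        result.append(fixed)
    return result


# Illustrative claims: the third has an implausible ratio of 50 and is imputed.
claims = [
    {"resource_use": 100.0, "reimbursement": 50.0},
    {"resource_use": 300.0, "reimbursement": 150.0},
    {"resource_use": 5000.0, "reimbursement": 100.0},
]
print(impute_out_of_range(claims))
```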

Statistical Methods

Age, gender, and carve-out adjustment was performed using linear regression across the 5 states. Weighted averages for each HSA were calculated using the adjusted values for each year and component of care. State averages were created by weighting by the number of beneficiaries in each HSA. HSA average relative prices were defined as the ratio of resource use to the reimbursement amount. We normalized relative prices to 1 across all 5 states. Only HSAs with a sum of more than 1000 commercial beneficiaries over the study period were included. Analysis was also limited to those HSAs that were either in 1 of the 5 target states or overlapped the border of 1 of the 5 target states. These 2 exclusion factors reduced the total number of people per calendar year in the analyses by less than 0.1%.
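A minimal sketch of the relative price calculation follows. It is an illustration under stated assumptions, not the authors' code. The index is computed here as reimbursement per unit of resource use, an orientation chosen so that values below 1 correspond to below-average prices, which is how relative prices are interpreted in the results. The normalization uses a beneficiary-weighted mean, and the field names are hypothetical. The regression-based age, gender, and carve-out adjustment is not reproduced; inputs are assumed to be already adjusted.

```python
MIN_BENEFICIARIES = 1000  # HSAs must have more than this many beneficiaries over the study period


def relative_prices(hsas):
    """Return a price index per HSA, normalized so the weighted mean equals 1.

    hsas: list of dicts with 'hsa', 'reimbursement' and 'resource_use'
    (adjusted per-beneficiary averages), and 'beneficiaries'.
    """
    kept = [h for h in hsas if h["beneficiaries"] > MIN_BENEFICIARIES]
    # Assumed orientation: dollars reimbursed per standardized dollar of resource use.
    raw = {h["hsa"]: h["reimbursement"] / h["resource_use"] for h in kept}
    total_n = sum(h["beneficiaries"] for h in kept)
    weighted_mean = sum(raw[h["hsa"]] * h["beneficiaries"] for h in kept) / total_n
    return {name: value / weighted_mean for name, value in raw.items()}


# Illustrative HSAs; "C" is dropped because it has too few beneficiaries.
hsas = [
    {"hsa": "A", "reimbursement": 3000.0, "resource_use": 4000.0, "beneficiaries": 2000},
    {"hsa": "B", "reimbursement": 5000.0, "resource_use": 4000.0, "beneficiaries": 2000},
    {"hsa": "C", "reimbursement": 9000.0, "resource_use": 4000.0, "beneficiaries": 500},
]
print(relative_prices(hsas))
```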

To examine geographic variation, we mapped resource use for each HSA in Maine, New Hampshire, and Vermont, reflecting procedures and services offered across all 4 components of care. We divided the HSAs into quintiles to display differences in resource use. Maps were not generated for Michigan or Texas, because data were from a single insurer in each state.
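The grouping used for the maps can be sketched as below. This is a simplified illustration with hypothetical names and data: each HSA's resource use is expressed relative to the beneficiary-weighted regional average, and quintiles are assigned by rank so that each group holds roughly the same number of HSAs. Whether the study's quintiles were rank-based or weighted is not stated, so the rank-based rule is an assumption.

```python
def quintiles_relative_to_region(hsas):
    """Return {hsa: {'ratio': ..., 'quintile': 1..5}} for a list of HSA records.

    hsas: list of dicts with 'hsa', 'resource_use' (per beneficiary), and 'beneficiaries'.
    """
    total_n = sum(h["beneficiaries"] for h in hsas)
    regional_avg = sum(h["resource_use"] * h["beneficiaries"] for h in hsas) / total_n

    # Rank HSAs from lowest to highest resource use, then split ranks into 5 groups.
    ranked = sorted(hsas, key=lambda h: h["resource_use"])
    result = {}
    for rank, h in enumerate(ranked):
        result[h["hsa"]] = {
            "ratio": h["resource_use"] / regional_avg,
            "quintile": rank * 5 // len(ranked) + 1,
        }
    return result


# Illustrative HSAs with hypothetical labels and values.
hsas = [
    {"hsa": name, "resource_use": use, "beneficiaries": 100}
    for name, use in [("A", 3000.0), ("B", 4000.0), ("C", 5000.0), ("D", 6000.0), ("E", 7000.0)]
]
for name, row in quintiles_relative_to_region(hsas).items():
    print(name, round(row["ratio"], 2), row["quintile"])
```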

Medicare Program Comparison

To further validate our methodology, we attempted to replicate previous findings by examining the association between Medicare and commercial reimbursements and utilization. Age-, sex-, and race-adjusted reimbursement and utilization data at the HSA level for the fee-for-service Medicare program were downloaded from the Dartmouth Atlas website.13 Utilization data were adjusted for prices across areas and settings.8 Using these data, we examined the association between calculated resource use and reimbursements for the commercially insured and Medicare (aged ≥65 years) populations by calculating a correlation coefficient and plotting the relationship, weighting the regression and the size of the data point by the sum of the number of commercial and Medicare beneficiaries in the HSA.
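One way to compute a correlation in which larger HSAs count for more is a weighted Pearson coefficient, sketched below. The text states only that the regression and the data points were weighted by the combined number of commercial and Medicare beneficiaries, so applying the same weights to the correlation coefficient is an assumption, and the function name and example data are hypothetical.

```python
import math


def weighted_correlation(x, y, weights):
    """Weighted Pearson correlation between x and y.

    x: commercial value per HSA; y: Medicare value per HSA;
    weights: commercial plus Medicare beneficiaries in each HSA.
    """
    total = sum(weights)
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / total
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / total
    cov = sum(w * (xi - mean_x) * (yi - mean_y) for w, xi, yi in zip(weights, x, y))
    var_x = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    var_y = sum(w * (yi - mean_y) ** 2 for w, yi in zip(weights, y))
    return cov / math.sqrt(var_x * var_y)


# Illustrative per-beneficiary amounts for 4 hypothetical HSAs.
commercial = [4500.0, 4800.0, 5200.0, 5000.0]
medicare = [9000.0, 8500.0, 9800.0, 9100.0]
beneficiaries = [1200, 5400, 20000, 800]
print(round(weighted_correlation(commercial, medicare, beneficiaries), 3))
```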


Characteristics of the Study Cohort

We identified 388 HSAs and a total of 16,819,237 beneficiary years for inclusion in our analysis (Table 1). The average number of commercial beneficiaries across 2008-2010 included in the data set for each HSA ranged from 337 to 296,555. The majority of the beneficiaries in each of the 5 states were female. The age distribution was similar across the states, with about half of the population under 40 years and half aged 40 to 65 years. The only exception was Texas, where the population was slightly younger (a greater proportion in the 18-to-39-year age group). The age distributions for our study cohort were similar to those reported by the United States Census Bureau for privately insured individuals in each state.14

Trends in Commercial Beneficiary Resource Use, Relative Prices, and Reimbursements

We examined trends in resource use and relative prices to investigate the extent to which changes in those 2 factors affected changes in reimbursements. The top portion of Figure 1 shows the year-to-year resource use trend per beneficiary by component of care and state. Professional services constituted the highest proportion (42.9%) of resource use on average across the 5 states, followed by outpatient services (21.8%), inpatient services (20.0%), and pharmacy (15.1%). Annual resource use ranged from $4479 per commercial beneficiary in Vermont in 2008 to $5324 per beneficiary in Texas in 2009. The bottom portion of the figure shows calculated changes in resource use, prices, and reimbursements from year to year by component of care. Overall increases in reimbursements were greater in 2008-2009 than 2009-2010. Reimbursements increased each year in each state, with the exception of Maine in 2010. Total resource use went down in each state between 2009 and 2010, mostly due to declines in inpatient (hospital) services. Increases in prices were larger than the decreases in resource use, however, so reimbursements still increased. The changes in reimbursements in each state can also be compared with changes in the consumer price index (CPI) overall and for medical care. The CPI decreased 0.4% between 2008 and 2009 and increased 1.6% between 2009 and 2010. The medical care CPI increased 3.2% between 2008 and 2009 and 3.4% between 2009 and 2010.15 Overall, growth in prices for commercial beneficiaries in our data set exceeded overall inflation in each state and year. In all but 2 cases, our measure of price increases also exceeded medical inflation.
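The relationship among the 3 trends can be illustrated with a short sketch. It assumes that reimbursement per beneficiary equals price times resource use, so that the implied price change is the change in reimbursement per unit of resource use. This is an assumed decomposition, not necessarily the calculation behind Figure 1, and the dollar amounts are hypothetical.

```python
def decompose_growth(reimb_prev, reimb_next, use_prev, use_next):
    """Return (resource use change, implied price change, reimbursement change) as fractions.

    Assumes reimbursement = price x resource use, so
    (1 + reimbursement change) = (1 + price change) x (1 + resource use change).
    """
    use_change = use_next / use_prev - 1.0
    reimb_change = reimb_next / reimb_prev - 1.0
    price_prev = reimb_prev / use_prev
    price_next = reimb_next / use_next
    price_change = price_next / price_prev - 1.0
    return use_change, price_change, reimb_change


# Hypothetical state: resource use falls 2% while reimbursements rise 2.5%,
# which implies that prices rose by more than 2.5%.
use_chg, price_chg, reimb_chg = decompose_growth(4000.0, 4100.0, 5000.0, 4900.0)
print(f"use {use_chg:+.1%}, price {price_chg:+.1%}, reimbursement {reimb_chg:+.1%}")
```

Under this identity, a year in which resource use declines but reimbursements still increase necessarily has a price increase larger than the reimbursement increase.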

Variation in Commercial Beneficiary Resource Use, Relative Prices, and Reimbursements

Resource use, relative prices, and reimbursement amounts varied among commercial beneficiaries across HSAs within our 5 states, and across states (Table 2). Resource use among commercial beneficiaries ranged from $3041 to $6280 across HSAs in the 5 states, which compares to a range of $6322 to $15,049 among Medicare beneficiaries. There was less variation in the commercial population (coefficient of variation = 0.10) than the Medicare population (coefficient of variation = 0.17). Within-state variation in resource use was most marked in Michigan. Relative prices ranged from 0.83 to 1.35 across HSAs in the 5 states, and reimbursements ranged from $1938 to $4626. Relative prices under 1 indicate that, for a given HSA, prices were below the average across all HSAs in the 5 states. The correlation between prices and resource use across HSAs was low (ρ = .03).
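For readers unfamiliar with the coefficient of variation, it is the standard deviation divided by the mean, which allows variation to be compared between populations with very different spending levels. The sketch below uses an unweighted population standard deviation; whether the study weighted HSAs by size is not stated, so that choice is an assumption, and the values are illustrative.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (unweighted; an assumption)."""
    return statistics.pstdev(values) / statistics.mean(values)


# Illustrative per-beneficiary resource use for a handful of hypothetical HSAs.
commercial = [4200.0, 4600.0, 5000.0, 5400.0, 5800.0]
print(round(coefficient_of_variation(commercial), 3))
```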

Variation in resource use is further illustrated in Figure 2, which maps the ratio of resource use in each New England HSA to the weighted average of resource use across the region. Those HSAs with higher resource use relative to the regional average are darker in color. The HSAs with the highest utilization in this area were Rochester, New Hampshire; Ellsworth, Maine; and Augusta, Maine. The HSAs with the lowest utilization in this area were Rumford, Maine; Colebrook, New Hampshire; and St. Johnsbury, Vermont.

Association of Commercial and Medicare Reimbursements and Utilization

Overall resource use in the commercial population was not correlated with resource use in the Medicare population at the HSA level across these 5 states (ρ = .10, Figure 3), but the correlation across HSAs for inpatient resource use was higher (ρ = .42). A negative correlation was observed for overall commercial and Medicare reimbursements (ρ = -.36) and a low correlation for inpatient reimbursements (ρ = .10).


This research combined data from 2 private payers and an all-payer data set to produce standardized estimates of utilization, pricing, and reimbursements at the local level across 5 states using a distributed data model. We were able to calculate HSA-level trends and variation for the commercially insured population aged <65 years in the select states using a standardized, NQF-endorsed measure of utilization. Our findings revealed marked variation in both relative prices and utilization, consistent with work by Dunn et al and Baker et al, and confirmed that commercial trends do not necessarily mirror trends in the Medicare population, as previously found by Chernew et al and the Institute of Medicine.2,4,16,17

The goal of this research is to help communities understand local healthcare spending patterns and to begin to develop tailored solutions to slow growth in healthcare spending. This line of research differs from previous attempts to examine healthcare costs within the commercially insured population in several significant ways. First, it uses a validated, publicly available measure of utilization to calculate and standardize the magnitude of healthcare resource use across various commercially insured populations and service lines (outpatient, inpatient, professional, and pharmacy), taking into account both the frequency and intensity of care. This measure is independent of price and thus can be used to compare healthcare utilization for different health plans in different parts of the country. Second, we conducted our analysis at the level of the HSA, a relatively small unit that describes the geographic area where patients served by local hospitals reside. Finally, by using a distributed data model, this research demonstrates the feasibility of an approach that could be used either by local multistakeholder initiatives or by a national effort to report on local utilization and spending.

The main limitation of this study stems from it being pilot research: the number of payers included in our analysis for each state varied significantly (61 in Maine, 37 in New Hampshire, 35 in Vermont, and 1 each in Michigan and Texas), along with the insurance products offered by those payers, which reduces comparability across states. The use of a distributed data model presents its own set of potential limitations, since no part of the team had access to claim-level data for all of the states. To address this issue, we created quality control reports to ensure that outputs were accurate and standardized across sites.

Another limitation was our reliance on risk adjustment using solely age and gender indicators, rather than risk adjustment that included other demographic characteristics (income, education, etc) or information on medical conditions. Diagnosis-related risk adjustment has practical implications and may introduce additional biases;18,19 we sought to demonstrate the viability of the distributed data model before compounding the complexity and cost of data aggregation by implementing a risk adjustment methodology. In future research, we plan to compare results using various risk adjustment methods. Lastly, a potential limitation to this study was our use of HSAs as the unit of analysis. HSAs were defined by Medicare utilization patterns in the 1990s and may not fully capture utilization patterns among commercial beneficiaries.11

Key factors contributing to the success of our pilot included the generation of quality control reports for data providers and the data aggregator to ensure data accuracy, as well as the use of mock commercial claims to test aggregation algorithms. While it is possible to apply our methodology using other levels of analysis, we continue to believe that HSAs represent the best option as they, unlike counties, reflect actual patterns of healthcare use, and likely provide more actionable information for community-level interventions than hospital referral regions.

Routine reporting of timely data at a local level can provide actionable information to communities hoping to understand the sources of healthcare spending and growth. This information will be useful for many purposes: to engage employers in reducing healthcare costs, encourage accountability for prices and utilization among providers, encourage community-level shared savings programs (such as the Akron Accountable Care Collaborative20), and facilitate research and evaluation of regional and national reforms. In addition, as consolidation in physician and hospital markets increases, these data can support transparency among health systems and aid regulators in monitoring the effects of changes in payment incentives that might drive consolidation, such as accountable care organizations or bundled payment reforms.

This research provides a unique avenue to develop a nationwide, comprehensive commercial data set in which payers create measures of cost and utilization from their own claims using standard practices, with controls for data quality and consistency. In addition, it allows payers to avoid releasing protected health information (in which patients may be identified) or pricing data, which is commercially valuable and possibly subject to antitrust concerns. The goal is for both large and small payers to contribute to an all-inclusive national data set for public reporting. The barriers to this approach are not trivial: developing it requires the cooperation and effort of multiple competitive payers, as well as entrusting the protection of pricing information to a third party.


The authors are indebted to Sarah Kler of The Dartmouth Institute for Health Policy and Clinical Practice for assistance with mapping, as well as Greg Kotzbauer and Andrew Toler of The Dartmouth Institute, Robyn Rontal of Blue Cross Blue Shield of Michigan, and Marianne Udow-Phillips of the Center for Healthcare Research and Transformation for their insights and advice.

In addition, the authors would like to acknowledge the following entities as the original sources for the New Hampshire and Vermont data sets: the New Hampshire Department of Health and Human Services; the New Hampshire Insurance Department; the Vermont Department of Financial Regulation (formerly the Vermont Department of Banking, Insurance, Securities, and Health Care Administration); and the Green Mountain Care Board. The analyses and conclusions drawn from these data do not necessarily represent those of any agency of the state of New Hampshire or state of Vermont.

Author Affiliations: The Dartmouth Institute for Health Policy and Clinical Practice (CHC, WLS, DJG, ABM, ESF), the Department of Medicine (ESF), and the Department of Community and Family Medicine (ESF), Geisel School of Medicine at Dartmouth, Lebanon, NH; Blue Cross Blue Shield of Michigan, Detroit, (PGA); Center for Healthcare Research and Transformation, Ann Arbor, MI (NB); Onpoint Health Data, Portland, ME (KF, RS); Division of Management, Policy and Community Health, University of Texas School of Public Health, Houston (LF, RP); and HealthPartners, Inc, Bloomington, MN (GK, SK). Funding Source: This work was funded by grant 69088 from the Robert Wood Johnson Foundation.

Author Disclosures: The authors report no relationship or financial interest with any entity that would pose a conflict of interest with the subject matter of this article.

Authorship Information: Concept and design (CHC, DJG, ABM, GK, SK, ESF); acquisition of data (CHC, DJG, ABM, PGA, NB, KF, LF, GK, SK, RS, ESF); analysis and interpretation of data (CHC, WLS, DJG, PGA, KF, LF, GK, SK, RP, RS); drafting of the manuscript (CHC, WLS, ABM, GK, SK, ESF); critical revision of the manuscript for important intellectual content (CHC, WLS, DJG, ABM, NB, LF, GK, SK, ESF); statistical analysis (CHC, WLS, GK, SK, RP); provision of study materials or patients (CHC); obtaining funding (CHC, ESF); administrative, technical, or logistic support (CHC, WLS, DJG, PGA, KF, GK, SK, RP, RS); and supervision (CHC, ESF).

Address correspondence to: Carrie H. Colla, PhD, 35 Centerra Pkwy, Lebanon, NH 03766. E-mail:

1. Where regional health improvement collaboratives are located. Published 2012. Accessed August 7, 2013.

2. Chernew ME, Sabik LM, Chandra A, Gibson TB, Newhouse JP. Geographic correlation between large-firm commercial spending and Medicare spending. Am J Manag Care. 2010;16(2):131-138.

3. Franzini L, Mikhail OI, Skinner JS. McAllen And El Paso revisited: Medicare variations not always reflected in the under-sixty-five population. Health Aff (Millwood). Dec 2010;29(12):2302-2309.

4. IOM Committee on Geographic Variation in Health Care Spending and Promotion of High-Value Care. Report: Variation in Health Care Spending: Target Decision Making, Not Geography. Washington, DC: Institute of Medicine of the National Academies; 2013.

5. Chappel A. Multi-Payer Claims Database (MPCD) for Comparative Effectiveness Research. Presentation to the NCVHS Full Committee, June 16, 2011. Accessed August 7, 2013.

6. All-Payers Claims Database Council. Interactive State Report Map. APCD Council website; Updated 2013. Accessed July 15, 2013.

7. Quality Alliance Steering Committee. Homepage. QASC website. Updated 2013. Accessed August 6, 2013.

8. Gottlieb DJ, Zhou W, Song Y, Andrews KG, Skinner JS, Sutherland JM. Prices don’t drive regional Medicare spending variations. Health Aff (Millwood). 2010;29(3):537-543.

9. Herrera CN, Gaynor M, Newman D, Town RJ, Parente ST. Trends underlying employer-sponsored health insurance growth for Americans younger than age sixty-five. Health Aff (Millwood). 2013;32(10):1715-1722.

10. Rusin G. Total Cost of Care and Value Based P4P. Presentation of Integrated Healthcare Association, March 19, 2012. Accessed October 28, 2013.

11. Appendix on the geography of health care in the United States. In: The Dartmouth Atlas of Health Care. Lebanon, NH: Dartmouth College; 1999. 289-307.

12. HealthPartners. Total Cost of Care Population-based PMPM Index. National Quality Forum website. 1604. Published January 31, 2012. Accessed June 2013.

13. The Dartmouth Atlas Working Group. The Dartmouth Atlas of Health Care. Hanover, NH: The Dartmouth Institute for Health Policy & Clinical Practice; 201;. Accessed June 2013.

14. American Community Survey 2012 Data Release. U.S. Census Bureau. Accessed June 2013.

15. Consumer Price Index. Bureau of Labor Statistics. Accessed June 2013.

16. Dunn A, Shapiro AH, Liebman E. Geographic variation in commercial medical-care expenditures: a framework for decomposing price and utilization. J Health Econ. 2013;32(6):1153-1165.

17. Baker LC, Fisher ES, Wennberg JE. Variations in hospital resource use for Medicare and privately insured populations in California. Health Aff (Millwood). 2008;27(2):w123-134.

18. Song Y, Skinner J, Bynum J, Sutherland J, Wennberg JE, Fisher ES. Regional variations in diagnostic practices. N Engl J Med. 2010;363(1):45-53.

19. Wennberg JE, Staiger DO, Sharp SM, et al. Observational intensity bias associated with illness adjustment: cross sectional analysis of insurance claims. Br Med J. 2013;346:f549.

20. Austen BioInnovation Institute’s Accountable Care Community Gains National Support to Serve Health Needs in Akron, Summit County. Published September 27, 2011. Accessed August 6, 2013.

This research demonstrates the feasibility of 2 broad approaches for overcoming these barriers. In future work, the same approach could be used to report claims-based quality measures, ensuring public access to local data on both cost and quality.
