Statewide Data Infrastructure Supports Population Health Management: Diabetes Case Study

Craig Jones, MD; Mary Kate Mohlman, PhD; David Jorgenson, MS; Karl Finison, MA; Katie McGee, MS; and Hans Kastensmith

States and healthcare organizations across the country are testing strategies intended to control the growth of healthcare costs while improving care quality and health outcomes. To achieve these goals, delivery systems must address the medical, social, and behavioral needs of populations with high-cost, complex conditions, particularly as upstream preventive services could preclude expensive complications.1,2 A blend of population-level data and innovative service models should be used to engage patients, address risk factors for poor outcomes, improve management of chronic disease, and reduce unnecessary acute care expenditures.3-7

Increasingly, Vermont is producing population-level data to support a learning health system, better care, and more informed oversight.8 The state’s data infrastructure includes an all-payer claims database (APCD) containing data from major commercial insurers, Medicaid, and Medicare. It also includes a statewide health information exchange network (HIEN) funded by the state of Vermont and developed and managed by the Vermont Information Technology Leaders (VITL), an independent nonprofit organization established in statute.9 One of the roles of the HIEN is to populate a state-maintained clinical registry with data from electronic health record (EHR) systems used in primary care medical homes and hospitals across the state. Every 6 months, extracts from the claims and clinical data systems are linked at the patient level to evaluate comparative performance in expenditures, utilization, quality, and clinical outcomes.10,11

This study examined how linked claims and clinical data from Vermont’s statewide data systems could inform a population care and health management model for high-cost, complex patients with common chronic conditions. The 3 objectives were: 1) developing a model to identify subpopulations through a set of selection criteria (ie, modifiable clinical risk factors and manageable comorbid conditions) regularly tracked in medical records for which primary care providers could improve preventive care, 2) estimating predictable impacts on healthcare costs upon achieving model goals within a 1- to 2-year period, and 3) setting care goals based on established guidelines and available treatments for selection criteria.

Individuals with diabetes in Vermont were selected as a subpopulation due to the disease’s prevalence, overlap with other common comorbid conditions, and impact on health and healthcare costs.12,13 Furthermore, the disease course of diabetes can be modified by addressing routinely tracked risk factors, such as glycemic control, blood pressure, and body mass index (BMI), and by improving control of common comorbid conditions with available treatments.14-17 In developing the model for a population with diabetes, this study demonstrated how statewide health data and information infrastructure could be used to identify patient subpopulations for targeted outreach and panel management and to improve proactive, preventive care. 



To achieve the objectives, the study evaluated whether glycemic control was associated with same-year expenditures. It also examined which clinical risk factors and comorbid conditions had strong associations with same-year expenditures and could serve as selection criteria for outreach. Finally, the study determined the rate of hospital admissions due to ambulatory care sensitive conditions (ACSCs)18 for those with the selection criteria and calculated the financial impact of reducing admissions through better care.

Data Sources and Study Population

Vermont’s APCD, which includes commercial, Medicaid, and Medicare eligibility and medical and pharmacy claims data for residents of Vermont, served as the study’s source for claims data. Detailed descriptions of this database have been published previously.19,20 A statewide clinical data registry provided clinical data from practice and hospital EHRs and included height, weight, blood pressure, and results from glycated hemoglobin (A1C) tests. Through a linkage process, claims data was paired with corresponding clinical data at the individual level.

The study population consisted of patients with diabetes identified by claims from January 1, 2014, to December 31, 2014, according to the Healthcare Effectiveness Data and Information Set specifications outlined in the Comprehensive Diabetes Care measure.21 These specifications included members aged 18 to 75 years with continuous enrollment during the study year and at least 1 acute inpatient visit or 2 outpatient visits indicating diabetes or who were dispensed insulin, hypoglycemics, or antihyperglycemics in the year prior to or during the study year. The study population was further limited to those whose claims data could be linked to clinical measures. 

Risk Factor and Outcome Measures

Risk factors and comorbidities in claims data included age, gender, disability status (Medicare), insulin use, and ACSCs, such as chronic obstructive pulmonary disease (COPD), asthma, congestive heart failure (CHF), coronary heart disease (CHD), depression, and renal failure. Clinical risk factors included BMI, blood pressure (diastolic and systolic), and A1C test results. Preliminary analysis indicated little variation in expenditures among BMI levels ≥18.5 and <35 (the “normal,” “overweight,” and “obese (class 1)” BMI categories).22 Thirty-five members classified as underweight (ie, BMI <18.5) were excluded from the analysis because confounding attributes, such as a high prevalence of renal failure, resulted in expenditure and admission trends inconsistent with those of other members with BMI <35. Therefore, members with BMI ≥35 (severe obesity classes 2 and 3) were compared with those with a BMI ≥18.5 and <35. Members with blood pressure not in control were grouped into 3 categories: high (systolic ≥140 mm Hg or diastolic ≥90 mm Hg), low (systolic ≤90 mm Hg or diastolic ≤60 mm Hg), and discordant (systolic ≥140 mm Hg and diastolic ≤60 mm Hg). Members with A1C not in control were classified as having high A1C (>9%). Other A1C categories included mid-range (>6% but ≤9%) and low (≤6%). The low-A1C group was identified through clinical data, which showed high costs associated with low A1C values, as seen in Figure 1. Findings in the literature indicated elevated risks among patients with diabetes with lower A1C and comorbidities.23,24
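
The category thresholds above can be written as a small classifier. The following is a minimal Python sketch, not the study's SAS code; the function names and category labels are hypothetical. One assumption is made explicit: a discordant reading also meets the stated thresholds for both the high and low categories, so the sketch checks the discordant rule first.

```python
def classify_blood_pressure(systolic, diastolic):
    """Classify one blood pressure reading (mm Hg) using the stated thresholds."""
    # Assumption: discordant takes precedence, because such readings also
    # satisfy the thresholds given for both the high and the low categories.
    if systolic >= 140 and diastolic <= 60:
        return "discordant"
    if systolic >= 140 or diastolic >= 90:
        return "high"
    if systolic <= 90 or diastolic <= 60:
        return "low"
    return "in control"


def classify_a1c(a1c_percent):
    """Classify an A1C result (%) as high (>9), mid-range (>6 to 9), or low (<=6)."""
    if a1c_percent > 9:
        return "high"
    if a1c_percent > 6:
        return "mid-range"
    return "low"


def classify_bmi(bmi):
    """Classify BMI into the comparison groups described in the text."""
    if bmi < 18.5:
        return "underweight (excluded)"
    if bmi >= 35:
        return "severe obesity"
    return "reference"  # BMI >= 18.5 and < 35
```

For example, a reading of 150/55 mm Hg is labeled discordant, an A1C of exactly 9% falls in the mid-range group, and a BMI of exactly 35 falls in the severe obesity group.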

Outcome measures for this study came from the claims data. Total expenditures encompassed allowed amounts on claims, including amounts paid by the insurer and member (eg, coinsurance, deductible, co-pays) for services in all settings (inpatient, outpatient facility, professional, ambulance, and pharmacy) and for durable medical equipment. Although Vermont Medicaid covers special nonmedical services targeted at meeting social, economic, and rehabilitative needs (eg, transportation, home- and community-based services, case management, dental services, residential treatment, mental health facilities, and school-based services), claims for these services were excluded to maintain consistency with services covered by other payers.

To analyze potential opportunities to reduce utilization and cost, acute inpatient hospitalizations and associated expenditures were measured for conditions listed under the Prevention Quality Indicators (PQIs) from the Agency for Healthcare Research and Quality, which included admissions for diabetes short-term complications, perforated appendix, diabetes long-term complications, COPD, asthma, hypertension, heart failure, dehydration, bacterial pneumonia, urinary tract infection, angina without procedure, uncontrolled diabetes, asthma, and lower extremity amputations.18 When guideline-based outpatient and preventive care is provided, hospitalization for these conditions is reduced.25 

Analytical Methods

Because the study population was limited to patients with diabetes with claims linked to clinical data, we compared the demographic characteristics of this group with those of patients with diabetes with only claims data to assess whether limiting the sample population to those with linked data created a selection bias. Pearson’s χ² test was used to compare the 2 groups.

Multivariable linear regression was used to determine the relative impact of each risk factor on expenditures. To reduce the impact of extreme outlier cases and to correct for skew in the expenditures, a log transformation was applied. For predictor variables, claims data provided information about demographics, payer type, and prevalence of comorbidities. The clinical data provided the most recent blood pressure, BMI, and A1C records. Using the full set of variables, stepwise selection identified which predictor variables to include in the regression model. 
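
To illustrate why a log transformation yields proportional effects, the sketch below handles the simplest case: a single binary risk factor. In that case the ordinary least squares coefficient on log expenditures equals the difference in mean log expenditures between the two groups. Exponentiating the coefficient to obtain a relative effect is an assumption here, since the text does not state the exact conversion used. The study's model was multivariable and fitted in SAS; the function name and the cost values are hypothetical.

```python
import math


def relative_effect(at_risk_costs, baseline_costs):
    """Relative effect of one binary risk factor on log-transformed costs.

    With a single binary predictor and an intercept, the OLS coefficient on
    log(cost) is the difference between the group means of log(cost).
    Assumption: the relative effect is exp(coefficient).
    """
    mean_log_at_risk = sum(math.log(c) for c in at_risk_costs) / len(at_risk_costs)
    mean_log_baseline = sum(math.log(c) for c in baseline_costs) / len(baseline_costs)
    coefficient = mean_log_at_risk - mean_log_baseline
    return math.exp(coefficient)


# Hypothetical annual expenditures (dollars) for illustration only.
effect = relative_effect([2000.0, 8000.0], [1000.0, 4000.0])
print(f"relative effect: {effect:.2f}")
```

In this toy example every at-risk cost is double its baseline counterpart, so the relative effect is approximately 2, that is, a 100% relative increase.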

The coefficients from the final regression model were used to calculate the relative effect on costs associated with each risk factor. Relative effects indicated a proportional change from the baseline category to the at-risk category while controlling for all other variables in the analysis. The additional cost associated with a particular risk factor was estimated as the difference between the summed total costs of the population with that risk factor and those same summed costs divided by the relative effect (ie, the expected costs if the same population did not have the risk factor or condition), while controlling for other factors. This relationship is reflected in the expression below, where x refers to the target risk factor:
\[
\sum \text{Costs}_x \;-\; \frac{\sum \text{Costs}_x}{\text{Relative Effect}_x}
\]
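
The expression above can be evaluated directly. The following is a minimal Python sketch with a hypothetical function name and a hypothetical cost total; only the 1.16 relative effect (a 16% relative increase, as reported for A1C >9%) comes from the study.

```python
def excess_cost(total_costs, relative_effect):
    """Additional cost associated with a risk factor.

    total_costs: summed expenditures of the population with the risk factor.
    relative_effect: proportional cost ratio versus baseline (1.16 = +16%).
    """
    if relative_effect <= 0:
        raise ValueError("relative_effect must be positive")
    # Costs expected if the same population did not have the risk factor.
    counterfactual_costs = total_costs / relative_effect
    return total_costs - counterfactual_costs


# Hypothetical total of $1,160,000 with a relative effect of 1.16.
example = excess_cost(1_160_000.0, 1.16)
print(f"additional cost: ${example:,.0f}")
```

With these inputs the counterfactual costs are about $1,000,000, so the additional cost attributed to the risk factor is about $160,000. A relative effect of 1 yields no additional cost.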

A Poisson regression used the same predictor variables identified in the cost model with inpatient admissions as the response variable to estimate the relative effect of the risk factors on inpatient admissions. 

Potential savings from a reduction in ACSCs were determined using a 20% and 50% reduction in these hospitalizations, assuming the mean cost per hospitalization for eliminated hospitalizations was the same as the overall mean cost per hospitalization. These reductions were calculated for the entire study population and independently for each risk factor subpopulation. 
All statistical analysis was done with SAS version 9.3 (SAS Institute Inc; Cary, North Carolina). 
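
The savings estimate described above reduces to a single multiplication under the stated assumption that eliminated hospitalizations cost the same, on average, as all hospitalizations. The following is a minimal Python sketch, not the SAS analysis. The function name and the mean cost per admission are hypothetical; the count of 341 ACSC admissions is the total reported for the study population in the results.

```python
def acsc_savings(acsc_admissions, mean_cost_per_admission, reduction):
    """Estimated savings from avoiding a fraction of ACSC admissions.

    Assumes each avoided admission would have cost the overall mean.
    """
    if not 0 <= reduction <= 1:
        raise ValueError("reduction must be between 0 and 1")
    return acsc_admissions * reduction * mean_cost_per_admission


ACSC_ADMISSIONS = 341              # reported for the study population
HYPOTHETICAL_MEAN_COST = 10_000.0  # illustrative value, not from the study

savings = {
    reduction: acsc_savings(ACSC_ADMISSIONS, HYPOTHETICAL_MEAN_COST, reduction)
    for reduction in (0.20, 0.50)
}
for reduction, amount in savings.items():
    print(f"{reduction:.0%} reduction: ${amount:,.0f}")
```

The same function can be applied to each risk factor subpopulation by passing that group's ACSC admission count and mean cost per admission.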


Association Between Glycemic Control and Expenditures

Between January 1, 2014, and December 31, 2014, claims data identified 283,153 individuals attributed to patient-centered medical homes in Vermont. Of these, 19,000 (6.7%) were categorized as having diabetes. Of this population, 6719 (35.4%) had clinical data from the same time period that could be linked to claims data. Table 1 shows the demographic characteristics of the patients with diabetes with linked data and those without linked data. The 6719 individuals with clinical data were similar to the unlinked population according to proportions seen in demographic characteristics and 3M clinical risk groups (CRGs) (Table 1). Although P values indicate statistically significant differences, this finding is likely due to large sample sizes that can exaggerate minor differences. 

During this 1-year period, per capita total healthcare expenditures for patients with diabetes varied substantially but had little correlation with control of diabetes as measured by the most recent A1C result (“not insulin dependent” R² <0.001; “insulin dependent” R² = 0.01) (Figure 1). Overall, total medical expenditures averaged $14,948 per patient: $16,644 for patients with A1C ≤6%, $14,230 for patients with A1C >6% and ≤9%, and $16,484 for patients with A1C >9%. Notably, 23% of the population with A1C >6% and ≤9% had expenditures in the highest quartile, whereas 19% of the population with A1C >9% had expenditures in the lowest quartile. There was similar variation for those on insulin therapy: 25% of patients with A1C >6% and ≤9% had expenditures in the highest quartile of this subpopulation and 32% with A1C >9% had expenditures in the lowest quartile.

Diabetes and Comorbidities

With the poor correlation between same-year expenditures and A1C results, common comorbid conditions and clinical risk factors (ie, blood pressure, BMI, A1C, insulin dependence, asthma, COPD, CHF, CHD, renal failure, and depression) became the next focus of analysis. The relative influence on per capita annual expenditures and inpatient hospital admissions is shown for patients with diabetes with each characteristic, controlling for the other factors (Figures 2 and 3). Comorbidities that had the largest relative impact on expenditures included, in descending order, renal failure, CHF, insulin dependence, COPD, and discordant blood pressure. Comorbidities that had the largest relative impact on inpatient hospital admissions were, in descending order, CHF, renal failure, discordant blood pressure, and COPD. 

Analysis showed that the total financial impact of each characteristic depended on the size of the cohort with that characteristic. For example, A1C >9% was associated with a 16% relative increase in annual expenditures per patient, which, when aggregated across the 726 patients identified as having poor control of A1C, resulted in an estimated additional cost of $1,560,931. By comparison, CHF affected fewer patients (79) but was associated with a 144% relative increase in per patient expenditures, yielding an increase in aggregate expenditures of $2,102,368.

Inpatient hospital admissions were the largest contributor to total annual expenditures, and each of the subpopulations identified by the selection criteria had hospital admissions due to ACSCs. During the 12-month study period, the population with diabetes accounted for 1384 hospital admissions, with 341 (24.6%) due to ACSCs. Based on outcomes from PQI measures, each subpopulation had hospital admissions for ACSCs that were related to the selection criteria, as well as for ACSCs that were not directly related to the selection criteria (data not shown). For example, the 233 patients identified as having diabetes and COPD had a total of 118 hospital admissions, with 56 (47.4%) due to ACSCs, 34 (60.7%) of which were due to respiratory complications. In another example, the 726 patients identified as having diabetes and A1C >9% had a total of 187 hospital admissions, with 68 (36.3%) due to ACSCs, 38 (55.9%) of which were recorded as directly related to poorly controlled diabetes. 

Table 2 provides details on overall hospital admissions, admissions due to ACSCs, admission expenditures, and estimated savings if ACSC hospital admissions were reduced by 20% and 50%. For several subpopulations, the proportion of hospital admissions due to ACSCs was higher than the overall average of 25%, including 47% of 118 admissions for patients with COPD, 42% of 81 admissions for patients with CHF, 36% of 187 admissions for patients with A1C >9%, 31% of 206 admissions for patients with blood pressure <90/60 mm Hg, and 29% of 189 admissions for patients with renal failure.


The results of this study demonstrate the power of combining traditionally separate statewide clinical and multipayer claims data. Specifically, they demonstrate how this combined dataset can be used to meet the study’s objectives, namely identifying subpopulations for outreach through practical selection criteria, determining the relative contribution of known risk factors to actual expenditures, and setting attainable care goals to reduce preventable admissions.26 Although individual organizations may have analytic capabilities to support similar modeling for their own populations, a statewide multipayer data infrastructure allows for the identification of health patterns with significant associations not readily apparent in smaller populations. This type of information becomes particularly important as Vermont’s independent practices, hospitals, and health centers work toward an accountable care framework with shared interests to control costs and improve quality.1,27 In this context, the study was designed to generate results that support proactive rather than reactive care by the patient-centered medical homes, specialists, and other community providers that are increasingly working together to improve coordination, quality, and population health outcomes.20

Diabetes provided a useful test case of a complex condition that affects 6.7% of the medical home population aged between 18 and 75 years in Vermont. Furthermore, it has a disease course modifiable through guideline-based management that addresses risk factors such as glycemic control, blood pressure, lipid levels, diet, and obesity.28 Although A1C level is an indicator of sustained glycemic control and long-term complications and has a positive association with health costs over time,14,29-36 this study suggests a more complex relationship between glycemic control and total healthcare expenditures in the same year.37 Figure 1 shows that one-fifth of patients with poorly controlled diabetes had total healthcare costs in the lowest quartile, whereas almost a quarter of those with an A1C level in the recommended range had annual healthcare expenditures in the top quartile. Although long-term glycemic control can reduce complications and potentially avoid some healthcare costs,15,37 this study highlights the opportunity to identify care management targets to improve health outcomes and control costs in the near term.

The selection criteria examined for their relative cost impact offer several advantages for identifying target subpopulations. First, they are commonly tracked data elements readily available to identify patient panels in most care settings with an EHR system. Second, outcomes for these comorbid conditions can be improved through recommended treatments and better disease control. For example, improved glycemic, blood pressure, and weight control can lower rates of cardiovascular, renal, and lower limb complications for patients with diabetes.28 Similarly, guideline-based management for asthma, COPD, CHF, and depression has been shown to improve health outcomes and reduce morbidity, including rates of hospital-based care.38-40 Third, diagnosis coding indicates that the subpopulations identified by the selection criteria had potentially avoidable hospital admissions both accounted and not accounted for by the ACSC measure. This finding suggests that identifying this subpopulation presents an expanded opportunity for further reducing admissions and expenditures by improving disease management and preventive care. 

An important next step is to prospectively test the use of the selection criteria to guide outreach, care management, and coordination and to determine whether the near-term utilization and financial goals can be achieved in 1 to 2 years for a community or regional population. 


This study had several limitations. First, clinical data was available for only 6719 (35%) of the 19,000 patients with diabetes and claims data in the APCD. More complete data may or may not alter the findings, although the likelihood that the findings would change is lessened given the demographic and health status similarities between the groups with clinical data and those without. This conclusion can be confirmed as the state continually and systematically improves the scale and quality of available clinical data. Second, the selection criteria for identifying subpopulations were based on their relative association with increased expenditures and utilization. Although this is valuable for targeting utilization and cost drivers, this approach may not be the best for improving long-term health and wellness. However, these selection criteria can guide preventive care with the potential for near- and long-term positive impacts.


The findings in this study showed how Vermont’s policy decisions are translating into a data utility that supports a high-performing health system. In addition to its efforts of statewide multipayer delivery system reform through the Vermont Blueprint for Health program,20 the state’s commitments to developing and maintaining an APCD and an HIEN have established the data infrastructure to support the work of primary care medical homes and community health teams while improving coaching and transformation support in each service area.10 A culture of data use and shared learning continues to emerge across the state as comparative performance profiles guide ongoing improvement activities. In effect, the state’s policy decisions are steadily leading toward a data utility that can be used to drive better care and lower costs for all its residents. Infrastructure of this scale and scope would likely not be developed without sustained public commitment. 
The authors want to acknowledge Vermont’s policy leaders who in 2008, in a bipartisan effort, initiated a health information technology fund and sustainable path for developing state data infrastructure. Those with key roles include Governor Jim Douglas, Secretary of Civil and Military Affairs Heidi Tringe, Secretary of Administration Mike Smith, Director of Healthcare Reform Susan Besio, Chair of Senate Appropriations Susan Bartlett and subsequent Chair of Senate Appropriations Jane Kitchel, Member Senate Health Committee Kevin Mullin, and Chair of House Health Committee Steve Maier. They also want to recognize the vision, leadership, and passion provided by Vermont’s Director for Health Information Technology at the time, their recently deceased and dearly missed friend, Hunt Blair. Most importantly, they want to acknowledge the deep commitment by the clinicians and office staff in Vermont’s patient-centered medical home practices to complete, high quality data. The authors especially want to highlight the work of Laurel Ruggles, Joyce Dobertin, MD, and the providers in St. Johnsbury, Vermont, who worked with the Blueprint on the first data quality “sprint” in Vermont, an effort planned for 4 weeks that took 26. Finally, they want to acknowledge the information technology staff working in Vermont’s hospitals and health centers, who have worked diligently to improve the completeness and quality of clinical data that is being transmitted into VITL’s HIEN and the Blueprint’s registry.