Gregory B. Steinberg, MB, BCh; Bruce W. Church, PhD; Carol J. McCall, FSA, MAAA; Adam B. Scott, MBA; and Brian P. Kalis, MBA
The growing prevalence of metabolic syndrome in the United States, and globally, is alarming. Metabolic syndrome is generally defined as having 3 or more of 5 common biological measures out of range: large waist circumference, high blood pressure, elevated triglycerides, low high-density lipoprotein (HDL), and increased insulin resistance. Analysis1 suggests that almost one-third of US adults, or approximately 80 million people, meet the Adult Treatment Panel III criteria for metabolic syndrome, with prevalence increasing significantly with age and body weight.2 An additional 45%, or approximately 104 million people, have 1 or 2 risk factors for developing metabolic syndrome.
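The "3 or more of 5" rule can be made concrete with a minimal Python sketch. It assumes each factor has already been judged in or out of range, so no clinical cutoffs are encoded here. The class name, field names, and group labels are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class RiskFactorFlags:
    """True means the factor is out of range (cutoffs are applied upstream)."""
    waist_circumference: bool
    blood_pressure: bool
    triglycerides: bool
    hdl: bool
    insulin_resistance: bool

    def count_out_of_range(self) -> int:
        # Booleans sum as 0/1, giving the number of out-of-range factors.
        return sum(astuple(self))


def classify(flags: RiskFactorFlags) -> str:
    """Map a factor count to an illustrative group label."""
    n = flags.count_out_of_range()
    if n >= 3:
        return "metabolic syndrome"
    if n >= 1:
        return "at risk"  # 1 or 2 factors out of range
    return "no risk factors"
```

For example, a person with only blood pressure and triglycerides out of range has 2 factors and falls in the "at risk" group rather than meeting the definition.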
These trends have profound clinical and financial implications. Individuals with metabolic syndrome are twice as likely to develop cardiovascular disease and 5 times more likely to develop diabetes mellitus, both of which mean higher than average annual healthcare costs. Workplace participation and productivity of individuals with metabolic syndrome are also negatively impacted.3
Health insurance companies have large quantities of data relevant to metabolic syndrome, including demographic data, diagnosis and procedure claim data, lab results, prescription data, and care management program data. Using “big data analytics” to interrogate large, complex data sets can generate meaningful insights about individuals with or at risk of developing metabolic syndrome.
We applied a proprietary “big data” analytic platform, Reverse Engineering and Forward Simulation (REFS), to the data set of 1 of Aetna’s larger nationwide retail customers and calculated:
The subsequent risk of metabolic syndrome, both overall and by metabolic syndrome risk factor, at both a population and individual level
The impact of incremental changes in risk factors on the overall subsequent risk of metabolic syndrome and on costs
The impact of adherence to medications and to routine, scheduled outpatient doctor visits on the subsequent risk of metabolic syndrome.
Big data analytic techniques of this type rapidly yield insights that support data-driven, targeted interventions for people with or at risk of developing metabolic syndrome. Aetna is currently piloting an intervention program based upon the results.
METHODS
The REFS platform is best used to analyze and simulate large, dynamic, multisource data sets. The platform learns by reverse engineering ensembles of models that represent the diversity of processes consistent with the data and then simulating nonparametric knowledge representations to generate accurate, granular group and individual predictions that are both actionable and generalizable. Accurate insights from available data can be generated within a few months, and new data are easily integrated. The speed-to-insight allows care providers to develop effective therapeutic programs and interventions quickly and cost-effectively, ultimately lowering the cost to serve the affected populations.
Data Sources
Data for this study were gathered from:
Insurance eligibility records
Comprehensive Metabolic Syndrome Screening (CMSS) results
Health risk assessment (HRA) responses
The CMSS results provided the core outcome variables for the study, and measured each of the 5 metabolic syndrome factors (including systolic and diastolic blood pressure). Screenings were conducted twice: once at the beginning of 2011 and again in early 2012, for an initial cohort of 59,605 people. We then restricted the study to participants for whom we had: complete coverage records from January 1, 2010, through December 31, 2011; complete data from medical claims, pharmacy claims, or test lab results for 2010 and 2011; and valid responses to a small set of HRA questions. This resulted in a study population of 36,944, which was then randomly assigned to either an 80% training set (N = 29,527) or a 20% test set (N = 7417). The study population metabolic syndrome risk and medical cost profile is found in Figure 1. Additional demographic detail is found in eAppendix Figure 1.
Variable Creation and Definitions
The 4291 variables in the analysis spanned 6 different data categories. The specific breakdown of data categories is found in eAppendix Table 1. Continuous variables were discretized into ranges in preparation for modeling with multivariate categorical models. The ranges of the CMSS factors were constructed from metabolic syndrome out-of-range boundaries and other clinically relevant boundaries.
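As an illustration of this kind of discretization, the following Python sketch maps a continuous value to the index of the range that contains it. The triglyceride boundaries are illustrative assumptions only: 150 mg/dL is the usual out-of-range cutoff, and the other two values stand in for "other clinically relevant boundaries". The ranges actually used in the study are not given in the text, and the function and constant names are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence

# Illustrative boundaries in mg/dL (an assumption, not the study's ranges).
TRIGLYCERIDE_BOUNDARIES = [150.0, 200.0, 500.0]


def discretize(value: float, boundaries: Sequence[float]) -> int:
    """Return the 0-based index of the range containing `value`.

    `boundaries` must be sorted ascending. Range 0 is below the first
    boundary, and a value equal to a boundary falls in the higher range,
    so k boundaries produce k + 1 ranges.
    """
    return bisect_right(boundaries, value)


def range_label(index: int, boundaries: Sequence[float]) -> str:
    """Build a readable label for a range index, e.g. '[150.0, 200.0)'."""
    if index == 0:
        return f"< {boundaries[0]}"
    if index == len(boundaries):
        return f">= {boundaries[-1]}"
    return f"[{boundaries[index - 1]}, {boundaries[index]})"
```

With these boundaries, a reading of 180 mg/dL falls in range 1, labeled `[150.0, 200.0)`. Treating each range index as a category is what lets continuous measurements enter a categorical model.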
Demographics captured 5 dimensions in addition to gender: age, body mass index (BMI), ethnicity, cigarette usage, and sleep. In addition, 4 event types were defined from claims: diagnoses, procedures, provider specialty, and prescriptions. Further detail regarding demographics and events is found in eAppendix Figure 1. An indicator variable identified the year in which an event occurred.
1. Lab results. Results from 24 common lab tests (as identified by Logical Observation Identifiers Names and Codes number) were extracted for each year. Results were discretized into up to 7 ranges.
2. Biometrics. For each of the CMSS biometric screenings conducted, 6 variables were created (the 4 single-metric metabolic syndrome factors plus systolic and diastolic blood pressure values). The values were then divided into 7 ranges for blood pressure and 6 ranges for the remaining CMSS factors. In cases where the biometric corresponded to a lab test, the same discretization was used.
3. Medication adherence. We calculated a subject’s medication possession ratio (MPR) for 4 classes of medication: antidiabetics, antihyperlipidemics, antihypertensives, and other cardiovascular medications. More detailed information on the MPR calculation is found in eAppendix Table 1. An MPR of 80% or higher was considered adherent.4 For each year and each category of medication, a subject was categorized as: N/A (no prescriptions of that type), once and done (1 prescription of that type), not adherent, or adherent.
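The 4-way adherence categorization can be sketched in Python as follows. The study's exact MPR calculation is given only in its eAppendix, so this sketch assumes a common definition: total days' supply dispensed divided by the days in the period, capped at 1.0. The function name and parameters are hypothetical.

```python
from typing import Sequence


def adherence_category(
    days_supply_per_fill: Sequence[int],
    days_in_period: int = 365,
    threshold: float = 0.80,
) -> str:
    """Categorize one subject, one year, one medication class.

    Assumes MPR = total days' supply / days in period, capped at 1.0.
    This is an assumed definition, not the study's documented one.
    """
    if days_in_period <= 0:
        raise ValueError("days_in_period must be positive")

    fills = len(days_supply_per_fill)
    if fills == 0:
        return "N/A"  # no prescriptions of that type
    if fills == 1:
        return "once and done"  # a single prescription of that type

    mpr = min(sum(days_supply_per_fill) / days_in_period, 1.0)
    return "adherent" if mpr >= threshold else "not adherent"
```

Under this assumed definition, ten 30-day fills in a year give an MPR of 300/365, or about 0.82, which counts as adherent. Nine such fills give about 0.74, which does not.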