Using Health Outcomes to Validate Access Quality Measures
Julia C. Prentice, PhD; Michael L. Davies, MD; and Steven D. Pizer, PhD
A crucial piece of the Affordable Care Act legislation is the implementation of accountable care organizations (ACOs) that financially reward a group of providers for providing high-quality healthcare at low cost.1,2 ACOs may provide the greatest benefits to patients diagnosed with chronic disease due to their focus on providing high-quality, well-coordinated, and patient-centered care.1,2 Patients diagnosed with diabetes are likely to be targeted for disease management by ACO managers because of the high prevalence and significant monetary costs of managing diabetes complications. Approximately 17.5 million individuals in the United States have diabetes, and the total cost related to diabetes in 2007 was $174 billion.3
The success of ACOs relies on robust quality measurement.1,2 Policy makers have emphasized the need to complement process quality measures (eg, the percentage of patients with diabetes who receive glycated hemoglobin [A1C] tests) with patient-centered measures that demonstrate improved patient outcomes.2,4,5 Timely access to healthcare has a strong potential to meet this criterion.
Previous research suggests that access measures that do not rely on patient self-report reliably predict health outcomes. Studies using capacity wait time measures (eg, time to first next available appointment [FNA]) and comparing populations exposed to open access scheduling with controls have found patients who face shorter waits have higher primary care utilization, lower A1C levels, and a decreased risk of experiencing poor health outcomes such as mortality or preventable hospitalizations.6-10 Notably, in the context of ACOs, Prentice and colleagues6,7 found the effects of longer FNA waits were largest with the most vulnerable veterans—those who were older, had more comorbidities, or had higher A1C levels during baseline.
Adherence may be one factor that contributes to the observed relationship between wait times and outcomes. Several studies indicate that patients diagnosed with diabetes who are more adherent to their treatment plan experience better health outcomes and have lower healthcare cost.3,11,12 Cramer and colleagues13 found a “white coat” effect, with adherence to medication increasing right before and right after physician visits. Consequently, longer waits between visits may decrease a patient’s ability to get the most effective treatment and lead to lower overall medication adherence. Patients with diabetes often underestimate the importance of consistent treatment adherence,3,14 and longer waits between appointments will decrease opportunities for patient education.
Although the importance of timely access to healthcare is widely recognized, the best method of measuring timely access has yet to be determined. The Centers for Medicare & Medicaid Services are currently using patient surveys to evaluate ACOs’ ability to deliver healthcare as soon as it is desired.15 An alternative approach relies on administrative data from patient scheduling systems. For example, capacity measures (eg, the number of days until the first or third next available appointments) are commonly used when implementing Advanced Clinic Access.16,17 The Veterans Health Administration (VHA) found that capacity measures may not reflect the access limitations actually experienced, especially for returning patients who are typically trying to schedule follow-up appointments.18 Consequently, a variety of wait time measures have been developed. This study is the first to compare the abilities of these alternative measures of wait times to predict glycemic control among patients diagnosed with diabetes. Results could be used in Medicare and the private sector to improve upon the current quality metrics used by ACOs.
This study used administrative data from a wide variety of VHA and Medicare data sets. Please refer to Appendix A for an overview of all the data used. Using the VHA Pharmacy Benefits Management file, we chose all individuals who were prescribed a glucose-lowering medication in 2005 or 2006. These years were used as the baseline year for risk adjustment. Outcomes were measured starting in the year following baseline (2006 or 2007). American Diabetes Association guidelines at the time of the study called for A1C levels to be tested biannually for individuals exhibiting glycemic control and quarterly for individuals not in control.19-21 We split the outcome period into ten 6-month observation periods starting with January to June 2006 and ending with July to December 2010.
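The layout of the outcome window can be made concrete with a short sketch. The function name and the representation of each period as a pair of calendar dates are our own illustrative choices, not part of the study's data processing.

```python
from datetime import date

def observation_periods(first_year=2006, last_year=2010):
    """Return (start, end) date pairs for consecutive half-year periods.

    Hypothetical helper: each calendar year contributes a January-June
    period and a July-December period.
    """
    periods = []
    for year in range(first_year, last_year + 1):
        periods.append((date(year, 1, 1), date(year, 6, 30)))
        periods.append((date(year, 7, 1), date(year, 12, 31)))
    return periods

periods = observation_periods()
for start, end in periods:
    print(start.isoformat(), "to", end.isoformat())
```

Five calendar years with two periods each give the ten 6-month observation periods described above.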
To help ensure that we had complete claims data for baseline risk adjustment, we required the sample to be enrolled in Medicare fee-for-service plans. We excluded individuals who were enrolled in a Medicare health maintenance organization and veterans in higher-numbered priority groups (7 or 8), who might have had private insurance claims we could not access. Other exclusions included individuals who died during the outcome period, because A1C levels could have changed right before death; veterans in a Department of Veterans Affairs (VA) nursing home during the baseline period, who might not have relied on VA outpatient care; individuals with missing A1C levels during the baseline period; and individuals with missing race data. The final sample size was 195,842 people.
Types of Wait Time Measures
We obtained 5 distinct wait time measures from VHA scheduling system records from 2006 through 2010: (1) capacity (FNA); (2) retrospective create date (CD); (3) retrospective desired date (DD); (4) prospective CD; and (5) prospective DD. Table 1 describes how each of these measures was calculated, and Prentice and colleagues18 provide a detailed overview of the measures.
Briefly, the FNA uses the day an appointment is created as the starting point and measures the time between that day and the day the first available open appointment slot occurs. Individual patients may not actually want the FNA appointment because they are looking for a follow-up appointment in the future. This is likely to be common for returning patients who typically wish to schedule a follow-up. New patients are more likely to want to be seen as soon as possible.22
To overcome this limitation, the VHA developed time stamp measures that measured how long individual patients waited. Time stamp measures can use a CD or a DD as the date to start measuring waits (Table 1). The CD is the date that an appointment is created (ie, made) in the appointment system. The principal limitation of CD is that it measures the pattern of booking appointments. For example, suppose a patient comes in for a checkup and the patient and provider agree to schedule a follow-up appointment in 6 months. If the clinic creates the follow-up appointment on the day of the initial appointment (“on today”), the resulting measured wait time will be 6 months. Alternatively, the clinic might contact the veteran 5 months from “today” (1 month before the intended 6-month follow-up appointment) and create the intended 6-month follow-up appointment at that time, resulting in a measured wait time of 1 month. This limitation is surmounted through the DD time stamp measure that designates the ideal time “a patient or provider wants the patient to be seen.”23-25 In this example, the DD is the date of the 6-month follow-up appointment that the patient and provider agreed upon.
In addition to different start dates for CD or DD, time stamp measures can have different ending points. One ending point is the day an appointment is completed, resulting in a retrospective time stamp measure (completed appointment date minus CD or DD). A second ending point is a bimonthly snapshot of all pending appointments in the VHA, resulting in a prospective measure (pending appointment date minus CD or DD).
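The time stamp calculations can be illustrated with the 6-month follow-up example given earlier. This is a minimal sketch: the dates and variable names are hypothetical, and the code is not the VHA's implementation. A prospective measure would apply the same subtraction to the scheduled date of an appointment still pending at the snapshot.

```python
from datetime import date

def wait_days(start, end):
    """Days from a start date (CD or DD) to an appointment date."""
    return (end - start).days

# Hypothetical dates for the example in the text
checkup = date(2008, 1, 15)       # initial visit ("today")
follow_up = date(2008, 7, 15)     # agreed follow-up date; this is also the DD

# Clinic A creates the appointment on the day of the checkup
retro_cd_same_day = wait_days(checkup, follow_up)

# Clinic B creates it about 1 month before the follow-up
retro_cd_late = wait_days(date(2008, 6, 15), follow_up)

# The DD-based wait is zero because the patient was seen on the desired date
retro_dd = wait_days(follow_up, follow_up)

print(retro_cd_same_day, retro_cd_late, retro_dd)  # 182 30 0
```

The two CD-based waits differ by 5 months even though the patient's experience is identical, which is the booking-pattern limitation that the DD measure is meant to remove.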
Facility-Level Wait Time Measures Compared With Individual Wait Times
When computing any of these measures for use in outcomes models, it is tempting to calculate a wait time measure based on services an individual actually used. This approach is problematic because unobserved individual health status affects individual wait times as well as individual outcomes due to medical triage. Medical providers identify patients who are in poorer health when they call to request an appointment and refer these patients to clinics with shorter waits. Thus, individual health status is affecting individual wait times and potentially obscuring the effect of wait times on health status (for an example, see Prentice and Pizer9,10). Although statistical controls for observable differences in health status will reduce the severity of this problem, we are not able to measure health status precisely enough to eliminate it.6,7,9,10,26
To avoid this problem and isolate the effect of wait times on health, we computed facility-level averages for each wait time measure based on a fixed pattern of clinic utilization.6,7,9,10 Averages were calculated separately for new and returning patients. Missing wait times were imputed with zero when appropriate.6,7,9,10,26
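One plausible reading of averaging over "a fixed pattern of clinic utilization" is a weighted mean of clinic-level waits in which the weights are held constant across facilities and periods. The sketch below assumes that reading; the clinic names, visit counts, and wait values are invented for illustration and are not taken from the study.

```python
def facility_wait(clinic_waits, fixed_weights):
    """Weighted mean of clinic-level waits using weights that do not vary.

    clinic_waits: mean wait in days for each clinic type at one facility
    fixed_weights: reference visit counts per clinic type (assumed to be
                   the same for every facility and period)
    """
    total_weight = sum(fixed_weights[clinic] for clinic in clinic_waits)
    weighted_sum = sum(
        clinic_waits[clinic] * fixed_weights[clinic] for clinic in clinic_waits
    )
    return weighted_sum / total_weight

# Hypothetical reference utilization pattern (visit counts)
weights = {"primary_care": 60, "cardiology": 25, "eye": 15}

# Hypothetical clinic-level mean waits at one facility in one period
waits = {"primary_care": 20.0, "cardiology": 40.0, "eye": 30.0}

average_wait = facility_wait(waits, weights)
print(average_wait)  # 26.5
```

Because the weights are fixed, a facility's average changes only when its clinic-level waits change, not when sicker or healthier patients shift which clinics they use.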
Individual-Level Explanatory Variables for Risk Adjustment
The modeling and analytic strategy followed previous research that established the link between wait times and glycemic control.6 Individual-level explanatory variables included age, sex, and race from the Medicare Denominator File, distance to VHA care, and VHA priority status (1, 2, 3, 4, and 6 compared with 5). VHA policy generally provides preferential access to veterans in low-numbered priority groups due to service-connected disabilities, so wait times may affect these priority groups differently. Veterans in priority group 5 are low-income veterans with no service-connected disability, so they were distinguished from the other priority groups in our analyses.27 Longer driving distances to the source of care have been found to be associated with poorer glycemic control,28 and veterans with higher priority access likely experience shorter wait times.
Models were risk adjusted to control for observable differences in prior individual health status. We extracted diagnosis codes from all inpatient and outpatient encounters financed by VHA and Medicare during the baseline period (see Appendix A for data sources) and used the International Classification of Diseases, Ninth Revision, Clinical Modification diagnosis codes listed by Elixhauser and colleagues29 to define 28 comorbidity indicator variables, which included a wide variety of physical and mental conditions. The diabetes severity index developed by Young and colleagues30 was used to control for diabetes severity. This index included measures of complications from retinopathy, nephropathy, neuropathy, cerebrovascular disease, cardiovascular disease, peripheral vascular disease, and metabolic disease. To control for baseline A1C, we categorized the average A1C levels during the baseline year as lower than 7%, 7% to lower than 8%, and 8% or higher.
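The baseline A1C grouping can be sketched as a simple classifier. The function name and category labels are hypothetical, and placing a value of exactly 7% in the middle category is an assumption on our part.

```python
def baseline_a1c_category(mean_a1c):
    """Assign a baseline-year mean A1C (in percent) to one of 3 groups.

    Assumption: a value of exactly 7.0 falls in the middle group.
    """
    if mean_a1c < 7.0:
        return "lower than 7%"
    if mean_a1c < 8.0:
        return "7% to lower than 8%"
    return "8% or higher"

for value in (6.4, 7.0, 7.9, 8.0, 9.6):
    print(value, "->", baseline_a1c_category(value))
```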
Facility, Yearly, and Half-Year Fixed Effects
Models included dummy variables (fixed effects) for each facility to remove between-facility variation in wait times and outcomes.6 In effect, we compared the A1C level of an individual in 1 observation period with the A1C level of the same individual in other observation periods. This design eliminated concerns about permanent case-mix differences between facilities. Facility fixed effects also controlled for all aspects of facility quality that remain constant over time (eg, managerial inefficiencies).
We also included a dummy variable for January through June observations compared with July through December observations to control for any systematic variation in A1C between half-years, as well as yearly dummies to control for any overall increase or decrease in A1C levels over time. This statistical design, featuring a predetermined cohort of patients as well as time and facility fixed effects, means that any estimated relationship between waiting time and A1C level was identified exclusively by within-facility variations over time that were independent of national trends.
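In notation of our own choosing (not the authors'), the specification described in this section can be sketched as follows, with the wait time entering from the preceding 6-month period as described under Outcome and Analyses.

```latex
\[
\mathrm{A1C}_{ift} = \beta\, W_{f,t-1} + X_i'\gamma + \alpha_f + \lambda_{y(t)} + \theta\, H_t + \varepsilon_{ift}
\]
```

Here $i$ indexes individuals, $f$ facilities, and $t$ 6-month observation periods; $W_{f,t-1}$ is the facility-level wait time in the previous period; $X_i$ holds the baseline risk adjusters; $\alpha_f$ is the facility fixed effect; $\lambda_{y(t)}$ is the dummy for the calendar year containing period $t$; and $H_t$ indicates a January through June period. Because $\alpha_f$ absorbs all time-invariant facility differences and the time dummies absorb national trends, $\beta$ is identified only by changes in a facility's wait time over time.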
Outcome and Analyses
Data were analyzed using Stata version 10.0 (StataCorp, College Station, Texas). We modeled the average 6-month A1C level and uncontrolled A1C (6-month A1C average >9%) during each observation period. The average wait time for the previous 6 months predicted A1C level in the current 6-month period. Separate models were run for each of the 5 new and returning patient wait time measures. We standardized wait times to allow direct comparisons across measures. Standard errors were clustered on individuals to account for the lack of independence between observations from the same individual.
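Standardizing and lagging the wait time measures can be sketched as below. This assumes that standardization means conversion to z-scores using the population standard deviation, which the text does not specify; the function names and wait values are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Convert raw waits in days to z-scores: (value - mean) / SD.

    Assumption: the population SD is used; the study does not say which.
    """
    center = mean(values)
    spread = pstdev(values)
    return [(value - center) / spread for value in values]

def lag_one_period(values):
    """Shift a period-ordered series so period t carries the value from t-1."""
    return [None] + list(values[:-1])

# Hypothetical facility-level mean waits (days) for 4 consecutive periods
facility_waits = [10.0, 20.0, 30.0, 20.0]

z_scores = standardize(facility_waits)
lagged = lag_one_period(z_scores)  # the first period has no prior wait

print(z_scores)
print(lagged)
```

After standardization, a coefficient is read as the change in A1C associated with a 1 standard deviation increase in the wait time measure, which is what makes coefficients comparable across measures with very different means in days.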
Patients in the hospital or nursing home during the wait time measurement period should not be affected by outpatient wait times, so we censored observation periods if the veteran was institutionalized for all 6 months of an observation period.
Despite using 6-month observation periods to maximize the availability of A1C level data, 32% of the values during the outcome period were missing. Missing values may have been due to a veteran not having his/her A1C tested in a VHA facility or a veteran being hospitalized during the 6-month observation period. Following Prentice and colleagues,6 we treated these observations as censored using a 2-stage Heckman selection model.31
The first stage of the Heckman model used a probit to explain whether or not an A1C level was observed. The second stage used linear regression to predict the average A1C value or a probit to predict uncontrolled A1C. The 2 stages were jointly estimated so the missing observations were accounted for in the second stage. This simultaneous-equations approach explicitly modeled the correlation between unobservable factors in the first and second stages. The necessity of the Heckman model was confirmed with a significant Wald statistic that tested whether this correlation was zero and indicated that common unobservable factors affected both censoring and the outcome (Appendix B).
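For the linear A1C outcome, the standard form of the selection model can be written as follows. The symbols are ours rather than the authors', and the sketch follows the textbook setup with jointly normal errors.

```latex
\begin{align*}
s_{it}^{*} &= Z_{it}'\pi + u_{it}, & s_{it} &= \mathbf{1}\!\left[s_{it}^{*} > 0\right] \\
y_{it} &= X_{it}'\beta + \varepsilon_{it}, & &\text{observed only if } s_{it} = 1 \\
\operatorname{corr}(u_{it}, \varepsilon_{it}) &= \rho, & H_0 &: \rho = 0
\end{align*}
```

Here $s_{it}$ indicates whether an A1C value is observed for individual $i$ in period $t$, $y_{it}$ is the 6-month average A1C, and $\rho$ is the correlation between the unobservables in the 2 equations. Rejecting $H_0$ with the Wald test indicates that ignoring the censoring would bias the outcome equation.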
Similar to other samples of elderly VHA users, our sample was predominantly male and had a high burden of physical and mental health conditions. During the baseline period, about one fifth of the sample had an average A1C level greater than or equal to 8%, a quarter of the sample had an obesity diagnosis code, 87% had a hypertension diagnosis, and 15% had a depression diagnosis (Table 2).
Measured wait times varied substantially across the different methods of measurement for new and returning patients (Table 3). Wait time measures that relied on the CD for appointments had means of 20 to 34 days for new patients and 41 to 97 days for returning patients. The DD measures were shorter, with means of 7 to 18 days for new patients and 4 to 23 days for returning patients. The mean wait time for the FNA appointment capacity measure was similar to that of the retrospective CD measure for new patients and was 8.1 days for returning patients.
The Heckman model is a 2-equation model that benefits from a variable that distinguishes the first equation from the second equation, and we used the number of VHA primary care visits during baseline for this purpose. More frequent visits during baseline significantly increased the likelihood of observing an A1C value in all the models (data not shown). As an example, Appendix B provides complete results for the first-stage equation of the model that predicted the linear A1C 6-month average using the retrospective CD wait time measure. The coefficient on VHA primary care visits was 0.032 (P <.001).
Wait time had small but statistically significant effects on A1C (Table 4). For new patients, the FNA, retrospective CD, and prospective CD measures each had a significant (P <.001) positive relationship with average A1C levels, with the FNA measure having the strongest relationship (β = 0.009 vs β = 0.007 and β = 0.006; Table 4). Among the new patient measures, retrospective CD was the strongest predictor of uncontrolled A1C (marginal effect = 0.0010; P = .001), but longer FNAs also significantly increased the likelihood of having uncontrolled A1C (marginal effect = 0.0007; P = .05). Neither of the new patient DD wait time measures significantly predicted A1C levels.
When considering returning patient wait measures, the prospective CD measure was the strongest predictor of A1C for both outcomes (β = 0.009 for linear A1C, P = .002; marginal effect = 0.0019 for uncontrolled A1C, P = .001; Table 4). The DD wait time measures also had a positive, significant relationship with both outcomes (P <.05 for linear A1C and P <.10 for uncontrolled A1C). The returning FNA wait time measure had a significant (P = .036) and negative relationship with linear A1C but no significant relationship with uncontrolled A1C. Neither outcome was significantly predicted by the returning patient retrospective CD measure.
The effect sizes were small and not clinically significant. For example, the largest observed effect was for the returning patient prospective CD when predicting uncontrolled A1C. An increase of 1 standard deviation in this measure would increase the likelihood of a typical patient having uncontrolled A1C by 0.19 percentage points (Table 4).
Findings in this study suggest that longer wait times measured in a variety of different ways had small but statistically significant effects on A1C levels and the likelihood of having uncontrolled A1C. Specifically, the new patient capacity wait time measure (FNA) and the retrospective and prospective new patient wait time measures using CD exhibited expected relationships with A1C. Among the returning patient measures, the prospective CD measure and the retrospective and prospective DD measures did so as well. These results are consistent with the previous research finding that the new patient FNA measure significantly predicts A1C.6
The ongoing implementation of ACOs requires quality measures that are linked to patient health outcomes.4 The relationship between process quality measures and improved health outcomes is often modest.5,6 Although the effects are not clinically significant, the administrative wait time measures reliably predict both A1C and patient satisfaction. This matters because patients are more interested in improved health outcomes than the process of care.5 Another advantage of the wait time measures is their low cost. Access to care in ACOs is currently being evaluated by using the expensive and time-consuming process of surveying patients about their ability to get healthcare as soon as they wanted.15,18 Wait times based on administrative scheduling data are a less costly alternative.
The most appropriate wait time measures differ for new and returning patients. The ability of the capacity and CD versions of the new patient wait time measures to predict A1C when the DD measures did not supports previous research finding these same associations when predicting patient satisfaction.18 New patients typically want to be seen as soon as possible, often due to a change in health status that is causing concern.22 Consequently, it is not surprising that capacity or time stamp wait time measures that use the date that an appointment request was made as the start date (see Table 1) for measuring wait times are successful predictors for a variety of different outcomes. When considering ACO reimbursement, an advantage of these measures is that they can be easily calculated from most scheduling systems. The date that an individual requests an appointment is commonly cited as the start date to measure access outside of the VHA. For example, the Advanced Access literature uses this date when calculating the number of days until the third next available appointment.16,17
Developing consistent administrative wait time measures for returning patients is more complicated because these patients may not wish to obtain the next available appointment for follow-up care.18 Surveys of patients have found that scheduling future appointments at convenient times or maintaining continuity of provider may outweigh concerns about long waits for appointments for follow-up care.22,32,33 Recognizing these complexities, VHA policy makers shifted to using a DD approach in 2010 where schedulers ask patients what day they desire their appointment.25 Our results generally support the focus on DD for returning patients, with both the prospective and retrospective DD measures significantly predicting A1C. A disadvantage of implementing DD measures outside of the VHA is that schedulers in the private sector are not routinely collecting DDs when patients request appointments.
The main limitation of this study is that we did not have random variation in wait times, so we had to construct facility averages to minimize potentially confounding effects of individual health on individual waits. Consequently, we cannot completely rule out alternative explanations for our findings, including reverse causation and omitted variable bias. For example, an unobserved flu epidemic at a VHA facility could increase wait times facilitywide and cause higher A1C levels that are not attributable to longer wait times. Our analyses included facility-level fixed effects, yearly dummies, and a seasonal effect to reduce this possibility. On the other hand, there is now a substantial literature using these methods that consistently finds that longer wait times using capacity measures lead to poorer health outcomes.6,8-10,26 The growing evidence base utilizing different populations, time periods, and outcomes strengthens the likelihood that the relationship is causal.
The ongoing implementation of the Affordable Care Act and the growth of ACOs re-emphasize the need for patient-centered measures that can lead to improved outcomes. Administrative wait time measures developed as performance measures in the VHA meet this criterion, given that these measures have consistently predicted several patient outcomes, including A1C levels.18 Consequently, the VHA further refined wait time measurements in 2012 by using the CD retrospective measure for new patients and the prospective DD measure for returning patients.34,35 The knowledge the VHA has cultivated in developing the DD measures and the data structures required to support them could be transferred to the private sector.