Kira L. Ryskina, MD, MS; Eric S. Holmboe, MD; Elizabeth Bernabeo, MPH; Rachel M. Werner, MD, PhD; Judy A. Shea, PhD; and Judith A. Long, MD
Overtreatment in medicine, defined as “the waste that comes from subjecting patients to care that, according to sound science and the patients’ own preferences, cannot possibly help them,”1 is estimated to account for nearly 30% of healthcare spending.2 Increasing awareness that diagnostic and therapeutic interventions that physicians order are in some instances unnecessary3 has culminated in widely disseminated overtreatment guidelines, such as the Choosing Wisely campaign.4
Introduced in 2012, Choosing Wisely, an initiative of the American Board of Internal Medicine (ABIM) Foundation, has partnered with Consumer Reports and other medical organizations to provide physicians and patients with lists of potentially avoidable tests, treatments, and procedures.
In a 2014 telephone survey of 600 physicians, 38% of respondents reported having seen or heard about the Choosing Wisely campaign and 81% reported feeling “very comfortable” about “talking to patients about why they should avoid an unnecessary test or procedure.”5
Despite the positive response from practicing physicians, there is little evidence that guidelines alone influence physicians’ ordering decisions. In fact, a recent report using commercial health plan claims data to evaluate the utilization of 7 services targeted by the guidelines failed to detect a meaningful decline in their use.6
However, the study looked at global use of services by health plan beneficiaries without accounting for physician characteristics. For example, a 2012 study of Massachusetts health plan data revealed that physicians with fewer than 10 years’ experience had higher cost profiles than more senior physicians.7
Whether physicians’ awareness of overtreatment guidelines reduces the propensity to recommend a targeted service remains unknown.8
To explore possible explanations behind the higher cost profiles of less experienced physicians, we surveyed recent internal medicine residency graduates about their adoption of overtreatment guidelines. Our specific objectives were to: 1) assess physician views of overtreatment guidelines using a novel 5-item scale, 2) estimate self-perceived practices according to select guidelines using hypothetical patient presentations, and 3) measure whether perceived adoption of the guidelines correlates with the likelihood to recommend a targeted service.
A literature review revealed no previously validated instruments evaluating physician attitudes toward overtreatment guidelines. To identify potential items for cognitive testing, we reviewed the literature, combed references from previously reported studies of physician views,9-19 and interviewed experts. Items were developed to assess: 1) physician awareness, agreement with, and use of overtreatment guidelines; 2) self-perceived propensity to recommend a service targeted by the guidelines; and 3) other potential confounders of physician practice identified in prior studies. We conducted 2 cycles of cognitive pilot testing to calibrate the wording to detect differences among physicians about these topics. After initial “think aloud” reviews with local (ie, Pennsylvania) practicing physicians, followed by revisions, we performed broader pilot testing with 100 internal medicine physicians randomly selected from the American Medical Association (AMA) Masterfile.
The final survey included questions about physician demographics, practice characteristics, attitudes known to influence overtreatment, views on overtreatment guidelines (awareness of, agreement with, and usefulness in practice), and self-reported practice in specific clinical scenarios related to the guidelines (eAppendix A [eAppendices available at ajmc.com]). Self-reported practice was assessed using fill-in-the-blank questions asking physicians to estimate the percentage of their patients they recommended for a specific test or treatment. Specifically, respondents were presented with brief descriptions of patient presentations for: 1) low back pain, 2) acute sinusitis, 3) cancer screening in patients with a life expectancy of fewer than 10 years, 4) cardiac screening in asymptomatic routine care, and 5) a low pretest probability of venous thromboembolism (VTE). For example, when asked “For what percentage of patients with acute lower back pain do you order the following?” the respondent would fill in a percent for x-ray, magnetic resonance imaging (MRI), physical therapy, acetaminophen/anti-inflammatories, and opioids. The fill-in-the-blank questionnaire regarding treatment decisions has been shown to have high criterion validity (ie, it correlates with actual practice on similar patients) in prior studies.20,21
To establish content validity, these items were tested by 11 clinical and survey design experts, including practicing primary care clinicians, researchers, and experts in survey design.
Using the AMA Masterfile, we prescreened 2170 randomly selected internal medicine physicians who completed training within the last 10 years to confirm their qualifying specialty and mailing address and to verify that they were actively seeing patients at least 20 hours a week. The final sample included 902 internal medicine physicians who were mailed a paper survey between July 2014 and January 2015 using a modified Dillman method.22
The initial mailing was done by first class mail accompanied by a $2 bill and followed by 2 reminder mailings approximately 6 weeks apart.
Overtreatment Guidelines Adoption Scale
A set of 9 questions comprised the Overtreatment Guidelines Adoption (OGA) Scale, which assessed physicians’ attitudes toward overtreatment guidelines and cost containment in general. Six questions focused on awareness of, agreement with, and perceived usefulness of overtreatment guidelines; comfort denying patient requests for tests or treatments; comfort discussing costs with patients; and self-perception of cost consciousness. These were assessed using a 4-point Likert scale, ranging from strongly disagree to strongly agree. A second set of 3 questions assessed how frequently physicians discussed costs, used the overtreatment guidelines in practice, and found these guidelines to be useful. These were measured using a 5-point Likert scale of frequency.
To summarize adoption of overtreatment guidelines and to measure physician attitudes toward guidelines separately from general attitudes toward cost containment, we developed 2 subscales: a 5-item OGA subscale and a 4-item cost-containment subscale, using standard factor analysis techniques (eAppendix B). Possible values of the OGA subscale ranged from 5 to 22; higher scale scores reflected a higher degree of adoption of overtreatment guidelines. The OGA subscale had high internal consistency, with a Cronbach alpha of 0.82 and rotated loadings of 0.44 to 0.75. Principal components analysis supported a separate cost-containment subscale of 4 questions related to costs (Cronbach alpha, 0.76; rotated loadings, 0.51-0.70).
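As an illustration of the internal-consistency statistic reported above, the following is a minimal sketch of how a Cronbach alpha and a summed-scale range could be computed. It is not the authors' Stata code. The function names and the example responses are hypothetical, and the split of the 5 OGA items into 3 four-point agreement items and 2 five-point frequency items is an assumption chosen because it reproduces the stated range of 5 to 22.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach alpha for a list of item-score columns (one list per item)."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least 2 items")
    # Total score per respondent is the sum across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


def score_range(points_per_item):
    """Minimum and maximum summed score when each item is scored 1..points."""
    return len(points_per_item), sum(points_per_item)


# Assumption: 3 items on a 4-point scale and 2 items on a 5-point scale.
oga_min, oga_max = score_range([4, 4, 4, 5, 5])

# Hypothetical responses from 6 physicians (NOT study data), one list per item.
example_items = [
    [3, 4, 2, 3, 4, 1],
    [3, 3, 2, 4, 4, 2],
    [2, 4, 2, 3, 3, 1],
    [4, 5, 2, 3, 4, 2],
    [3, 5, 3, 4, 4, 1],
]
alpha = cronbach_alpha(example_items)
```

Under the assumed item split, `score_range` returns 5 and 22, which matches the possible values reported for the OGA subscale.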
Our main outcome measures were self-reported percentages of patients who were advised to elect 8 services targeted by 5 overtreatment guidelines. The guidelines were selected because they described common clinical scenarios in internal medicine, had been released at least 2 years prior to our survey, and were endorsed by multiple medical groups (Table 1). We asked physicians to fill in the blank with the percentage of their patients they advised to receive a particular service. The options included services targeted by overtreatment guidelines, as well as other management options commonly offered to patients in each clinical context. The following tests and treatments were measured: x-ray and MRI imaging for acute low back pain; antibiotics for mild to moderate sinusitis; breast, prostate, and colon cancer screening for patients with a life expectancy of fewer than 10 years; electrocardiogram (EKG) testing for asymptomatic patients; and computerized tomography scan as the initial test for low-risk patients with possible VTE. Services that were recommended to less than 5% of patients (eg, Papanicolaou test for cervical cancer and stress test for cardiac testing in asymptomatic patients) were excluded from analysis.
Physician demographics, attitudes, reimbursement, and practice characteristics that may confound the relationship between physician views of, and practice according to, overtreatment guidelines, were included in the analysis. Physician demographics included age, gender, and race. Other physician characteristics included practice region, type of practice, compensation type, financial incentives (eg, quality, patient satisfaction, utilization review, productivity), insurance mix (eg, patients with Medicaid insurance), and attitudes (eg, comfort with clinical uncertainty, satisfaction with the practice of medicine in general, malpractice concerns). These items were either drawn from the AMA Masterfile (ie, age and gender) or included questions drawn from previously validated surveys of physicians.
Responses were entered in the REDCap electronic data capture tool (Harvard Catalyst; Boston), hosted at the University of Pennsylvania.23
Ten percent of entries were double-entered with perfect concordance. The data were exported to Stata version 13.0 (StataCorp; College Station, Texas), which was used for all analyses.
We calculated the response rate using the American Association for Public Opinion Research Response Rate 2 definition.24 Nonresponse bias was assessed by comparing respondents with nonrespondents and early with late respondents using the Pearson χ2 test.
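The respondent versus nonrespondent comparison can be sketched as a Pearson χ2 test on a contingency table. This is an illustrative sketch, not the authors' analysis code: the function names are hypothetical, the row totals (456 respondents and 446 nonrespondents) follow from the survey counts, and the split by gender within each row is invented for the example. The P value helper applies only to a test with 1 degree of freedom.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def p_value_df1(stat):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Rows: respondents, nonrespondents. Columns: women, men.
# The within-row split is hypothetical, NOT study data.
table = [[200, 256], [190, 256]]
stat, df = pearson_chi2(table)
p = p_value_df1(stat)
```

A large P value in such a comparison would be consistent with no detectable difference between the two groups on that characteristic.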
The reported percentages of patients who were recommended for a particular test or treatment indicated a discrete number of events over a constrained range (0%-100%) and were positively skewed. Thus, the reported percentages were converted to a count variable based on a denominator of 100 (ie, 10% was converted to 10 out of 100) and modeled using Poisson regression. The independent variable of interest was the OGA scale score divided into tertiles. Other variables in the model included a scale of physician attitudes toward cost containment in general (measured using the cost-containment subscale), physician demographics, practice characteristics, and attitudes previously shown to be associated with overuse (see “Other Variables” section of text). The predicted percentages of patients recommended for a particular test or treatment were estimated. Bootstrapping with 1000 iterations was used to estimate 95% confidence intervals (CIs). This study was reviewed and approved by the University of Pennsylvania Institutional Review Board.
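The conversion and bootstrap steps described above can be sketched as follows. This is a deliberately simplified, unadjusted illustration rather than the study's fully adjusted model: when a tertile indicator is the only predictor, the Poisson maximum-likelihood fitted mean for each tertile equals that tertile's sample mean, so the sketch compares group means directly. The function names, the percentile method for the interval, the random seed, and all data values are assumptions made for the example.

```python
import random
from statistics import mean


def percent_to_count(pct):
    """Convert a reported percentage (0-100) to a count out of 100 patients."""
    if not 0 <= pct <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    return round(pct)


def tertile_difference(top, bottom):
    """Difference in mean counts per 100 patients (top minus bottom tertile).

    With tertile membership as the only predictor, these group means are
    the Poisson maximum-likelihood fitted values.
    """
    return mean(top) - mean(bottom)


def bootstrap_ci(top, bottom, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the top-minus-bottom difference."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample physicians with replacement within each tertile.
        top_star = rng.choices(top, k=len(top))
        bottom_star = rng.choices(bottom, k=len(bottom))
        diffs.append(tertile_difference(top_star, bottom_star))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical reported percentages (NOT study data).
bottom_pct = [40, 35, 50, 20, 45, 30, 60, 25]
top_pct = [20, 25, 30, 10, 35, 15, 40, 5]
bottom = [percent_to_count(p) for p in bottom_pct]
top = [percent_to_count(p) for p in top_pct]

diff = tertile_difference(top, bottom)
lower, upper = bootstrap_ci(top, bottom)
```

A fully adjusted analysis would instead refit the Poisson model with all covariates in each bootstrap sample and recompute the predicted percentages; the resampling logic stays the same.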
Of the 902 potential respondents, 456 (51%) returned a completed survey. No differences between respondents and nonrespondents were observed by age, gender, region of current practice, or practice setting (Table 2). Aside from Asian or Asian American respondents being overrepresented among late responders, there were no differences between early and late responders regarding gender, primary compensation, organization or setting of practice, or self-reported attitudes or satisfaction with medicine as a practice (eAppendix Table 1). Nearly half of the respondents self-characterized primary compensation type as salary with bonus (49.5%), followed by billings (28.1%) and salary only (20.9%), and the majority reported compensation linked to quality of care (62.9%) or productivity (65.1%) (Table 3). Less than 5% of respondents (4.2%) completed residency in 2013. Other characteristics of the respondents’ practices are reported in Table 3.
Respondents’ attitudes toward cost containment are shown in Table 4. Most (88.5%) considered their practice style to be cost conscious. One in 4 (25.1%) reported discomfort discussing costs of care with patients, and 34.7% said they would not feel comfortable making a patient unhappy by denying a request for unnecessary care.
Respondents generally reported high levels of awareness, familiarity, and use of overtreatment guidelines (Table 4). Most (88.5%) reported being familiar with overtreatment guidelines in their specialty, 81.6% reported that the guidelines were useful in their practice, and 79.9% said they felt comfortable bringing up overtreatment guidelines in discussions with patients. However, less than 30% of respondents rated their agreement with these statements as “strong.” Respondents generally reported using overtreatment guidelines in practice with high frequency: 30.9% reported bringing up the guidelines in discussions with patients “frequently” or “always” and 44.2% reported bringing up the guidelines “occasionally.” Approximately 40% of respondents (41.1%) found the guidelines useful in practice “frequently” or “always,” and 42.4% found them “occasionally” useful. When individual responses were combined in the 5-item OGA subscale, the mean scale score was 15.6 (SD = 3.0) and the median 16 (interquartile range [IQR] = 14-18; observed range = 5-22).
In the fully adjusted models, respondents in the middle or top third of OGA subscale scores reported lower rates of recommending a test or treatment targeted by the guidelines for imaging for lower back pain, antibiotics for sinusitis, and cardiac testing for asymptomatic patients compared with the respondents in the bottom third of OGA scores (Figure). Physicians in the highest tertile of guideline adoption reported double-digit rates of recommending antibiotics for sinusitis (29.7%), mammogram at end of life (16.5%), and EKG testing for asymptomatic patients (11.0%). Physicians with OGA scores in the top third had significantly lower predicted rates of recommending x-rays (–12.0%; 95% CI, –19.4% to –4.5%; P = .002) or MRI (–4.8%; 95% CI, –8.1% to –1.5%; P = .004) for lower back pain and EKG for asymptomatic patients (–10.2%; 95% CI, –18.9% to –1.5%; P = .02) compared with physicians in the bottom third of OGA scores. Physicians with OGA scores in the middle third also had lower predicted rates of recommending antibiotics for sinusitis (–6.9%; 95% CI, –13.0% to –0.8%; P = .03) and EKG for asymptomatic patients (–8.7%; 95% CI, –15.9% to –1.4%; P = .02) compared with physicians in the bottom third of OGA scale scores. The differences in predicted probabilities across the tertiles of OGA scale scores were not significant for cancer screening or for imaging as the initial test for patients at a low risk of VTE (Figure).
The association between physician cost consciousness and the percentage of patients recommended for a test or treatment targeted by the guidelines was not consistent: physicians in the top third of cost-consciousness scale scores reported lower rates of prescribing antibiotics for sinusitis and recommending mammography at the end of life, but this association was not observed for the other guidelines (eAppendix Table 2). Other factors associated with recommending services targeted by the guidelines were physician age, practice region, type and setting, practice that treated patients with Medicaid, and satisfaction with medicine as a profession.
In this survey study of physician views of overtreatment guidelines, internal medicine physicians generally reported high levels of awareness, agreement, and use of the guidelines in everyday practice, and their attitudes toward the guidelines were distinct from their attitudes toward cost containment. In addition, physicians who reported greater adoption of overtreatment guidelines recommended fewer tests or treatments targeted by some overtreatment guidelines, even after accounting for their overall cost consciousness. Physicians who reported the highest levels of guideline adoption, however, also reported recommending services targeted by the guidelines in their practice.
Although most physicians generally reported agreement with overtreatment guidelines, only about one-third of the respondents rated their agreement as strong or reported using the guidelines frequently, suggesting considerable ambiguity in their attitudes toward overtreatment. Consistent with this finding, recommended rates of some of the services targeted by the guidelines (eg, x-rays for lower back pain, antibiotics for acute sinusitis) were high even for physicians in the top third of overtreatment guidelines adoption. On the other hand, most respondents (88.5%) assessed their practice style as cost-conscious. These findings suggest that even among physicians who generally had positive attitudes toward cost containment, perceptions of the use of overtreatment guidelines were poor, potentially limiting their impact on physician behavior.
Considering these findings, the lack of a consistent decrease in the use of tests and treatments targeted by the Choosing Wisely campaign is not surprising.6
Of note, although some of the guidelines (eg, cancer screening) categorically recommend against testing when patients meet certain criteria, many guidelines implicitly or explicitly allow for exceptions (eg, for worsening symptoms or prolonged duration of acute sinusitis). These important distinctions were difficult to capture in a survey question that did not ascertain how frequently physicians saw patients who met the exclusion criteria in the guidelines. Nevertheless, it is unlikely that the variation and high rates of targeted services reported by some of the respondents would be fully explained by variation in case mix.
The 4 services for which we did not observe an association with the OGA scale scores (ie, mammography, colonoscopy, prostate cancer screening, and imaging for VTE) were targeted by guidelines that did not include exceptions for certain patient presentations or for duration of symptoms. This contrasts with the other 4 services, which were targeted by guidelines worded to include such exceptions (eg, the guideline on antibiotics for acute sinusitis, which recommended against ordering antibiotics unless symptoms last longer than 7 days or worsen, and the guideline on imaging for acute lower back pain, which applied to nonspecific pain). This suggests that guidelines that are more categorically worded may be less likely to influence physician behavior. However, our study was not powered to determine the significance of this pattern. Future research should evaluate the effect of guideline wording on physician behavior.
The 8 tests and treatments evaluated in this study were selected to correspond to recommendations of the Choosing Wisely campaign, which had advantages, including widely disseminated endorsement by multiple professional physician organizations. All recommendations included in the study were proposed by 3 or more specialty groups. The Choosing Wisely campaign leaves the mechanism of endorsement up to the group, emphasizing the grassroots characteristics of the campaign. Specialty groups play a lead role in developing the lists of recommendations, an approach designed to appeal to physician professionalism and establish specialty-endorsed norms of care. However, a review of the recommendations by the first 25 professional groups that participated in the Choosing Wisely program raised concerns that groups may be reluctant to endorse recommendations limiting the use of services that are highly lucrative to the specialty.25
Furthermore, the extent of regional and local professional groups’ involvement in the development of national specialty societies’ Choosing Wisely recommendations is not clearly mandated by the campaign. Hence, regional variation in the propensity of physicians to recommend some services may be less responsive to guidelines endorsed at the national level. Although practice region was significantly associated with only 1 of the 8 services evaluated (EKG for asymptomatic patients), local costs may influence physicians’ recommendations of specific tests and treatments. Future studies should assess how physician perceptions of costs influence their recommendations of services targeted by overtreatment guidelines.
Even in cases in which relatively strong consensus exists regarding the evidence base for optimal care, such as the overtreatment guidelines evaluated in this study, a complex interplay of working environment and personal factors plays a role in physician recommendations.26
Whereas overtreatment guidelines target intrinsic motivation in practicing evidence-based care, policy-level interventions typically focus on extrinsic motivators, such as value-based payments, bundling of payments, or other types of monetary incentives.27,28
Our findings provide empiric evidence supporting the importance of evaluating the effect of intrinsic and extrinsic motivators on physician behavior within the context of practice environment and physician characteristics.
While a mix of incentives could be calibrated to achieve value-based care in theory, in practice, these factors are in flux and conflict with each other at times. Although the current study evaluated the adoption of overtreatment guidelines within the context of environmental (ie, treatment facility), practice, and physician-level factors,29 we were unable to evaluate actual physician practice or compare the relative effect of alternative motivators. Behavioral theory suggests that getting physicians to “de-adopt” practices is more challenging than the adoption of healthcare innovations.30
Moreover, physicians often lack self-awareness of the nonclinical factors that may influence their behavior.31
Although overtreatment guidelines that are evidence based and disseminated in a transparent way may be successful in engaging physicians to consider these issues, the sheer number of factors that influence physician behavior suggests that overtreatment guidelines alone are unlikely to produce a sizeable impact on overuse.
Alternative explanations of the observed associations between overtreatment guideline adoption and the rate of recommending targeted services include patient case mix, social desirability bias (ie, underreporting undesirable behaviors, such as the use of services targeted by the guidelines in our study), and recall bias. Although case vignettes with open-ended answer options have high criterion validity (ie, they correlate with actual practice on similar patients),20,21 reported practices may not represent actual physician recommendations to their patients. Furthermore, the proximity of questions about practice patterns and overtreatment guidelines in the questionnaire may have primed respondents to underreport overtreatment.
Concerns about priming and desirability bias suggest that the rates of recommending services targeted by the overtreatment guidelines may be underestimated in this study. Although we obtained a relatively high response rate for a physician survey, and respondents were similar to nonrespondents and early respondents were similar to late respondents on most measured characteristics, the potential for response and selection bias remains.
Lastly, despite efforts to confirm physician eligibility during prescreening, specialty and contact information in the AMA Masterfile may be inaccurate, introducing respondents who fall outside our target sample. A recent comparison of AMA Masterfile physician contact information with other databases found that only 37% were accurate.32
In a national survey, the majority of US internal medicine physicians reported positive attitudes toward overtreatment guidelines in their specialty. However, physicians’ recommendations in guideline-specific standardized patient cases varied. Physicians’ propensity to recommend low-value services was explained in part by physician and practice characteristics. The complexities of physician decision making may explain an observed lack of reduction in the utilization of tests and treatments targeted by such widely disseminated overtreatment guidelines as the Choosing Wisely initiative. Guidelines or similar broad educational interventions by physician organizations are unlikely to reduce physician-level variation in the utilization of low-value services. Furthermore, interventions to reduce low-value care should be evaluated within the context of health system-, practice-, and physician-level factors to avoid unanticipated effects.