Electronic decision support for high-tech diagnostic imaging was associated with reduced volume and increased appropriateness, but had little impact on findings or patients.
To evaluate the effects of providing appropriateness criteria through guideline-based electronic health record (EHR) decision support for high-tech diagnostic imaging (HTDI) procedures.
Chart audits were performed on a random sample of adult primary care orders for 3 HTDI procedures (computed tomography [CT] and magnetic resonance imaging [MRI] of the head, and MRI of the lumbar spine) before and after implementation of an EHR decision support system.
Level of appropriateness, abnormal findings, and apparent effects on patient care.
A total of 299 eligible audits were performed. Decision support was associated with a 20% to 36% drop in spine MRI and head CT orders, but head MRI order volume was unaffected. Combined results for the 3 procedures showed that a larger proportion of studies ordered after implementing decision support fit appropriateness criteria (89.2% vs 79.5%, P = .02), and more postimplementation studies had A ratings, the highest utility rating (81.8% vs 70%, P = .04). However, there were no differences in the proportion of tests with positive findings (42/132 vs 28/120 among procedures that met definite criteria, P = .16) or the proportion with a likely impact on patients (6.6% vs 10.8%, P = .07).
These data support the feasibility of using chart audits to assess the relationship between appropriateness criteria and HTDI orders. Although introduction of EHR clinical decision support for diagnostic imaging orders was associated with reduced volume and increased appropriateness of orders, there was little apparent impact on either findings or patients.
(Am J Manag Care. 2010;16(2):102-106)
Implementation of decision support in the electronic medical record system of a large medical group reduced the volume and increased the appropriateness of high-tech diagnostic imaging orders for head and spine scans, although the effects varied by procedure.
Many policymakers suggest that conversion to electronic health records (EHRs) will improve the quality and cost of medical care.1-3 Others argue that such benefits will require EHRs to include at least electronic decision support.4-7 However, evidence remains limited on how much value electronic decision support adds. A systematic review by Garg et al of 97 controlled trials of computerized decision support systems concluded that two-thirds showed some improvement in care processes, but only 7 of the 52 trials that assessed patient outcomes showed improvement.8
One area in which electronic decision support is likely to be helpful is high-tech diagnostic imaging (HTDI) test orders. Use of these costly tests has increased rapidly nationwide over the past 15 years.9-16 Published studies suggest that this increase spans tests ordered by both radiologists and nonradiologists and is accompanied by wide geographic variation in rates. As a result, there is increasing pressure to encourage adherence to appropriateness criteria, a task for which electronic decision support seems well suited.
Rosenthal et al reported preliminary results from the implementation of an electronic decision support system for HTDI procedures at Massachusetts General Hospital, showing that it was acceptable to physicians and appeared to change ordering habits.17 However, this conclusion was based on changes in concordance with appropriateness criteria in the decision support system over time, with no data comparing such concordance before and after the system began. Moreover, there is little direct evidence on which to base appropriateness criteria. Thus, we know little about the effects on patient care processes or outcomes of applying these criteria. To better understand these issues, we studied the impact of implementing a decision support system for HTDI in the EHR of a large multispecialty medical group. The following questions were addressed:
1. Was there sufficient information in the medical chart to permit measurement of appropriateness criteria for HTDI test orders?
2. Did compliance with appropriateness criteria change?
3. Did positive findings or effects on patients change?
This study was conducted within a 600-physician multispecialty medical group in the Twin Cities region of Minnesota that serves an active population of 400,000 patients. Since 2004, its primary care clinics have used the paperless Epic EHR, which includes computerized ordering for all tests and incorporates either electronic or scanned reports of those test results.
In February 2007, the medical group rapidly implemented a decision support system within Epic for HTDI tests. The system used appropriateness criteria from the American College of Radiology, rated A (high utility), B (intermediate utility), or C (low utility). Because the EHR ordering system required that a reason be entered for every order, and because these criteria could not be expected to cover all circumstances, an “other” alternative also was available; it collected unstructured information for about 15% (range, 8%-19%) of orders for the procedures we studied. Only completed orders were recorded in the EHR.
The system did not prevent orders, regardless of criteria fit. Physicians received little feedback on their ordering results, although the system identified the appropriateness of each order as it was placed. There were no financial incentives or disincentives tied to ordering patterns.
To avoid both the preparation and acclimatization periods surrounding implementation and any seasonal effects on procedure orders, we selected 2-month windows located 6 months before and 6 months after the decision support system was implemented. We studied 3 of the more commonly ordered HTDI procedures: computed tomography (CT) and magnetic resonance imaging (MRI) of the head, and MRI of the lumbar spine; these procedures have relatively fewer and less complex ordering reasons than many others.17 To further simplify and standardize the analysis, our study was limited to original orders (ie, not follow-up of a previous test) placed by a primary care clinician for adults (age 19 years and over) in our primary care clinics.
The implemented system contained 72 criteria for CT of the head and 68 for MRI of the head, but only 27 for MRI of the lumbar spine. A ratings applied to 65% of the head CT criteria, compared with 85% of the criteria for the other 2 procedures. If the apparent reason in the chart for ordering a test did not resemble any of the existing criteria, we recorded the test as lacking a criterion fit.
Fifty cases of each completed procedure were drawn randomly from the selected months before and after implementation of decision support. Cases were excluded if the order had actually originated with a specialty consultant or was a follow-up to an earlier test.
Audits were performed by 2 experienced nurse auditors using predetermined definitions. The 2 physicians in our group (LIS, JB) separately reviewed every case with positive findings to confirm the audit results and judge the patient impact, then discussed each case until they reached agreement. During the training period, all auditors reviewed the same 15 cases, and the group discussed the issues those cases raised to establish a common approach to data collection. Blinding either the nurse or physician reviewers to the dates of the procedures was impossible because they had to search the EHR for the information needed.
Results of a test were called positive when an abnormality either seemed to explain the patient’s problem or required a medical decision. The effect on care was categorized as 1 of the following:
• Probable impact on patient health or function
• Impact on the care process (usually additional consultation or testing)
• Uncertain impact on patients or care
• No impact.
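As an illustration only (the study's classification was done by manual chart review, not software), the result and impact definitions above could be encoded as a small audit-coding sketch; all names here are hypothetical:

```python
from enum import Enum


class CareImpact(Enum):
    """The 4 impact categories defined in the audit protocol."""
    PATIENT = "probable impact on patient health or function"
    PROCESS = "impact on the care process (e.g., added consultation or testing)"
    UNCERTAIN = "uncertain impact on patients or care"
    NONE = "no impact"


def is_positive(explains_problem: bool, requires_decision: bool) -> bool:
    """A result is 'positive' if the abnormality either seemed to explain
    the patient's problem or required a medical decision."""
    return explains_problem or requires_decision
```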
The analysis was limited to descriptive statistics, although enough cases had adequate chart information to provide sufficient power to detect large differences, especially when the cases for all 3 procedures were combined. To compare the pre- and postimplementation periods on fit with appropriateness criteria, compliance rating, tests with positive findings, test results within the normal range, and test results that led to changes in care decisions, we performed 2-sided Fisher exact tests.
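The combined comparison of definite criteria fit can be reproduced with a 2-sided Fisher exact test. The sketch below is illustrative, not the authors' code; the cell counts are taken or derived from the reported figures (132 of 148 postimplementation orders and 120 of 151 preimplementation orders definitely fit the criteria):

```python
from math import comb  # exact binomial coefficients, handles large integers


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for a 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    r1, r2 = a + b, c + d            # row totals
    c1 = a + c                       # first-column total
    n = r1 + r2                      # grand total

    def prob(x):
        # Probability of x in the top-left cell, given fixed margins.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # A tiny tolerance keeps ties from being lost to floating-point error.
    return sum(p for p in probs if p <= p_obs * (1 + 1e-9))


# Definite fit with appropriateness criteria, post vs pre implementation:
# 132/148 (89.2%) vs 120/151 (79.5%).
p = fisher_exact_two_sided(132, 16, 120, 31)
print(f"two-sided P = {p:.3f}")
```

The computed P value lands near the .02 reported in the abstract for this comparison.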
During the 2-month study period in 2007 compared with 2006, the volume of completed test orders dropped by 36.5% for head CT and by 20% for spine MRI, whereas completed orders for head MRI increased by 3.3%. A total of 38 cases were excluded, leaving 151 cases before and 148 cases after implementation. Sixty-two percent of the cases involved women, and the mean age was 57.8 years, with no pre/post differences. Only 6% of the procedures were ordered by nurse practitioners.
The first table shows that the great majority of orders could be definitely rated by the criteria both before and after decision support, but only MRI of the head and spine showed an increase in definitely meeting appropriateness criteria once decision support was present. Only MRI of the head showed an increase in the proportion of orders rated A in the postperiod. More than 90% of the head CTs were ordered for headache, trauma, or a neurologic deficit, and nearly all the spine MRIs were ordered for back pain or radiculopathy. Head MRIs also were ordered for headache or neurologic deficits, but many were ordered to evaluate vertigo or other conditions.
The second table shows that neither of the head procedures had many positive findings, although MRI of the head did show an increase in positive findings in the postperiod, as well as an increase in the proportion clearly related to the specific indication for the test. Implementation of decision support did not change the very low frequency with which either type of head HTDI test affected patients (2.5% of the time), but the larger impact of spine MRI increased further in the postperiod (from 14% to 30%, P = .18). Except for MRI of the spine in the postperiod, all audits revealed a high proportion of incidental findings. Among spine MRI orders that did not meet appropriateness criteria, the proportion of results with an impact on patients rose greatly in the postperiod (from 7% to 62%). Head HTDI orders that did not meet appropriateness criteria had no impact on patients in either period (data not shown in the table).
When we combined all 3 procedures, 10% more cases had a definite rating of appropriateness after implementation, and among cases with a definite rating, 12% more had an A rating. No other differences between the 2 time periods were significant, and there was no evidence of an effect on positive findings or on patients.
This study showed that it is possible to use chart audits to assess the fit between recorded case information and appropriateness criteria for most cases involving these 3 procedures. It also suggests that there may have been a small increase in definite fit with the criteria and an increase in the proportion with an A rating, at least for some procedures. However, none of these increases were large or homogeneous across procedures, and for only 1 procedure (MRI of the head) was this better fit associated with an increase in the proportion of positive findings. Impact on patients was very limited for head studies, but larger for spine MRI, especially in the postperiod, suggesting there may be problems with the validity of the appropriateness criteria.
The number of HTDI studies dropped substantially for CT of the head and MRI of the spine in association with the introduction of this decision support system, whereas the volume of completed head MRIs rose slightly. This finding may reflect redirection of orders from CT to MRI. In comparison, the total number of HTDI procedures of all types in Minnesota declined by 1.5% during this period (based on claims data from all Minnesota payers) after increasing by 8.3% per year over the preceding 3 years.
The American College of Radiology first published its criteria in 1995, and Martin et al have shown that they could be applied to 76% of the imaging procedure requests made from a general internal medicine clinic.18 The problem with these criteria is that there are a very large number of tests and possible indications and there is relatively little evidence, so the criteria are based on the opinions of expert panels.19,20 The variable results of this study for different procedures and the lack of increased positive findings or effects on patients suggest the need for more valid criteria based on evidence from more research.
Rosenthal et al17 at Massachusetts General Hospital and Sanders and Aronsky21 and Sanders and Miller22 at Vanderbilt have shown that computerized order entry with decision support can be widely accepted by clinicians and can affect ordering behavior. On the other hand, Freeborn et al found that simply distributing guidelines for lumbar spine imaging or providing feedback about ordering patterns had no effect on use rates.23 Hadley et al reviewed charts of trauma patients with imaging studies and found that (1) it was possible to apply American College of Radiology criteria to the chart information, and (2) if the criteria had been applied to these patients, there would have been a 39% reduction in imaging costs and a 44% reduction in radiation dose.24 Bradley et al found that preevaluation MRI of chronic shoulder pain patients did not appear to have any effect on treatment or outcome.25 Kaups et al studied repeat head CT studies on patients with head injury and found that in the absence of clinical deterioration, these studies did not alter management.26
It remains unclear whether a new requirement for fit with criteria that results in fewer imaging studies differentially reduces unnecessary tests. Although we found that relatively few of these orders had positive findings or impact on patients, further studies on larger and more varied orders with a control or comparison group are needed to verify these results and to assess the overall impact on patient care and costs.
This quasi-experimental study clearly has limitations. These include relatively small sample sizes from a single medical group and audits based on relatively subjective judgments. We also have no information about possible changes in the volume of patients requiring testing or changes in indications. However, our study suggests that similar studies can be based on chart audits and that there may be considerable variability among procedures in the extent to which decision support affects either fit with appropriateness criteria or impact on ordering volume or patient outcomes. We also clearly need a better evidence base behind the development of appropriateness criteria, with more evidence for the specific value of various combinations of imaging tests, ordering criteria, and medical conditions.
Author Affiliations: From the HealthPartners Research Foundation (LIS, FW, JCB, KJP, CAV, MAM), HealthPartners Medical Group, Minneapolis, MN.
Funding Source: This study was supported by Partnership grant 07-070 from HealthPartners Research Foundation. It originated with the HTDI Steering Committee of the Institute for Clinical Systems Improvement and its constituent health plans and medical groups, which wanted to learn the effects of their new approach to improving the quality and cost of high-tech diagnostic imaging. The Steering Committee was responsible for coordinating the implementation
of electronic health record decision support among 5 large medical groups in the state, and it provided helpful input during our frequent updates on this study and its findings.
Author Disclosure: Ms Vinz reports receiving honoraria to speak and funds to attend conferences/meetings related to decision support initiatives. The other authors (LIS, FW, JCB, KJP, MAM) report no relationship or financial interest with any entity that would pose a conflict of interest with the subject matter of this article.
Authorship Information: Concept and design (LIS, FW, JCB, KJP, CAV, MAM); acquisition of data (LIS, JCB, KJP); analysis and interpretation of data (LIS, FW, JCB, KJP, CAV); drafting of the manuscript (LIS, FW, CAV); critical revision of the manuscript for important intellectual content (LIS, FW, CAV, MAM); statistical analysis (FW); obtaining funding (LIS); and administrative, technical, or logistic support (KJP, MAM).
Address correspondence to: Leif I. Solberg, MD, HealthPartners Research Foundation, HealthPartners Medical Group, PO Box 1524, MS #21111R, Minneapolis, MN 55440-1524. E-mail: firstname.lastname@example.org.
1. Corrigan JM, Eden J, Smith BM, eds, for the Institute of Medicine. Leadership by Example: Coordinating Government Roles in Improving Health Care Quality. Washington, DC: National Academies Press; 2003.
2. Hillestad R, Bigelow J, Bower A, et al. Can electronic medical record systems transform health care? Potential health benefits, savings, and costs. Health Aff (Millwood). 2005;24(5):1103-1117.
3. Miller RH, West C, Brown TM, Sim I, Ganchoff C. The value of electronic health records in solo or small group practices. Health Aff (Millwood). 2005;24(5):1127-1137.
4. Walker JM. Electronic medical records and health care transformation. Health Aff (Millwood). 2005;24(5):1118-1120.
5. Himmelstein DU, Woolhandler S. Hope and hype: predicting the impact of electronic medical records. Health Aff (Millwood). 2005;24(5):1121-1123.
6. Goodman C. Savings in electronic medical record systems? Do it for the quality. Health Aff (Millwood). 2005;24(5):1124-1126.
7. Linder JA, Ma J, Bates DW, Middleton B, Stafford RS. Electronic health record use and the quality of ambulatory care in the United States. Arch Intern Med. 2007;167(13):1400-1405.
8. Garg AX, Adhikari NK, McDonald H, et al. Effects of computerized clinical decision support systems on practitioner performance and patient outcomes: a systematic review. JAMA. 2005;293(10):1223-1238.
9. Bhargavan M, Sunshine JH. Utilization of radiology services in the United States: levels and trends in modalities, regions, and populations. Radiology. 2005;234(3):824-832.
10. Livstone BJ, Parker L, Levin DC. Trends in the utilization of MR angiography and body MR imaging in the US Medicare population: 1993-1998. Radiology. 2002;222(3):615-618.
11. Maitino AJ, Levin DC, Parker L, Rao VM, Sunshine JH. Nationwide trends in rates of utilization of noninvasive diagnostic imaging among the Medicare population between 1993 and 1999. Radiology. 2003;227(1):113-117.
12. Rao VM, Parker L, Levin DC, Sunshine J, Bushee G. Use trends and geographic variation in neuroimaging: nationwide Medicare data for 1993 and 1998. AJNR Am J Neuroradiol. 2001;22(9):1643-1649.
13. Solomon DH, Katz JN, Carrino JA, et al. Trends in knee magnetic resonance imaging. Med Care. 2003;41(5):687-692.
14. Weiner DK, Kim YS, Bonino P, Wang T. Low back pain in older adults: are we utilizing healthcare resources wisely? Pain Med. 2006;7(2):143-150.
15. Levin DC, Rao VM, Parker L, Frangos AJ, Sunshine JH. Recent trends in utilization of cardiovascular imaging: how important are they for radiology? J Am Coll Radiol. 2005;2(9):736-739.
16. Plurad D, Green D, Demetriades D, Rhee P. The increasing use of chest computed tomography for trauma: is it being overutilized? J Trauma. 2007;62(3):631-635.
17. Rosenthal DI, Weilburg JB, Schultz T, et al. Radiology order entry with decision support: initial clinical experience. J Am Coll Radiol. 2006;3(10):799-806.
18. Martin TA, Quiroz FA, Rand SD, Kahn CE Jr. Applicability of American College of Radiology appropriateness criteria in a general internal medicine clinic. AJR Am J Roentgenol. 1999;173(1):9-11.
19. Blackmore CC, Medina LS. Evidence-based radiology and the ACR Appropriateness Criteria. J Am Coll Radiol. 2006;3(7):505-509.
20. Douglas PS. Improving imaging: our professional imperative. J Am Coll Cardiol. 2006;48(10):2152-2155.
21. Sanders DL, Aronsky D. Biomedical informatics applications for asthma care: a systematic review. J Am Med Inform Assoc. 2006;13(4):418-427.
22. Sanders DL, Miller RA. The effects on clinician ordering patterns of a computerized decision support system for neuroradiology imaging studies. Proc AMIA Symp. 2001:583-587.
23. Freeborn DK, Shye D, Mullooly JP, Eraker S, Romeo J. Primary care physicians’ use of lumbar spine imaging tests: effects of guidelines and practice pattern feedback. J Gen Intern Med. 1997;12(10):619-625.
24. Hadley JL, Agola J, Wong P. Potential impact of the American College of Radiology appropriateness criteria on CT for trauma. AJR Am J Roentgenol. 2006;186(4):937-942.
25. Bradley MP, Tung G, Green A. Overutilization of shoulder magnetic resonance imaging as a diagnostic screening tool in patients with chronic shoulder pain. J Shoulder Elbow Surg. 2005;14(3):233-237.
26. Kaups KL, Davis JW, Parks SN. Routinely repeated computed tomography after blunt head trauma: does it benefit patients? J Trauma. 2004;56(3):475-480; discussion 480-481.