The American Journal of Managed Care

Positive Predictive Values of ICD-9 Codes to Identify Patients With Stroke or TIA

Published Online: February 26, 2014
Kari L. Olson, BSc(Pharm), PharmD; Michele D. Wood, PharmD; Thomas Delate, PhD; Lisa J. Lash, PharmD; Jon Rasmussen, PharmD; Anne M. Denham, PharmD; and John A. Merenich, MD
The impact of using the study ICD-9 codes in conjunction with health service indicators was determined by calculating PPVs for each ICD-9 code individually and in combination with various health service indicators. For each combination, only patients who were exposed to the health service indicator and had the ICD-9 code diagnosis were used to calculate the PPV. To assess the impact of ICD-9 code position (primary or secondary), secondary position inpatient codes and patients with only a secondary position inpatient code were removed and PPVs were recalculated. To assess the impact of including TIA, all patients with confirmed TIA were removed and PPVs were recalculated. Because code 436 has excluded “cerebrovascular accident (CVA) NOS, Stroke” since October 1, 2004, and therefore may no longer capture stroke, PPVs for code 436 were calculated separately for codes recorded before and on/after this date. To assess the impact of codes with specific “infarction” terms, codes 433.XX and 434.XX were each categorized by whether or not they contain the infarction term (ie, 433.X1 and 434.X1), and the PPVs were recalculated. Positive predictive values were determined with SAS version 9.1.3 (SAS Institute Inc, Cary, North Carolina) using Proc Freq with weighting by the count of patients having a specific diagnosis and the exact binomial function to determine 95% CIs.
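As a rough illustration of the PPV calculation described above, the following Python sketch computes a PPV and an exact binomial 95% CI from a count of confirmed patients among those flagged by a code. It is not the study's SAS code. The function names are hypothetical, the interval is assumed to be of the Clopper-Pearson type, and the example counts are the overall totals reported for this study (2785 confirmed of 4689 reviewed).

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed in log space to avoid underflow."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    log_n_fact = math.lgamma(n + 1)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (log_n_fact - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def _bisect(f, target, increasing, iters=50):
    """Find p in (0, 1) with f(p) == target for a monotone function f."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def ppv_with_exact_ci(confirmed, flagged, alpha=0.05):
    """Return (ppv, lower, upper) using Clopper-Pearson limits (an assumption)."""
    ppv = confirmed / flagged
    # Lower limit solves P(X >= confirmed | p) = alpha / 2.
    if confirmed == 0:
        lower = 0.0
    else:
        lower = _bisect(lambda p: 1.0 - binom_cdf(confirmed - 1, flagged, p),
                        alpha / 2, increasing=True)
    # Upper limit solves P(X <= confirmed | p) = alpha / 2.
    if confirmed == flagged:
        upper = 1.0
    else:
        upper = _bisect(lambda p: binom_cdf(confirmed, flagged, p),
                        alpha / 2, increasing=False)
    return ppv, lower, upper


# Overall totals reported in this study: 2785 confirmed of 4689 reviewed.
ppv, lower, upper = ppv_with_exact_ci(2785, 4689)
print(f"PPV = {ppv:.1%}, 95% CI {lower:.1%} to {upper:.1%}")
```

The exact interval is wider for small counts than a normal approximation would suggest, which is consistent with the wide CIs seen for codes that identified few patients.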

Patient characteristics were reported as means with standard deviations for interval-level characteristics. These characteristics were assessed for normality of distribution, and appropriate tests (eg, t test, rank-sum test) were used to assess differences between groups. To assess differences in proportions between groups on dichotomous characteristics, Pearson’s χ² test of association was used. The 2-sided alpha level was set at .05.
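For readers unfamiliar with the test of association mentioned above, here is a minimal Python sketch of Pearson's χ² test on a 2×2 table. The function name and the example counts are hypothetical, and the sketch assumes no continuity correction, which the text does not specify.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: rows = confirmed / not confirmed,
# columns = characteristic present / absent.
chi2, p_value = pearson_chi2_2x2(10, 20, 20, 10)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```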


A total of 4689 patients with 10,376 unique study administrative ICD-9 codes were reviewed. Of these, 2785 (59.4%) patients had a cerebral event confirmed by EMR review. The majority of patients had ICD-9 codes from the outpatient setting (82.6%) while 1.3% and 16.1% were from inpatient and both outpatient and inpatient settings, respectively (Table 1). The most commonly identified cerebral event types were non-cardioembolic strokes (34.8%) and TIAs (31.1%). Cerebral event type was unknown in 15.4% of cases. Patients with confirmed cerebral events had a higher mean count of unique ICD-9 codes, were slightly older, more likely to have purchased a prescription antiplatelet drug, and more likely to have had CT or MRI imaging.

Positive predictive values for “intracerebral hemorrhage” (431), “acute but ill-defined cerebrovascular disease” (436), and “personal history of stroke” (V12.54) were greater than 90% when recorded in both the inpatient and outpatient settings but identified small numbers of patients; thus, associated 95% CIs were wide (Table 2). “Occlusion of cerebral arteries” (434) recorded in either the inpatient-only or in both the inpatient and outpatient settings also achieved PPVs greater than 90% and identified large numbers of patients. Codes 434 and V12.54 in the outpatient-only setting identified the most patients and had reasonably high PPVs (both 89%). “Other and unspecified intracranial hemorrhage” (432) and “other and ill-defined cerebrovascular disease” (437) performed poorly regardless of setting (PPVs <50%). Overall, codes recorded in both the inpatient and outpatient settings yielded the highest PPVs but identified fewer patients compared with those recorded only in the inpatient or outpatient settings.

Hemorrhagic stroke codes identified fewer patients than ischemic stroke codes (Table 3). Hemorrhagic stroke codes recorded only in the inpatient setting had higher PPVs than outpatient-only codes. Overall, ischemic codes tended to have higher PPVs than hemorrhagic codes (Table 4).

Approximately 15% of code 436 patients had their code recorded on/after October 1, 2004, and the vast majority of these (99%) were in the outpatient setting. Nevertheless, there was no appreciable change in PPV before and after 436’s coding modification. Code 433.X1 with specific mention of infarction did increase the PPV appreciably but only approximately 8% of patients with a 433.XX code had a 433.X1 code. Code 434.X1 with specific mention of infarction did not alter the PPV appreciably over presence of the code 434.XX (Table 2).
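The setting-specific results above depend on grouping each patient's code by where it was recorded. The Python sketch below shows one way such a tabulation could be done. The record layout, function names, and example data are all hypothetical and are not taken from the study, and the sketch assumes the unit of analysis is one patient per code.

```python
from collections import defaultdict


def classify_setting(settings):
    """Collapse the set of settings where a patient's code appeared into one category."""
    if settings == {"inpatient"}:
        return "inpatient only"
    if settings == {"outpatient"}:
        return "outpatient only"
    if settings == {"inpatient", "outpatient"}:
        return "both"
    raise ValueError(f"unexpected settings: {settings}")


def ppv_by_code_and_setting(records, confirmed_ids):
    """records: iterable of (patient_id, code, setting) tuples.

    Returns {(code, category): (confirmed, flagged, ppv)}.
    """
    seen = defaultdict(set)  # (patient_id, code) -> settings where recorded
    for patient_id, code, setting in records:
        seen[(patient_id, code)].add(setting)

    tallies = defaultdict(lambda: [0, 0])  # key -> [confirmed, flagged]
    for (patient_id, code), settings in seen.items():
        key = (code, classify_setting(settings))
        tallies[key][1] += 1
        if patient_id in confirmed_ids:
            tallies[key][0] += 1
    return {key: (c, n, c / n) for key, (c, n) in tallies.items()}


# Hypothetical example data, not study data.
records = [
    (1, "434", "inpatient"), (1, "434", "outpatient"),
    (2, "434", "outpatient"),
    (3, "434", "outpatient"),
    (4, "437", "outpatient"),
    (5, "437", "outpatient"),
]
confirmed_ids = {1, 2}  # patients with an event confirmed on chart review

for key, (confirmed, flagged, ppv) in sorted(ppv_by_code_and_setting(records, confirmed_ids).items()):
    print(key, f"{confirmed}/{flagged}", f"PPV = {ppv:.0%}")
```

Reporting the flagged count alongside each PPV makes the trade-off between accuracy and yield visible for every code and setting.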

In general, inclusion of health service indicators and exclusion of secondary inpatient diagnoses and TIA patients identified fewer patients, widened 95% CIs, and did not change the PPVs substantially. Overall, inclusion of diagnostic imaging did not change PPV estimates, nor did inclusion of neurology visits. Inclusion of prescription antiplatelet exposure slightly improved PPVs (mean difference in PPVs across care settings = 1.5 [± 4.2], median difference in PPVs across care settings = 1.0). However, for codes 434 and V12.54, inclusion of prescription antiplatelet exposure resulted in more marked improvement in PPV (mean = 8.6 [± 9.7], median = 4). Removing secondary position inpatient codes did not affect the PPVs appreciably (mean = –0.5 [± 3.0] and median = 0), but removing patients with a confirmed TIA (code 435) slightly reduced PPVs overall (mean = –2.0 [± 2.5] and median = –1).
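The comparison described above can be sketched in Python as follows: compute the PPV among all patients flagged by a code, recompute it among only those also exposed to a health service indicator, and then summarize the differences. All names and data here are hypothetical, and the bracketed ± values in the text are assumed to be standard deviations.

```python
from statistics import mean, median, stdev


def ppv(patients):
    """patients: list of dicts with a boolean 'confirmed' key. None when empty."""
    if not patients:
        return None
    return sum(p["confirmed"] for p in patients) / len(patients)


def ppv_with_and_without_indicator(patients, indicator):
    """PPV among all flagged patients vs. only those exposed to the indicator."""
    base = ppv(patients)
    restricted = ppv([p for p in patients if p[indicator]])
    return base, restricted


def summarize_differences(differences):
    """Mean, sample standard deviation, and median of PPV differences."""
    return mean(differences), stdev(differences), median(differences)


# Hypothetical patients flagged by one code in one setting.
flagged = [
    {"confirmed": True, "antiplatelet": True},
    {"confirmed": True, "antiplatelet": True},
    {"confirmed": False, "antiplatelet": False},
    {"confirmed": False, "antiplatelet": True},
]
base, restricted = ppv_with_and_without_indicator(flagged, "antiplatelet")
print(f"PPV {base:.0%} -> {restricted:.0%} when antiplatelet exposure is required")

# Hypothetical PPV differences (percentage points) across code/setting cells.
print(summarize_differences([1.0, -2.0, 4.0]))
```

Restricting to exposed patients always shrinks the denominator, which is why the CIs widen even when the PPV itself barely moves.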


Disease registries provide opportunities for health systems to improve management of patients with chronic disease states. Initial patient identification using coded administrative data is an important part of developing a validated patient registry, particularly if the codes have high PPV. Using a standardized abstraction tool adapted from the Rochester Minnesota Stroke study form,18 we found that only 60% of patients identified from administrative data using cerebrovascular ICD-9 codes had a confirmed cerebral event. We found that the settings where ICD-9 codes were recorded influenced both the accuracy of diagnosis and the yield of identified cases. Codes recorded in both inpatient and outpatient settings had higher absolute PPVs, but identified fewer patients, than codes recorded in only 1 of these settings. Attempts to improve the accuracy of ICD-9 codes through various combinations with health service indicators produced, at best, only moderate improvements, with the exception of combining purchases of prescription antiplatelets with codes 434 and V12.54. Incorporating the setting where the ICD-9 codes were recorded and using combinations of health service indicators are unique aspects of our study. Nevertheless, in most cases, our PPV estimates were associated with wide 95% CIs despite a relatively large sample size, suggesting that administratively coded data elements may lack sufficient accuracy to be relied on without confirmation of cerebral events via medical record review.

Several inpatient studies4-7,9-16 have evaluated the accuracy of ICD-9 codes for identifying ischemic stroke by assessing the sensitivity, specificity, and/or PPV of ICD-9 diagnosis codes 430 to 438. These studies also demonstrated less than optimal accuracy of ICD-9 codes in identifying confirmed stroke patients. One study reported a PPV of only 47% for ICD-9 codes between 430 and 438 for correctly identifying incident stroke events.5 Additionally, several studies have revealed that registries derived from hospital discharge codes overestimate stroke.5,6,16 While our use of ICD-9 codes recorded in inpatient and/or outpatient settings appears to modestly improve the accuracy of identifying confirmed cerebral events, the use of ICD-9 codes alone appears to lead to a high percentage of false-positive diagnoses, such that about 40% of events identified with these commonly used ICD-9 codes are not confirmed cerebral events.6

Benesch and colleagues found that limiting inpatient ICD-9 codes to those listed in the primary discharge position increased stroke PPV.10 We hoped to increase event capture by using ICD-9 codes recorded in the primary or secondary discharge positions. Nevertheless, we also performed a subanalysis using only the primary position and found that removing secondary position codes did not appreciably change our inpatient PPVs. Our results may have differed from theirs because we had relatively low rates of false-positive strokes.

We reported PPV estimates for codes recorded only in the outpatient setting, capturing patients who may or may not have been hospitalized for treatment prior to enrolling in our health plan. We found that considerably more patients were identified in the outpatient setting. Similar to prior studies, we found that ICD-9 codes 434 and 436 had high PPVs for patients with confirmed ischemic stroke.10,11 We were able to show that the PPVs for these codes slightly improved when the codes were recorded in both the inpatient and outpatient settings (97% and 93%, respectively). Since October 1, 2004, when 436 coding changes were implemented, the number of patients with this code decreased considerably, making the utility of this code to identify patients with stroke less robust.


Issue: February 2014