Identifying Patients With Osteoporosis or at Risk for Osteoporotic Fractures
Yong Chen, MD, PhD; Leslie R. Harrold, MD, MPH; Robert A. Yood, MD; Terry S. Field, DSc; and Becky A. Briesacher, PhD
Case identification of patients at risk for an osteoporotic fracture remains a challenging problem, especially when using administrative data. Approximately 50% of osteoporosis (OP) is undiagnosed or undertreated, and testing for low bone mineral density (BMD) is often not conducted even after a fracture.1-4 For instance, over 20% of adults 65 years or older never received a BMD test or diagnosis of OP after experiencing an osteoporotic fracture, as captured in 7 years of follow-up Medicare claims data.5 A diagnosis of OP may also be underrepresented in administrative data since secondary conditions are less likely to be treated and recorded when patients have multiple chronic medical diseases.6 Furthermore, osteoporotic fracture risk encompasses more clinical factors than just the diagnosis of OP and many of these factors are not typically captured in administrative data. Weight, likelihood of falling, low BMD, family history of fracture, smoking, and alcohol use are significant predictors of osteoporotic fracture yet are unmeasured confounders in claims-based studies. As a result, characterizing patients in administrative data by their OP diagnosis and clinical risk profile for osteoporotic fracture is susceptible to misclassification error.
Despite these limitations, administrative data remain a convenient and important source for studying patient populations who might be suitable candidates for receiving OP prescriptions to prevent osteoporotic fracture. The challenge is to create a valid approach for correctly identifying the at-risk patient population, given the data limitations described above. Only 3 studies have addressed this issue in administrative data, and each has limitations. One study found that a combination of diagnostic information available in administrative data and fracture claims may be used to identify individuals with OP, although that strategy has not been validated against BMD criteria.5 The other 2 studies found that prescriptions for OP drugs are a reliable marker of OP7,8 but only after using additional BMD data to identify patients with osteopenia and exclude them from the study sample.5 Neither of these 2 approaches can identify untreated patients, because both used receipt of OP prescriptions in their case-identification algorithms.
The goal of the present study is to create and validate case-identification algorithms suitable for finding patients with OP or low BMD who may require OP prescriptions. The algorithm must use only data elements commonly found in administrative data, yet not incorporate the receipt of OP prescriptions. Furthermore, the algorithm must incorporate clinical risk factors for OP fracture as defined by the World Health Organization (WHO) Fracture Risk Assessment (FRAX) Tool clinical risk factors (CRFs).
METHODS
Data Source and Study Population
We used claims data from January 1, 2008, through December 31, 2009, for approximately 100,000 members of a managed care plan who received their care from a multispecialty group practice in central Massachusetts. The database captures information on demographics, enrollment, outpatient encounters, hospitalizations, diagnostic and therapeutic procedures, and dispensings of prescription medications. The claims data were linked to a clinical data set that contained results of BMD testing.
We first selected women who received a BMD test or had a diagnosis of OP (n = 5340). We included women 50 years or older who had continuous enrollment in the managed care plan for 2 years and who had a BMD test in 2009. Since the BMD test is not usually conducted every year, a patient with a BMD test in 2009 is not likely to have had another BMD test in 2008. We excluded women who had less than 1 year of enrollment before the BMD test.
BMD was measured by dual-energy x-ray absorptiometry (DXA) scan (Hologic, Waltham, Massachusetts) performed in the group practice’s facility. The diagnosis was based on the T-score at the lumbar spine, femoral neck, or total hip. We first adopted the WHO criterion, T-score ≤–2.5, as the gold standard to define OP.9 However, in actual clinical practice, treatment decisions are based not only on T-score but also on risk factors for osteoporotic fracture.10 For instance, a physician may prescribe OP treatment for a woman with a T-score of –2.0 if she presents with 1 or more osteoporotic fracture risk factors. Therefore, we also applied an expanded diagnostic criterion, T-score ≤–2.0.
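The two reference standards can be expressed as a simple rule on the lowest T-score across the measured sites. The following is a minimal sketch, not the study's code; the function and constant names are hypothetical, and it assumes that a patient meets a criterion when any measured site is at or below the threshold.

```python
from typing import Iterable

WHO_THRESHOLD = -2.5       # strict WHO criterion for OP
EXPANDED_THRESHOLD = -2.0  # expanded criterion described in the text


def meets_criterion(t_scores: Iterable[float], threshold: float) -> bool:
    """Return True if the lowest T-score across sites is at or below threshold.

    Assumption: t_scores holds the available site-level results
    (lumbar spine, femoral neck, total hip) for one patient.
    """
    scores = list(t_scores)
    if not scores:
        raise ValueError("at least one site T-score is required")
    return min(scores) <= threshold
```

Under these assumptions, a woman with site T-scores of –1.0, –2.2, and –1.9 would not meet the WHO criterion but would meet the expanded one.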
We identified clinical risk factors for osteoporotic fracture based on the WHO FRAX tool.11 The FRAX tool, introduced in 2008, calculates the 10-year probability of a major osteoporotic fracture and of a hip fracture using easily obtained CRFs and bone density information when available.12 The CRFs in the FRAX tool include weight, height, history of fracture, parental history of hip fracture, current tobacco smoking, ever long-term use of an oral glucocorticoid, rheumatoid arthritis (RA), other cause of secondary OP, and daily alcohol consumption of 3 or more units. Most administrative data sets do not include height, weight, or family history. However, we were able to identify 5 CRFs or proxies of CRFs: history of an osteoporotic fracture, use of an oral glucocorticoid medication, diagnosis of RA, diagnosis of chronic obstructive pulmonary disease (COPD) as a proxy for smoking, and diagnosis of alcoholism as a proxy for current daily consumption of 3 or more units of alcohol.13
Only risk factors identified during the period before the BMD test (average 1.5 years) were included. History of an osteoporotic fracture was defined as having a fracture of the wrist, hip, or proximal humerus and was identified using International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) and Current Procedural Terminology (CPT) codes, as developed by previous research.14,15 Glucocorticoid use was defined as receiving oral prednisone, methylprednisolone, hydrocortisone, cortisone, prednisolone, triamcinolone, dexamethasone, or budesonide for at least 90 days; oral glucocorticoids were identified using National Drug Codes. A patient was considered to have RA if she had a diagnostic code of 714.xx. Because smoking status is not readily identified from administrative data, we adopted a previously used method that treats a COPD diagnosis as a proxy for smoking,13 with COPD identified using diagnostic codes.16,17 The markers for alcoholism were ICD-9-CM codes 303.xx18 for alcohol dependence syndrome and 305.0x for nondependent alcohol abuse.
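The risk-factor definitions above can be illustrated with a short sketch. This is not the study's implementation: the record layout and all names are hypothetical, ICD-9-CM codes are assumed to be stored as dotted strings (for example "714.0"), and the 90-day glucocorticoid rule is assumed to mean cumulative days supplied. The fracture and COPD flags are taken as inputs because their code lists come from the cited references and are not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


def _in_category(code: str, category: str) -> bool:
    """True if a dotted ICD-9-CM code falls in a 3-digit category (e.g. 714.xx)."""
    return code == category or code.startswith(category + ".")


@dataclass
class BaselineClaims:
    """Hypothetical summary of one patient's claims before the BMD test."""
    dx_codes: Sequence[str]         # ICD-9-CM diagnosis codes, dotted strings
    oral_glucocorticoid_days: int   # assumed cumulative days supplied
    has_fracture: bool              # from fracture code lists (not shown)
    has_copd: bool                  # from COPD code lists (not shown)


def risk_factors(p: BaselineClaims) -> Dict[str, bool]:
    """Flag the 5 claims-based FRAX risk factors or proxies."""
    return {
        "prior_fracture": p.has_fracture,
        "glucocorticoid": p.oral_glucocorticoid_days >= 90,
        "rheumatoid_arthritis": any(_in_category(c, "714") for c in p.dx_codes),
        "smoking_proxy_copd": p.has_copd,
        # 303.xx alcohol dependence syndrome; 305.0x nondependent alcohol abuse
        "alcohol_proxy": any(
            _in_category(c, "303") or c.startswith("305.0") for c in p.dx_codes
        ),
    }


def is_high_risk(p: BaselineClaims) -> bool:
    """High risk here means at least 1 of the 5 identifiable risk factors."""
    return any(risk_factors(p).values())
```

Note that a code such as 305.1 does not match the 305.0x pattern and so is not counted as an alcohol marker in this sketch.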
Algorithms for OP Classification and Possible Error Correction
We developed 6 case-identification algorithms (Table 1) that included number of OP diagnoses, setting of OP diagnosis (inpatient or outpatient), and timing of OP diagnosis relative to BMD test. The first 3 algorithms were based on diagnostic codes of OP from outpatient encounters and hospital claims: (1) at least 1 ICD-9-CM code for OP in outpatient encounters or hospital claims, (2) at least 1 code for OP in hospital claims or at least 2 in outpatient encounter claims, and (3) at least 2 codes for OP regardless of setting (Algorithms 1 to 3, Table 1).
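As an illustration, the first 3 algorithms reduce to simple rules on the number of OP-coded claims in each setting. The sketch below is a hedged reading of the definitions above; the function names are hypothetical, and it assumes the inputs are counts of claims carrying an ICD-9-CM code for OP in outpatient encounters and in hospital claims.

```python
def algorithm_1(n_outpatient: int, n_hospital: int) -> bool:
    """At least 1 OP code in outpatient encounters or hospital claims."""
    return n_outpatient + n_hospital >= 1


def algorithm_2(n_outpatient: int, n_hospital: int) -> bool:
    """At least 1 OP code in hospital claims, or at least 2 in outpatient claims."""
    return n_hospital >= 1 or n_outpatient >= 2


def algorithm_3(n_outpatient: int, n_hospital: int) -> bool:
    """At least 2 OP codes regardless of setting."""
    return n_outpatient + n_hospital >= 2
```

Under this reading, a patient with a single hospital claim and no outpatient claims is a case under algorithms 1 and 2 but not under algorithm 3, which shows how the rules become progressively stricter.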
We then repeated the 3 algorithms above for the sample after possible error correction (Algorithms 4-6, Table 1). A patient may be tentatively diagnosed as having OP as a justification for a BMD test, but the BMD test result may actually rule out this diagnosis. Therefore, we corrected for this possible error by removing the OP diagnosis if (1) all of the ICD-9 codes for OP appeared before the BMD test, and (2) the patient had at least 3 months’ enrollment after the BMD test.
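The correction rule can be sketched as a filter applied to a patient's OP diagnosis dates before the counting rules are applied. This is an illustrative sketch only: the function name and arguments are hypothetical, "3 months" is approximated as 90 days, and a code dated on the day of the BMD test is assumed not to count as appearing before it.

```python
from datetime import date
from typing import List


def corrected_op_dates(
    op_dx_dates: List[date],
    bmd_date: date,
    enrollment_end: date,
    min_followup_days: int = 90,  # assumption: 3 months taken as 90 days
) -> List[date]:
    """Drop all OP diagnoses when they look like rule-out codes.

    Diagnoses are removed only if (1) every OP code precedes the BMD test and
    (2) the patient stayed enrolled long enough after the test for a
    confirming code to have appeared.
    """
    all_before_test = all(d < bmd_date for d in op_dx_dates)
    enough_followup = (enrollment_end - bmd_date).days >= min_followup_days
    if op_dx_dates and all_before_test and enough_followup:
        return []
    return list(op_dx_dates)
```

The follow-up requirement matters here: without it, a patient who disenrolled shortly after testing would lose a diagnosis simply because there was no opportunity to observe a later code.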
We described the distribution of patient characteristics stratified by BMD test results (T-score >–2.0, –2.5 < T-score ≤–2.0, and T-score ≤–2.5). We compared means using analysis of variance (ANOVA) and proportions using the χ2 test or Fisher’s exact test. We validated the performance of our OP classification algorithms against the WHO standard, T-score –2.5 or lower, and then against an expanded diagnostic criterion, T-score –2.0 or lower. We calculated sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and area under the receiver operating characteristic curve (AUC) for each algorithm. Finally, we applied the same algorithms to the high-risk population, defined as patients with at least 1 of the 5 FRAX risk factors that we were able to identify. Analyses were performed with SAS 9.2 (SAS Institute, Cary, North Carolina).
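For readers less familiar with these measures, the sketch below computes them from a 2x2 table of algorithm result against BMD reference standard. It is written in Python for illustration and is not the SAS code used in the study; the class and function names are hypothetical. For a single yes/no classifier, the ROC curve has one operating point, and with straight-line interpolation the AUC equals the mean of sensitivity and specificity; whether the study computed AUC in exactly this way is an assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of algorithm result (positive/negative) by reference standard."""
    tp: int  # algorithm positive, T-score meets criterion
    fp: int  # algorithm positive, T-score does not meet criterion
    fn: int  # algorithm negative, T-score meets criterion
    tn: int  # algorithm negative, T-score does not meet criterion


def validation_metrics(t: TwoByTwo) -> Dict[str, float]:
    """Return sensitivity, specificity, PPV, NPV, and one-point AUC.

    Assumes every row and column total of the table is nonzero.
    """
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        # One-point ROC curve: area is the mean of sensitivity and specificity
        "auc": (sensitivity + specificity) / 2,
    }
```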
This study was approved by the Saint Vincent Hospital/Fallon Clinic/Fallon Community Health Plan Institutional Review Board, Worcester, MA.
There were 5340 patients who had at least 1 ICD-9-CM diagnosis of OP or BMD test results between January 1, 2008, and December 31, 2009. We excluded 733 patients who were not continuously enrolled. We further excluded 1987 patients who did not receive a DXA test. The study sample included 2520 patients (Figure).
Table 2 describes characteristics of the study population (n = 2520) stratified by BMD test results. We found that 25.7% (n = 647) had a T-score ≤–2.5 at 1 or more sites measured (lumbar spine, femoral neck, or total hip) and 25.0% (n = 631) had at least 1 of the FRAX risk factors for osteoporotic fracture as measured in our study. Compared with women with a T-score greater than –2.5, those with a T-score less than or equal to –2.5 were older (age 74.3 years for T-score ≤–2.5 vs 71.2 years for T-score between –2.5 and –2.0 vs 67.6 years for T-score >–2.0; P <.001). Women with T-scores less than or equal to –2.5 also had more fracture risk factors identified in administrative data. For instance, 29.5% of women with a T-score less than or equal to –2.5 had more than 1 risk factor compared with 25.5% and 22.4% for women with a T-score from –2.5 to –2.0 and women with a T-score greater than –2.0, respectively.
Table 3 shows the sensitivity, specificity, PPV, NPV, and AUC of each algorithm when the diagnostic criterion was a T-score less than or equal to –2.5 (scenarios 1 and 2). We found in scenario 1 (the total study population, n = 2520) that sensitivity varied from 34.9% (algorithm 6) to 80.4% (algorithm 1); specificity from 65.3% (algorithm 1) to 92.8% (algorithms 5 and 6); PPV from 43.1% (algorithm 4) to 63% (algorithm 5); and AUC from about 0.636 (algorithm 4) to 0.737 (algorithm 2). After restricting to women with at least 1 FRAX risk factor (scenario 2, n = 631), sensitivity, specificity, PPV, NPV, and AUC changed only slightly. Sensitivity ranged from 38.7% to 83.2%; specificity from 59.1% to 90.5%; PPV from 46.9% to 64.1%; and AUC from 0.646 to 0.718.
Table 4 shows the sensitivity, specificity, PPV, and NPV when the diagnostic criterion was a T-score less than or equal to –2.0 (scenarios 3 and 4). Compared with the statistics in scenario 1, sensitivity (ranging from 22.8% to 63.1%) decreased; specificity changed slightly (ranging from 71.5% to 95%); and PPV improved substantially, varying from 66.5% to 83.1% in scenario 3 (the total study population, n = 2520). After restricting to the population with at least 1 of the 5 FRAX risk factors we identified (scenario 4), we found that sensitivity was still lower than that in scenario 1 and specificity changed slightly. However, the PPV was the highest among all 4 scenarios, ranging from 70.6% to 85.0%.
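A rise in PPV under the expanded criterion is consistent with the general dependence of PPV on the proportion of patients who meet the reference standard, since more women satisfy T-score ≤–2.0 than T-score ≤–2.5. The sketch below applies Bayes' theorem to show this relationship; the function name and the numeric inputs in the usage note are illustrative assumptions, not study data.

```python
def ppv_from_prevalence(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value implied by sensitivity, specificity, and prevalence.

    PPV = true positives / all positives, expressed per unit of population.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)
```

With sensitivity and specificity held fixed at hypothetical values of 0.8 and 0.7, the function returns a higher PPV at a prevalence of 0.50 than at 0.25, which illustrates why PPV can improve even when sensitivity falls.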
To our knowledge, our study is the first to test the validity of identifying patients who have OP or who have low BMD (osteopenia) and additional risk factors for osteoporotic fractures, using algorithms applied to administrative data in a US population. We used the results of BMD tests as the diagnostic standard and assessed the performance of our algorithms using a strict OP criterion (T-score ≤–2.5) and an expanded criterion (T-score ≤–2.0) in a population with at least 1 of the 5 FRAX risk factors we were able to identify in administrative data. Our work is also unique in that we did not require information about use of OP medications, and thus our methods are suitable for identifying the at-risk patient population before OP intervention. It is possible, however, that prescription claims are the only source for identifying OP in some patients. In our study, we found that 489 patients had a bisphosphonate prescription (alendronate or risedronate) before the BMD test. We conducted a sensitivity analysis incorporating these 2 commonly used bisphosphonates into our algorithms and found that the case-identification results remained similar (data not shown).
Our study is also unique in that we incorporated 5 CRFs or proxies of CRFs from the WHO FRAX tool (RA, previous fracture, alcohol use, smoking, and history of treatment with oral glucocorticoids) into the risk assessment. This expansion incorporates current clinical recommendations that treatment to prevent osteoporotic fractures may also be appropriate for patients with a T-score better than –2.5 if other risk factors are present.10,12 In fact, we found that FRAX information did improve predictive power when the diagnostic criterion was a T-score less than or equal to –2.0, but not for the scenarios where the diagnostic criterion was a T-score less than or equal to –2.5. This suggests that, in administrative data, incorporating FRAX risk factors may improve the performance of case-identification algorithms more for osteopenia than for OP. Furthermore, other predictors, such as age, may improve the performance of the study algorithms because older age is associated with OP. However, when we stratified the sample by age (>65 or <65 years of age), we found the algorithms performed similarly in the stratified and unstratified samples. Therefore, we report the results without stratification. A possible reason for this observation is that we have an older population with a median age of 70 years and a 25th percentile of 63 years.
We show that our algorithms achieved reasonable sensitivity, specificity, PPV, and AUC (sensitivity ranged from 34.9% to 80.4%; specificity from 65.3% to 92.8%; PPV from 43.1% to 65.3%; and AUC from 0.636 to 0.737, as shown in Table 3). These ranges are consistent with those found in previous research using prescription drug data from Canada (sensitivity ranged from 34.1% to 93.3%; specificity from 50.8% to 91.4%; PPV from 48.1% to 64.9%; and AUC from 0.627 to 0.754).7 However, we did not find a single algorithm that surpassed the others. Different algorithms offered different advantages, depending on the objectives for case identification. For instance, if we want to estimate the prevalence of OP in the population, our simplest algorithm of requiring only 1 diagnostic code from any setting (eg, algorithm 1, scenario 1) should provide a reasonable estimate due to high sensitivity. If, however, the purpose of case identification is to reliably find possible candidates for OP treatment, then algorithm 1 in scenario 4 is more appropriate due to the high sensitivity and PPV.
Our algorithms also have other implications. First, the algorithms can be used to identify untreated patients in claims data with OP or low BMD who may need OP intervention. Second, when we required a diagnosis code of OP after a DXA test for case identification, the specificity of the algorithms increased (scenario 1, algorithms 4-6 vs 1-3 [Table 3]). However, we found this resulted in decreased sensitivity. A potential reason for this may be that physicians failed to code for the OP diagnosis because it may be considered as having lower priority than other health conditions. Further studies are warranted to understand the under-documentation of OP.
We found a large number of patients with a diagnosis of OP in their administrative records who did not undergo a BMD test during the study observation period (n = 1987, in the Figure). When we compared the characteristics of patients with and without a BMD test, we found that patients without a BMD test were generally older and had more FRAX risk factors than those who underwent a BMD test. Possible reasons for these differences are that (1) patients without a BMD test were tested and diagnosed before the study period, or (2) testing for OP may not be a priority in patients who are older and already have clinical risk factors for osteoporotic fractures.
Our study has some limitations. First, although our estimates of sensitivity, specificity, and PPV are similar to those of a previous study in Canada, our sample is from 1 managed care plan in the United States and may not be generalizable to other populations.7 Second, we only included patients who had evidence of a DXA bone density test to confirm the presence of OP. Therefore, the study results cannot be generalized to patients who have OP but are never tested for it. Third, some of the FRAX risk factors were more amenable to being identified in administrative data than others. Thus, we underestimated some individual clinical risk factors in our population (eg, alcoholism as a proxy of alcohol use) and could not identify others (eg, height and weight). There also may be misclassification in the identification of RA patients since we only required 1 diagnostic code. We also had incomplete information on prior fractures because we only included data 1.5 years prior to the bone density test. Fourth, the baseline period prior to the bone density test varied from 1 year to 2 years, which might result in bias because patients with a longer baseline period would be more likely to have a risk factor identified. To address this issue, we conducted a sensitivity analysis using an equal 1-year baseline period for all patients. The number of patients with at least 1 fracture risk factor fell from 631 to 541. However, classification statistics (sensitivity, specificity, and PPV) did not significantly change (data not shown). Fifth, we considered a woman with at least 1 risk factor as being at high risk for fracture. However, the FRAX includes 10 risk factors and a patient may need more than 1 to be considered as being at higher risk by FRAX calculation. Sixth, our algorithms could not overcome important problems noted in the course of conducting the study, namely inconsistent use of an OP diagnosis in relation to BMD testing. For example, approximately 20% of the study population who underwent BMD testing and had a T-score less than or equal to –2.5 did not have any OP diagnosis in their administrative records, and 15% of the study population who had a T-score >–2.5 had an OP diagnosis after the BMD test. Lastly, there was no diagnostic code in our data set for identifying low BMD (osteopenia).
In conclusion, using only administrative data, we were able to validly identify patients with OP or those with low BMD at increased risk for osteoporotic fracture, with algorithms based on diagnoses and clinical risk factors. Our algorithms may therefore be useful in estimating the prevalence of OP and identifying untreated patients who could benefit from OP treatment.