A User's Guide to the Disease Management Literature: Recommendations for Reporting and Assessing Program Outcomes

The American Journal of Managed Care, February 2005, Volume 11, Issue 2

Recently there has been tremendous growth in the number of lay-press articles and peer-reviewed journal articles reporting extraordinary improvements in health status and financial outcomes due to disease management (DM) interventions. However, closer scrutiny of these reports reveals serious flaws in research design and/or analysis, leaving many to question the veracity of the claims. In recent years, there have been numerous contributions to the literature on how to assess the quality of medical research papers. However, these guidelines focus primarily on randomized controlled trials, with little attention given to the observational study designs typically used in DM outcome studies. As such, general guides to evaluating the medical literature are inadequate in their utility to assist authors and readers of DM outcomes research. The purpose of this paper is to provide authors with a clear and comprehensive guide to the reporting of DM outcomes, as well as to educate readers of the DM literature (both lay and peer reviewed) in how to assess the quality of the findings presented.

(Am J Manag Care. 2005;11:113-120)

Until recently, disease management (DM) has largely been able to avoid scrutiny of its methods for assessing effectiveness in attaining positive health and financial outcomes. Unfortunately, this has led to the reporting of incredible achievements in the lay and industry press that have left many questioning the veracity of these claims.1-5

Similarly, DM outcome studies in peer-reviewed literature have reported extraordinary results, at times as a consequence of poor study design. Some basic issues include the use of a pre-post study without a control group, or the misguided application of a more robust design; not addressing biases that may threaten the validity of the results; inadequate description of research methods or characteristics of the population; and inappropriate use or lack of statistical analysis.6-14 Attempts have been made recently to address these shortcomings publicly.15-18 However, a more methodical approach to both designing and reviewing DM evaluation and research studies is needed.

In recent years there have been numerous contributions to the literature on how to assess the quality of medical research papers, with perhaps the most notable among them being a series of articles appearing in the Journal of the American Medical Association.19-50 However, these guidelines focus primarily on randomized controlled trials (RCTs), with little attention given to the observational study designs typically used in DM outcome studies. As such, general guides to evaluating the medical literature are inadequate in their utility to assist authors and readers of DM outcomes research. Therefore, the purpose of this paper is to provide authors with a clear and comprehensive guide to the reporting of DM outcomes, as well as to educate readers of the DM literature (both lay and peer reviewed) in how to assess the quality of the findings presented.


Fundamentally, the objective of any thoughtful critique is to ascertain whether the reported results represent an unbiased estimate of a treatment effect, or whether they were influenced by factors other than the intervention. To make this determination, one must consider 2 major elements in any evaluation or research study: the study design and the analysis performed on the data. The Figure presents a framework for assessing the quality of the study design and analysis used in RCTs and observational studies. As shown, many methodological issues overlap, while others are specific to the given design category. The items are ordered temporally to coincide with each phase of the study.

Study Design



The 2 predominant categories of study design relevant to DM research are experimental (better known as the RCT) and quasi-experimental (generally referred to as an observational study design).51-53 The most basic difference between these 2 categories lies in how subjects are assigned to the study. As the name implies, in the RCT, individuals are randomly assigned either to a treatment or a control group, thereby giving each person an equal probability to be chosen for the intervention. Conversely, in an observational study design, eligible individuals are not randomly assigned to the treatment or the control group. In DM, patients and/or their physicians are commonly allowed to decide who will participate in program interventions. This type of assignment process results in a nonrandom distribution of individuals to the intervention and nonintervention groups.

The value of a randomized assignment process is that all variability is distributed equally between the 2 groups.54 Variability comes in 2 forms: observed and unobserved. Observed covariates are characteristics that can be measured by the analyst via sources such as claims, medical records, member files, or survey reports; unobserved covariates are all other characteristics not captured or recorded. Although observed covariates are used for ensuring that subjects in the 2 groups are similar on baseline characteristics (eg, age, sex, disease status), it is left to the process of randomization to ensure that unobserved characteristics are similar in both groups as well. Observational study designs are susceptible to bias precisely because they cannot control for unobserved covariates, and therefore cannot provide unbiased estimates of treatment effect.

The launching point for any study regardless of design category (RCT or observational) is a definition of the study population. It is important that the individuals eligible for inclusion in the study be representative of the population to which the findings will be applied. For example, many studies exclude women, the elderly, or individuals with multiple illnesses.55 This obviously limits the ability to generalize results across patients outside of this study population.56 In DM, program participants typically are not representative of the general population with the disease. By design, program administrators target those patients who are either the sickest or at the highest risk of utilizing services. Therefore, it is important for the researcher and the reader of the DM outcomes literature to recognize the limitations of generalizability of the study findings. A good definition of the study population would include a description of the inclusion/exclusion criteria, and clinical/demographic characteristics of both the treatment and control groups.

The second attribute of study design to consider is the process by which individuals engage in either the treatment or nontreatment group. Strict adherence to the assignment process is absolutely crucial in an RCT. As stated earlier, the basic tenet behind randomization is that it distributes unobserved variation evenly between groups. Imagine if the assignment process allowed a patient's physician to determine study participation. Bias would be introduced if that physician relied on personal judgment to determine whether the patient should or should not be included. Studies in which the process of random assignment was inadequately described or not described at all have been shown to exaggerate the size of the observed treatment effect.57,58 In observational studies (DM studies in particular), assignment is usually determined through self-selection. Individuals eligible for the study or program intervention are invited to participate. The factors determining why a given individual chooses to participate while another individual does not are at the crux of the issue that differentiates RCTs from observational studies. It has been well demonstrated59,60 that myriad factors (eg, belief systems, enabling factors, perceived need) help explain why and how individuals access healthcare and perform health-related behaviors. Collecting as much information as possible on those eligible individuals who choose to participate as well as on those who decline participation may assist the researcher in identifying those differential characteristics. Similarly, unusual features of one group or another must be described for readers.

Assessing comparability between the study group and the control group on baseline characteristics is the next element of study design to consider. Baseline comparability of groups is an essential step in determining a causal link between study or program intervention and outcome.54 Most DM programs are currently being evaluated using a pre-post design with no control group. The most basic limitation of this design is that without a control group for which comparisons of outcomes can be made, several sources of bias and/or competing extraneous confounding factors offer plausible alternative explanations for any change from baseline.61 Advocates of this approach argue that most threats to validity are nullified by using the entire population in the analysis.62 However, unless some basic factors are controlled for, such as case mix and turnover rate, bias still remains a significant concern. Even with these controlling variables in place, the pre-post method can be confounded with environmental changes unrelated to the DM program interventions.

Given these concerns, it is absolutely necessary to develop a control group with which comparisons can be made. Considering that DM programs or their payers are not likely to withhold potentially beneficial interventions from eligible individuals by assigning them to the control group, statistical methods can be used to match participants to historic controls.63 That said, some studies have included control groups in their evaluations.14 In both RCTs and observational studies, comparability can only be assessed on observed characteristics. Therefore, it is extremely important that the research study include either a table or detailed description of demographic and clinical attributes of the treatment and control groups. If cohorts differ on important observed baseline features, causal inferences about the program impact will be limited.
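As one illustration of how participants might be matched to controls, the following minimal sketch pairs each participant with the closest unused control on a propensity score. It assumes the scores have already been estimated elsewhere; the function name, the member IDs, and the caliper of 0.05 are hypothetical choices made for this example and are not taken from the studies cited.

```python
from typing import Dict, List, Tuple


def match_nearest(treated: Dict[str, float],
                  controls: Dict[str, float],
                  caliper: float = 0.05) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated / controls map a (hypothetical) member ID to a propensity
    score that was estimated beforehand.
    """
    available = dict(controls)  # controls that have not been used yet
    pairs = []
    # Process participants in a fixed order so the result is reproducible.
    for t_id, t_score in sorted(treated.items()):
        if not available:
            break
        # Closest remaining control by absolute difference in score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is matched at most once
    return pairs


# Illustrative scores only.
participants = {"p1": 0.62, "p2": 0.35, "p3": 0.90}
historic = {"c1": 0.60, "c2": 0.33, "c3": 0.10, "c4": 0.70}
matched = match_nearest(participants, historic)
print(matched)
```

Participants with no control inside the caliper are left unmatched, so the number dropped should be reported alongside the baseline characteristics of the matched groups.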

Determining whether an adequate number of individuals were included in the study is the next design feature to review. Four interrelated parameters have an effect on the conclusions that are attained from a typical statistical test64:

  • Sample size, or the number of observations, subjects, or cases under study.
  • Significance level, or alpha. This is the probability of concluding that a difference exists when the observed result is in fact due to chance alone.
  • Power, or the probability that a difference will be detected when one actually exists.
  • Effect size. This is the magnitude of change between 2 groups or within 1 group, before and after the intervention.64

Using this logic, the sample size of the study must be sufficiently large to reduce the effect size necessary to demonstrate statistically significant findings. To put it simply, studies that use large samples require a smaller effect size to show statistical significance. Thus, from an evaluation perspective, DM programs should strive to enroll as many participants as possible and identify an equal or larger number of controls.64 Similarly, analyses on subgroups can be carried out only if their sample size is sufficiently large, irrespective of the overall study population size. Study reports should identify the sample size for treatment and control groups as well as for any subgroups analyzed. Significance levels for all findings should be clearly reported.
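The relationship among these 4 parameters can be sketched with a standard normal-approximation formula for comparing 2 group means. This is an illustration only: the function name is hypothetical, the effect size is assumed to be standardized (difference in means divided by the common standard deviation), and a formal power analysis should be matched to the actual statistical test used.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects needed per group for a two-sided, two-sample
    comparison of means, using the normal approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = z.inv_cdf(power)           # quantile corresponding to the desired power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Smaller effects require larger samples at the same alpha and power.
for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```

Under these assumptions, halving the effect size roughly quadruples the required sample, which is why subgroup analyses often lack the power of the overall comparison.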

Study duration is the next important factor in DM program evaluations and is interrelated with 2 other elements that impact the validity of the findings: dose/response and loss to follow-up (attrition). It is generally agreed that it takes at least 6 months after DM program commencement until behavioral changes begin to take effect (dose/response). Therefore, significant changes in healthcare utilization or monetary outcomes may not be realized within the first year. Studies reporting immense decreases in utilization and costs in a short-duration study (less than 1 year) must be viewed with suspicion (especially if the study does not include a control group, or if the cohorts are not comparable at baseline). The most likely bias in this scenario is regression to the mean.61
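Regression to the mean can be demonstrated with a small simulation in which no intervention is applied at all. The sketch below is purely illustrative: the cost distribution, the noise level, and the rule of enrolling the top 10% of members by baseline cost are assumptions chosen for the example, not figures from any study.

```python
import random
from statistics import mean


def simulate_rtm(n: int = 10000, top_fraction: float = 0.10, seed: int = 1):
    """Return (baseline mean, follow-up mean) for members selected on
    high baseline cost when nothing is done to them."""
    rng = random.Random(seed)
    # Each member has a stable underlying cost level plus independent
    # year-to-year noise.
    true_level = [rng.gauss(5000, 1000) for _ in range(n)]
    year1 = [t + rng.gauss(0, 2000) for t in true_level]
    year2 = [t + rng.gauss(0, 2000) for t in true_level]
    # "Enroll" the members with the highest year-1 costs.
    cutoff = sorted(year1, reverse=True)[int(n * top_fraction) - 1]
    enrolled = [i for i in range(n) if year1[i] >= cutoff]
    return (mean(year1[i] for i in enrolled),
            mean(year2[i] for i in enrolled))


pre, post = simulate_rtm()
print(f"baseline mean: {pre:.0f}, follow-up mean: {post:.0f}")
```

The follow-up mean of the selected group falls toward the population average even though no program exists in the simulation, which is the pattern a pre-post design without a control group can mistake for a treatment effect.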

Attrition from a DM program via disenrollment inarguably impacts results negatively. Participants who do not achieve the maximum benefit from the intervention (eg, improved self-management of their disease, improved knowledge of how to access appropriate health services)65 may continue to exhibit behaviors that run contrary to the program objective. Therefore, it is imperative that studies include a description of the population that did not complete the prescribed length/amount of treatment. Two methods that can be used to adjust for attrition are survival analysis66 and time-series regression.67
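To show how survival analysis accommodates members who leave before the study ends, the following is a minimal sketch of the Kaplan-Meier estimator, in which disenrolled members are treated as censored rather than dropped. The data, variable names, and the choice of hospitalization as the event are hypothetical.

```python
from typing import List, Tuple


def kaplan_meier(durations: List[float], observed: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    observed[i] is True if the event (eg, a hospitalization) occurred at
    durations[i], and False if the member was censored (eg, disenrolled).
    """
    survival = 1.0
    curve = []
    event_times = sorted({d for d, e in zip(durations, observed) if e})
    for t in event_times:
        # Members still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if e and d == t)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Months of follow-up and whether the event was observed (illustrative).
months = [2, 3, 3, 5, 8, 8, 12, 12]
event = [True, True, False, True, False, True, False, False]
curve = kaplan_meier(months, event)
for t, s in curve:
    print(t, round(s, 3))
```

Censored members contribute to the at-risk count up to the month they leave, so their partial follow-up is used rather than discarded.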

The next important, yet often overlooked, aspect of DM program evaluations is the intervention itself. It is mostly assumed that the treatment is robust, and that any change noted in the outcomes is causally linked to that treatment. However, rarely is the intervention described in enough detail to allow readers to decide for themselves if there is sufficient evidence to draw this conclusion.65 Moreover, specific outcome measures directly related to that intervention should be included. For example, if psychosocial models are used to change health-related behaviors, then analyses should be performed, and reported, to assess the relative change in those behaviors. Without such information, the reader is left to question the causal impact of those interventions.

A treatment effect may or may not be evidenced, depending on the choice of outcomes. Most often in DM program evaluations medical cost is chosen as the primary end point. However, cost is an ill-advised outcome variable because it is influenced by changes in the unit cost of services, members' financial share of the medical expense, introduction of new technologies, and so forth, all of which are variables outside of the DM program intervention.61 It is for this reason that disease-specific utilization measures should be used as indicators of program success.68,69 Although rising costs may be due to many uncontrolled-for variables, a decrease in utilization is more likely evidence of a DM program's intervention. By measuring the specific utilization variables that a DM program intends to impact directly, the evaluation should draw the appropriate conclusions from the data analysis.

The final study design element for consideration in RCTs only is blinding the patient, provider, and analyst to group assignment. Blinding alone eliminates the introduction of several biases that may invalidate the study findings. Failure to use blinding in RCTs has been shown to overstate treatment effects.58 This issue is not relevant to observational studies since researchers have little control over program participation.


The second major area related to the quality of evaluation or research findings is the rigor and applicability of the data analysis. Accuracy of data sources is the first point of concern. Most DM programs rely on large administrative databases (medical claims and membership files) for retrieving information on diagnostic measures to identify suitable participants, baseline characteristics, quality indicators, and utilization and cost values. These data sources are notoriously inaccurate.70 The influence of data inaccuracy on outcomes can be decisive. For example, in one study comparing the ability to predict mortality after coronary artery bypass surgery, the predictive ability based on data derived from medical records was significantly better than that based on administrative data.71 Therefore, a description of how validation of data accuracy was accomplished must be presented in studies that rely on administrative data for any aspect of the research endeavor.

Next, the group on which the analysis was performed should be clearly identified. In RCTs, it is common to assess outcomes of all participants assigned to a given cohort, as opposed to evaluating outcomes only of those who received the treatment. The former is called the intent-to-treat (ITT) analysis, and the latter is referred to as a treatment-received (TR) analysis. The ITT analysis preserves the value of randomization (by equally distributing observed and unobserved covariates between the cohorts); however, causal inferences can be made only about the effects of being assigned to a given treatment, not receipt of that treatment. This method is useful on a policy level, where forecasts of outcomes can be made assuming the program will be implemented on a large-scale basis.53 In DM programs, individuals self-select to participate in the program and thereby limit the analysis to the TR method. Predictive risk-adjusted models should be used to improve the process by which suitable participants are identified, while establishing a means to provide a more accurate description of eligible individuals. If the tool has high sensitivity (accurately identifying people who meet the eligibility criteria) and specificity (accurately identifying people who do not meet the eligibility criteria), the researcher may feel more confident in using the ITT method for comparing outcomes between cohorts.
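Sensitivity and specificity of an identification tool can be computed from a simple cross-tabulation of the tool's result against a reference standard such as chart review. The counts below are invented for illustration, and the function name is hypothetical.

```python
def sensitivity_specificity(tp: int, fp: int, fn: int, tn: int):
    """Return (sensitivity, specificity) from a 2x2 classification table.

    tp: eligible members the tool flagged      fn: eligible members it missed
    fp: ineligible members the tool flagged    tn: ineligible members it passed over
    """
    sensitivity = tp / (tp + fn)  # share of truly eligible members identified
    specificity = tn / (tn + fp)  # share of truly ineligible members excluded
    return sensitivity, specificity


# Illustrative counts: 200 truly eligible and 800 truly ineligible members.
sens, spec = sensitivity_specificity(tp=180, fp=40, fn=20, tn=760)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```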

The final step in reviewing the soundness of a study's data analysis is consideration of the application of statistics. A comprehensive discussion about the types of statistical models and analyses required to evaluate program effectiveness is unfortunately beyond the scope of this paper. However, 2 books, An Introduction to Medical Statistics and Medical Statistics: A Commonsense Approach, provide a good introduction to medical statistics for the interested reader with a basic understanding of the field.72,73 In DM outcome studies, multiple regression analysis almost always is required to estimate the independent effect of covariates on the outcome and to test whether the model provides additional prognostic value. These models also form the basis for most risk-adjustment tools. Two important variables that should be included in any comparative analysis are severity and case-mix adjustment.74 Especially in pre-post designs, tracking the population's case mix and severity level over the course of the study will assist in determining whether the program had a treatment effect or whether population dynamics influenced outcomes. Several diagnostic groupers can be readily used for this purpose (eg, diagnosis-related groups, ambulatory care groups), as well as simpler methods such as counts of comorbid conditions. These variables should be included in the regression model as adjusters in the assessment of a treatment effect.





Actual P values and/or 95% confidence intervals should be stated for each outcome variable. While this statement may appear superfluous, many studies either do not include any levels of significance, or they provide inexact measurements. For example, while the general consensus is to report significant P values at <.05, several studies report values at <.10. This is potentially misleading to the inattentive reader, who may draw the wrong conclusions based on these values. Similarly, many journals require that researchers report exact P values instead of "NS" (nonsignificant). This allows readers to decide for themselves how much stock to put in that actual value, as opposed to a predetermination by the authors on their behalf.

Confidence intervals give an estimated range of values within which the unknown population parameter may lie. Using the mean as an example, we can calculate, based on the sample data, an estimated range of values within which we believe (with a given level of confidence) that the population mean may exist. The width of the confidence interval generally gives us some insight as to the accuracy of the estimate. A wide interval may indicate large variability in the dataset, or may be a result of having a very small number of study participants. In cases where parametric statistics cannot provide confidence intervals, bootstrapping is a viable and suggested option.75 When the outcome variable is dichotomous (eg, yes/no, 0/1), the proportion of individuals with the outcome should be provided, along with the associated odds ratios. An additional measure that can be used to assess the effect of introducing the intervention is called the "number needed to treat" (NNT).76 The NNT provides an estimate of the number of patients that must be treated to prevent 1 adverse outcome. While not widely used, this may be a very suitable measure for assessing DM program effectiveness.
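The measures described above can be sketched in a few lines. The example below shows a percentile bootstrap confidence interval for a mean, an odds ratio, and the NNT computed as the reciprocal of the absolute risk reduction. The percentile method is only one of several bootstrap variants and is an assumption here, as are the function names and all counts and values.

```python
import random
from statistics import mean


def bootstrap_ci(values, stat=mean, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = random.Random(seed)
    n = len(values)
    # Recompute the statistic on n_boot resamples drawn with replacement.
    estimates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return estimates[lo_idx], estimates[hi_idx]


def odds_ratio(events_t, n_t, events_c, n_c):
    """Odds of the outcome in the treated group relative to the controls."""
    return (events_t / (n_t - events_t)) / (events_c / (n_c - events_c))


def number_needed_to_treat(events_t, n_t, events_c, n_c):
    """NNT = 1 / absolute risk reduction (control risk minus treated risk)."""
    arr = events_c / n_c - events_t / n_t
    return 1 / arr


# Illustrative data: admissions per member, and 30 of 200 participants
# hospitalized versus 50 of 200 controls.
admissions = [12, 15, 9, 20, 14, 11, 17, 13, 16, 10]
low, high = bootstrap_ci(admissions)
print("bootstrap 95% CI for the mean:", low, high)
print("odds ratio:", round(odds_ratio(30, 200, 50, 200), 3))
print("NNT:", round(number_needed_to_treat(30, 200, 50, 200), 1))
```

By convention the NNT is rounded up to the next whole patient when reported, and it is undefined when the 2 groups have the same risk.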

Conceptually, the basic premise of a sensitivity analysis is that subjects in observational studies differ from those in RCTs in their recruitment to the treatment group. Although all individuals in an RCT have a 50/50 chance of being assigned to the treatment group, observational studies are limited by self-selection bias. Sensitivity analysis therefore provides an estimate for how far this bias must diverge from the 50/50 split of an RCT to raise concerns about the validity of the study findings (A. Linden, J. Adams, N. Roberts, unpublished data, 2004). Observational studies that fail to include a sensitivity analysis inhibit the reader's ability to judge the strength of the evidence that supports a treatment effect.
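The paper does not specify a particular sensitivity analysis method. One commonly used approach that fits this description is a Rosenbaum-style bound for matched pairs, sketched below for the sign test; it is offered as an assumption about what such an analysis could look like, not as the authors' method. The parameter gamma expresses how far the odds of receiving treatment may depart from the 50/50 split within a matched pair, and the pair counts are invented.

```python
from math import comb


def sign_test_upper_p(n_pairs: int, n_favor_treated: int, gamma: float) -> float:
    """Upper bound on the one-sided sign-test P value for matched pairs
    when hidden bias may multiply the odds of treatment by up to gamma.

    n_pairs counts only pairs whose outcomes differ; gamma = 1 corresponds
    to the 50/50 assignment of an RCT.
    """
    p_plus = gamma / (1 + gamma)  # worst-case chance a pair favors treatment
    return sum(
        comb(n_pairs, k) * p_plus ** k * (1 - p_plus) ** (n_pairs - k)
        for k in range(n_favor_treated, n_pairs + 1)
    )


# Illustrative: in 70 of 100 discordant pairs the participant did better.
for gamma in (1.0, 1.5, 2.0, 3.0):
    print(gamma, round(sign_test_upper_p(100, 70, gamma), 4))
```

The smallest gamma at which the bound rises above the chosen significance level indicates how much hidden self-selection bias would be needed to explain away the apparent treatment effect.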

The presentation of data analyses performed is essential to any research, whether it be an RCT or observational study. Two basic tables should be commonplace in any paper. These are (1) a display of baseline characteristics of the groups under comparison and (2) outputs from statistical analyses, including model parameters and estimates.


Table 1 presents a modified table from an article by Linden et al63 in which participants in a congestive-heart-failure program were compared with the entire unmanaged congestive-heart-failure population and with a control group matched on propensity score. Included in the table are the major elements discussed in this paper. Baseline characteristics are presented above the dotted line, and outcome measures are shown below it. Sample sizes are noted, as well as group means and standard errors. P values are noted for each pairwise comparison. Although this table is meant for illustrative purposes only, it serves as a basic template for presenting comparison group characteristics in a clear and concise manner.


Table 2 presents results from a Cox-regression survival analysis by Linden et al66 in which age and sex appear to be significant predictors of hospitalization. Each unit increase in a patient's age was expected to increase the risk of hospitalization by 2.6%, while being female reduced the risk of hospitalization by nearly 8%. Also presented are P values and 95% confidence intervals. Regardless of the statistical model used in the data analysis, tables with a similar structure should be presented to the reader.
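Because a Cox model is multiplicative, a per-unit effect compounds over larger differences in the covariate. The short sketch below illustrates this using the 2.6% figure quoted above; it assumes that age is measured in years and that the effect is constant across the age range, and the function name is hypothetical.

```python
from math import exp, log


def hazard_ratio_for_change(hr_per_unit: float, units: float) -> float:
    """Hazard ratio implied by a multi-unit change in a covariate,
    given the per-unit hazard ratio from a Cox model."""
    beta = log(hr_per_unit)  # Cox coefficient for a 1-unit change
    return exp(beta * units)


# A 2.6% increase in risk per year of age, compounded over 10 years.
hr_10_years = hazard_ratio_for_change(1.026, 10)
print(round(hr_10_years, 3))
```

Under these assumptions a patient 10 years older would carry roughly a 29% higher risk of hospitalization, rather than the 26% that simple addition would suggest.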


This paper has provided in some detail a comprehensive guide to the reporting of DM outcomes, including important elements of both study design and data analysis. The information presented herein should be used as an educational tool to enable readers of the DM literature to independently assess the quality of the research findings presented in the lay press and the peer-reviewed literature. This guide also should be used by DM researchers in developing DM evaluation plans and reporting findings. Raising the standards by which DM program outcomes are evaluated should result in improved quality of peer-reviewed and lay publications on the subject, and the healthcare community's confidence in the veracity of these reports.

From Linden Consulting Group, Portland, Ore, and Oregon Health Science University, School of Medicine, Department of Preventive Health/Preventive Medicine, Portland, Ore (AL); and the Providence Health System, Portland, Ore (NP).

Address correspondence to: Ariel Linden, DrPH, President, Linden Consulting Group, 6208 NE Chestnut St, Hillsboro, OR 97124. E-mail: ariellinden@yahoo.com.

1. Nurse-driven CHF program cuts hospitalization by 87%. Healthc Demand Dis Manag. 1997;5:78-80.

2. Humana CHF program cuts costs, admissions. Healthc Benchmarks. 1998;5:173-175.

3. Web-based educational effort for CHF patients boost outcomes while cutting costs. Dis Manage Advisor. 2001;7:92-96.

4. Costs drop 37% in Colorado Medicaid asthma DM pilot. Dis Manage News. 2004;9(8):1, 4, 5.

5. Asthma program targets patient and physician compliance; wins first DM excellence award. Healthc Demand Dis Manag. 1998;4(12 suppl 1-4):181-184.

6. Rothman R, Malone R, Bryant B, Horlen C, Pignone M. Pharmacist-led, primary care-based disease management improves hemoglobin A1c in high-risk patients with diabetes. Am J Med Qual. March-April 2003;18:51-58.

7. Ibrahim IA, Beich J, Sidorov J, Gabbay R, Yu L. Measuring outcomes of type 2 diabetes disease management program in an HMO setting. South Med J. 2002;95:78-87.

8. Erdman DM, Cook CB, Greenlund KJ, et al. The impact of outpatient diabetes management on serum lipids in urban African-Americans with type 2 diabetes. Diabetes Care. 2002;25:9-15.

9. Whellan DJ, Gaulden L, Gattis WA, et al. The benefit of implementing a heart failure disease management program. Arch Intern Med. 2001;161:2223-2228.

10. Hershberger RE, Ni H, Nauman DJ, et al. Prospective evaluation of an outpatient heart failure management program. J Card Fail. March 2001;7:64-74.

11. Baillargeon JP, Lepage S, Larrivee L, Roy MA, Landry S, Maheux P. Intensive surveillance and treatment of dyslipidemia in the postinfarct patient: evaluation of a nurse-oriented management approach. Can J Cardiol. 2001;17:169-175.

12. Sidorov J, Gabbay R, Harris R, et al. Disease management for diabetes mellitus: impact on hemoglobin A1c. Am J Manag Care. 2000;6:1217-1226.

13. Jowers JR, Schwartz AL, Tinkelman DG, et al. Disease management program improves asthma outcomes. Am J Manag Care. 2000;6:585-592.

14. Villagra A, Ahmed T. Effectiveness of a disease management program for patients with diabetes. Health Aff. 2004;23:255-266.

15. Johansen LW, Bjorndal A, Flottorp S, Grotting T, Oxman AD. Evaluation of health information in newspapers and brochures. What can we believe? Tidsskr Nor Laegeforen. January 20, 1996;116(2):260-264.

16. Cardium claims big savings, DM observers have doubts. Dis Manage News. 2002;8(3):1, 5, 6.

17. Critics question CO Medicaid's asthma claims. Dis Manage News. 2004;9(9):2, 3, 6.

18. Wilson T, Linden A. Measuring diabetes management [letter]. Health Aff. 2004;23:7, 8.


19. Oxman A, Sackett DL, Guyatt GH. Users' guides to the medical literature, I:how to get started. 1993;270:2093-2095.


20. Guyatt GH, Sackett DL, Cook DJ. Users' guides to the medical literature, II:how to use an article about therapy or prevention. A. Are the results of the studyvalid? 1993;270:2598-2601.


21. Guyatt GH, Sackett DL, Cook DJ. Users' guides to the medical literature, II:how to use an article about therapy or prevention. B. What were the results andwill they help me in caring for my patients? 1994;271:59-63.


22. Jaeschke R, Guyatt G, Sackett DL. Users' guides to the medical literature, III:how to use an article about a diagnostic test. A. Are the results of the study valid?1994;271:389-391.


23. Jaeschke R, Gordon H, Guyatt G, Sackett DL. Users' guides to the medicalliterature, III: how to use an article about a diagnostic test. B. What are the resultsand will they help me in caring for my patients? 1994;271:703-707.


24. Levine M, Walter S, Lee H, Haines T, Holbrook A, Moyer V. Users' guides tothe medical literature, IV: how to use an article about harm. 1994;271:1615-1619.


25. Laupacis A, Wells G, Richardson S, Tugwell P. Users' guides to the medicalliterature, V: how to use an article about prognosis. 1994;272:234-237.


26. Oxman AD, Cook DJ, Guyatt GH. Users' guides to the medical literature, VI:how to use an overview. Evidence-Based Medicine Working Group. 1994;272:1367-1371.


27. Richardson WS, Detsky AS. Users' guides to the medical literature, VII: howto use a clinical decision analysis. A. Are the results of the study valid? 1995;273:1292-1295.


28. Richardson WS, Detsky AS. Users' guides to the medical literature, VII: howto use a clinical decision analysis. B. What are the results and will they help me incaring for my patients? 1995;273:1610-1613.


29. Hayward RSA, Wilson MC, Tunis SR, Bass EB, Guyatt G. Users' guides to themedical literature, VIII: how to use clinical practice guidelines. A. Are the recommendationsvalid? 1995;274:570-574.


30. Wilson MC, Hayward RSA, Tunis SR, Bass EB, Guyatt G. Users' guides to themedical literature, VIII: how to use clinical practice guidelines. B. What are therecommendations and will they help you in caring for your patients? 1995;274:1630-1632.


31. Guyatt GH, Sackett DL, Sinclair JC, et al. Users' guides to the medical literature,IX: a method for grading health care recommendations. 1995;274:1800-1804.


32. Naylor CD, Guyatt GH. Users' guides to the medical literature, X: how to usean article reporting variations in the outcomes of health services. Evidence-BasedMedicine Working Group. 1996;275:554-558.


33. Naylor CD, Guyatt GH. Users' guides to the medical literature, XI: how to usean article about a clinical utilization review. Evidence-Based Medicine WorkingGroup. 1996;275:1435-1439.


34. Guyatt GH, Naylor CD, Juniper E, et al. Users' guides to the medical literature,XII: how to use articles about health-related quality of life. Evidence-BasedMedicine Working Group. 1997;277:1232-1237.


35. Drummond MF, Richardson WS, O'Brien BJ, Levine M, Heyland D. Users'guides to the medical literature, XIII: how to use an article on economic analysisof clinical practice. A. Are the results of the study valid? Evidence-Based MedicineWorking Group. 1997;277:1552-1557.



36. O'Brien BJ, Heyland D, Richardson WS, Levine M, Drummond MF. Users'guides to the medical literature, XIII: how to use an article on economic analysisof clinical practice. B. What are the results and will they help me in caring for mypatients? Evidence-Based Medicine Working Group [published erratum appears in1997;278:1064]. 1997;277:1802-1806.


37. Dans AL, Dans LF, Guyatt GH, Richardson S. Users' guides to the medical literature,XIV: how to decide on the applicability of clinical trial results to yourpatient. Evidence-Based Medicine Working Group. 1998;279:545-549.


38. Richardson WS, Wilson MC, Guyatt GH, Cook DJ, Nishikawa J. Users'guides to the medical literature, XV: how to use an article about disease probabilityfor differential diagnosis. 1999;281:1214-1219.


39. Guyatt GH, Sinclair J, Cook DJ, Glasziou P. Users' guides to the medical literature,XVI: how to use a treatment recommendation. 1999;281:1836-1843.


40. Barratt A, Irwig L, Glasziou P, et al. Users guide to medical literature, XVII:how to use guidelines and recommendations about screening. 1999;281:2029-2034.


41. Randolph AG, Haynes RB, Wyatt JC, Cook DJ, Guyatt GH. Users' guide tomedical literature, XVIII: how to use an article evaluating the clinical impact of acomputer-based clinical decision support system. 1999;282:67-74.


42. Bucher HC, Guyatt GH, Cook DJ, Holbrook A, McAlister FA. Users' guidesto the medical literature, XIX: applying clinical trial results. A. How to use an articlemeasuring the effect of an intervention on surrogate end points. 1999;282:771-778.


43. McAlister FA, Laupacis A, Wells GA, Sackett DL. Users' guides to the medicalliterature, XIX: applying clinical trial results. B. Guidelines for determiningwhether a drug is exerting (more than) a class effect. 1999;282:1371-1377.


44. McAlister FA, Straus SE, Guyatt GH, Haynes RB. Users' guides to the medicalliterature, XX: integrating research evidence with the care of the individual patient.2000;283:2829-2836.


45. Hunt DL, Jaeschke R, McKibbon KA. Users' guides to the medical literature,XXI: using electronic health information resources in evidence-based practice.Evidence-Based Medicine Working Group. 2000;283:1875-1879.


46. McGinn TG, Guyatt GH, Wyer PC, Naylor CD, Stiell IG, Richardson WS.Users' guides to the medical literature, XXII: how to use articles about clinicaldecision rules. 2000;284:79-84.


47. Giacomini MK, Cook DJ. Users' guides to the medical literature, XXIII: qualitative research in health care. A. Are the results of the study valid? JAMA. 2000;284:357-362.


48. Giacomini MK, Cook DJ. Users' guides to the medical literature, XXIII: qualitative research in health care. B. What are the results and how do they help me care for my patients? JAMA. 2000;284:478-482.


49. Richardson WS, Wilson MC, Williams JW, Moyer VA, Naylor CD. Users' guides to the medical literature, XXIV: how to use an article on the clinical manifestations of disease. JAMA. 2000;284:869-875.


50. Guyatt GH, Haynes RB, Jaeschke RZ, et al. Users' guides to the medical literature, XXV: evidence-based medicine: principles for applying the users' guides to patient care. JAMA. 2000;284:1290-1296.

51. Campbell DT, Stanley JC. Experimental and Quasi-experimental Designs for Research. Chicago, Ill: Rand McNally; 1966.

52. Cook TD, Campbell DT. Quasi-experimentation: Design and Analysis Issues for Field Settings. Chicago, Ill: Rand McNally College Publishing; 1979.

53. Shadish SR, Cook TD, Campbell DT. Experimental and Quasi-experimental Designs for Generalized Causal Inference. Boston, Mass: Houghton Mifflin; 2002.

54. Wilson T, MacDowell M. Framework for assessing causality in disease management programs: principles. Dis Manag. 2003;6:143-158.

55. Healy B. The Yentl syndrome. N Engl J Med. 1991;325:274-277.

56. Linden A, Adams J, Roberts N. The generalizability of disease management program results: getting from here to there. Manag Care Interface. July 2004:38-45.


57. Schulz KF, Chalmers I, Grimes DA, Altman D. Assessing the quality of randomization from reports of controlled trials published in the obstetrics and gynecology journals. JAMA. 1994;272:125-128.


58. Schulz KF, Chalmers I, Hayes RJ, Altman D. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA. 1995;273:408-412.

59. Andersen RM. Behavioral Model of Families: Use of Health Services. Chicago, Ill: Center for Health Administration Studies, University of Chicago; 1968. Research Series No. 25.

60. Aday L, Andersen RM. Equity in access to medical care: realized and potential. Med Care. 1981;19(12 suppl):4-27.

61. Linden A, Adams J, Roberts N. An assessment of the total population approach for evaluating disease management program effectiveness. Dis Manag. 2003;6:93-102.

62. American Healthways and the Johns Hopkins Consensus Conference. Consensus report: standard outcome metrics and evaluation methodology for disease management programs. Dis Manag. 2003;6:121-138.

63. Linden A, Adams J, Roberts N. Using propensity scores to construct comparable control groups for disease management program evaluation. Dis Manag Health. In press.

64. Linden A, Adams J, Roberts N. Using an empirical method for establishing clinical outcome targets in disease management programs. Dis Manag. 2004;7:93-101.

65. Linden A, Roberts N. Disease management interventions: what's in the black box? Dis Manag. 2004;7(4):275-291.

66. Linden A, Adams J, Roberts N. Evaluating disease management program effectiveness: an introduction to survival analysis. Dis Manag. 2004;7:180-190.

67. Linden A, Adams J, Roberts N. Evaluating disease management program effectiveness adjusting for enrollment (tenure) and seasonality. Res Healthc Financ. 2004;9(1):57-68.

68. Linden A, Adams J, Roberts N. Evaluation Methods in Disease Management: Determining Program Effectiveness. Washington, DC: Disease Management Association of America; October 2003. Position paper.

69. Linden A, Adams J, Roberts N. Evaluating disease management program effectiveness: an introduction to time series analysis. Dis Manag. 2003;6:243-255.

70. Jollis JG, Ancukiewicz M, Delong ER, Pryor DB, Muhlbaier LH, Mark DB. Discordance of databases designed for claims payment versus clinical information systems: implications for outcomes research. Ann Intern Med. 1993;119:844-850.

71. Hannan EL, Kilburn H Jr, Lindsey ML, Lewis R. Clinical versus administrative data bases for CABG surgery. Does it matter? Med Care. 1992;30:892-907.

72. Bland M. An Introduction to Medical Statistics. 2nd ed. Oxford, UK: Oxford Medical Publications; 1995.

73. Campbell MJ, Machin D. Medical Statistics: A Commonsense Approach. 2nd ed. Chichester, UK: John Wiley and Sons; 1994.

74. Linden A. A Risk-Adjusted Method for Bed-Day Reporting at CareAmerica Health Plans [dissertation]. University of California, Los Angeles; 1997.

75. Linden A, Adams J, Roberts N. Evaluating disease management program effectiveness: an introduction to the bootstrap technique. Dis Manag Health. In press.


76. Cook RJ, Sackett DL. The number needed to treat: a clinically useful measure of treatment effect. BMJ. 1995;310:452-454.