Variations of Physician Group Profiling Indicators for Asthma Care

January 15, 2005
I-Chan Huang, PhD

Gregory B. Diette, MD, MHS

Francesca Dominici, PhD

Constantine Frangakis, PhD

Albert W. Wu, MD, MPH

Volume 11, Issue 1

Objective: To determine how much of the variation in physician group profiling for asthma care can be attributed to physician groups and how reliable those profiling indicators are.

Study Design: Cross-sectional study. Variations attributable to physician groups are presented using the intraclass correlation coefficient (ICC). The reliability of profiling results was determined using the ICC and sample size of the physician group.

Participants and Settings: Between July 1998 and February 1999, patients with asthma from 20 California physician groups were randomly selected to be surveyed; 2515 patients responded.

Main Outcome Measures: Quality indicators for physician group profiling were (1) National Asthma Education and Prevention Program guideline-based processes of care, including accessibility of asthma care, self-management knowledge about asthma care, use of inhaled bronchodilators, and use of inhaled corticosteroids, and (2) patient outcomes, including satisfaction with asthma care, improvement in health status, and emergency department visits and hospitalizations attributable to asthma.

Results: The variations attributable to physician group were small (< 10%) for process and outcome indicators. For process indicators, self-management knowledge had the highest ICC (9.83%), and use of inhaled bronchodilators had the lowest ICC (3.08%). For outcome indicators, satisfaction with asthma care had the highest ICC (9.53%), and hospitalization had the lowest ICC (1.35%). Despite low ICCs, a large sample size per physician group (n = 126) yielded acceptable reliability (≥ 0.80) for most profiling results.

Conclusions: The selected indicators for profiling asthma care at the physician group level were generally reliable. Sampling a sufficient number of cases is key to achieving useful results from profiling.

(Am J Manag Care. 2005;11:38-44)


Asthma is a common disease, characterized by inflammation of airways and reversible obstruction to airflow, that affects an estimated 14.6 million persons in the United States.1 In response to repeated demonstrations of suboptimal asthma treatment, the National Asthma Education and Prevention Program (NAEPP) Expert Panel published Guidelines for the Diagnosis and Management of Asthma in 1991.2 The guidelines, which were revised in 1997 and 2002, emphasize the importance of patient education and appropriate use of medications.3,4

Leading quality oversight organizations assess performance of asthma care by individual physicians, physician groups, and health plans using NAEPP-based guidelines.5-7 The expectation is that provider profiling can increase provider accountability to improve quality of care, help to control healthcare costs, and guide consumers to high-quality providers.8 To date, most profiling indicators have been selected based on "clinical importance" and represent important processes or outcomes of care. However, little is known about whether actual variations in these indicators are large enough to discriminate among different providers. The amount of variation in provider profiles that can be truly attributed to providers, after adjusting for patient case mix, can be estimated using the intraclass correlation coefficient (ICC).9-12 For judging provider performance, it is useful to have profiling indicators with a high ICC, which implies that indicator scores tend to be similar for patients cared for by the same provider and differ more across providers.

However, the ICC only provides the percentage of variations due to provider effects and therefore cannot be the sole indicator of performance. When the number of patients per practice is small, even if the ICC is high, it is possible that results of profiling may be uninformative. Therefore, it is useful to apply a second indicator, the reliability of profiling, which considers the ICC together with the number of patients sampled from a provider.10,11

There have been few studies of the ICC of profiling indicators. The range of variations attributable to providers varies depending on disease and the selection of an indicator. In general, the variations attributable to providers are small (< 10%).10,13-17 For example, Hofer and colleagues10 assessed physician profiles for the care of patients with type 2 diabetes mellitus and found that the overall variance in hospitalization rates attributable to physician practice was only 1%. Krein et al16 found that the ICCs of diabetes process and outcome indicators at the primary care provider level ranged from 0% to 9%. Sixma and colleagues13 showed that the ICC of patient satisfaction with general practitioners was 5% to 10%. A review by Campbell et al18 suggested that at the individual practice level the ICCs of process indicators were higher than those of outcome indicators.

Fewer studies10,17 have examined the reliability of profiling indicators. Hofer et al10 suggested that the reliability of physician profiles for hospitalization rates for patients with type 2 diabetes mellitus was only 0.17. Poor reliability of profiling results at the individual physician level can be due to small panels of patients.10

This study evaluated how much of the variance of physician group profiling is attributed to physician group effects and how reliable physician group profiling is for process and outcome indicators. If the variation attributable to physician groups and the reliability of profiling results are small, then current profiling practices may need reexamination. We used consistency with asthma guidelines and patient outcomes as performance indicators and the physician group as the unit for profiling.


Study Setting

This study was conducted in conjunction with 20 California physician groups that participated in the 1998 Asthma Outcomes Survey. The Asthma Outcomes Survey was initiated by the Pacific Business Group on Health and by HealthNet to evaluate, improve, and report on the quality of asthma care at the physician group level.19

Although profiling often focuses on individual physicians or entire organizations, experts have suggested that profiling of physician groups may be useful. There is a practical need for managed care organizations to profile physician groups, as patients tend to select plans based on individual physicians or physician groups.20,21 Therefore, profiles at the physician group level could enhance consumer choice. Individual groups may be more receptive to data on their own practice rather than data from the entire organization. Physicians may want to join groups that they believe deliver better care, and managed care organizations want to contract with them.

Sample Selection and Data Collection

Details on sample selection and data collection have been described.19 Briefly, the 20 participating physician groups used administrative materials to identify all managed care patients with at least 1 asthma-related encounter in the outpatient, emergency, or inpatient setting (identified by International Classification of Diseases, Ninth Revision, Clinical Modification code 493.xx) between January 1, 1997, and December 31, 1997. To reduce misclassification of chronic obstructive pulmonary disease as asthma, we restricted our subjects to those younger than 55 years. Patients had to be continuously enrolled in the physician group for that calendar year. From eligible patients, the study randomly selected a sample of 650 patients from each physician group. If a physician group had fewer than 650 eligible patients, then all eligible patients were sampled.

Patient data were collected by mailed survey. The instrument was largely based on the Health Survey for Asthma Patients developed at The Johns Hopkins University for the Outcomes Management System Consortium Asthma Project.22-24 The survey asked about patient characteristics, general health, asthma symptoms, effect of asthma on functioning, asthma medications and treatment, self-management knowledge and activities, access to care, and patient satisfaction. The survey was fielded between July 1998 and February 1999. A total of 2515 responses were obtained, for a response rate of 32.2%.

Performance Indicators

In this study, we selected processes of asthma care and patient outcomes as indicators for publicly reported physician group comparisons. We evaluated the score variability and reliability for those indicators.

Processes of care were assessed by consistency with the NAEPP asthma guidelines, including accessibility of asthma care, self-management knowledge about asthma care, use of inhaled bronchodilators, and use of inhaled corticosteroids. Access to asthma care included accessibility of clinicians by telephone, for appointments, and to get asthma medications. Self-management knowledge measured ability to manage asthma flares, appropriately adjust asthma medication, and identify asthma triggers. For medication use, the NAEPP guidelines advocate inhaled corticosteroids as the most consistently effective long-term control medication and recommend inhaled bronchodilators (or β2-agonists) as rescue medications.3 In the survey, patients were rated on the number of puffs of inhaled bronchodilators and inhaled corticosteroids used every day. We dichotomized responses for inhaled bronchodilator use into 8 puffs or fewer as "no overuse" and more than 8 puffs as "overuse," and inhaled corticosteroid use into 4 puffs or fewer as "underuse" and more than 4 puffs as "no underuse," based on guidelines and recommendations before the NAEPP guidelines were updated in 2002.23
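The two medication cutoffs described above can be written as a pair of small functions. This is only an illustrative sketch: the function names are hypothetical, and the survey's actual response coding is not given in the text.

```python
def bronchodilator_overuse(puffs_per_day: float) -> bool:
    """Return True for "overuse": more than 8 puffs of inhaled bronchodilator per day."""
    return puffs_per_day > 8


def corticosteroid_underuse(puffs_per_day: float) -> bool:
    """Return True for "underuse": 4 or fewer puffs of inhaled corticosteroid per day."""
    return puffs_per_day <= 4
```

Both cutoffs are inclusive on the lower side, so exactly 8 bronchodilator puffs counts as "no overuse" and exactly 4 corticosteroid puffs counts as "underuse."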

Outcome measures included satisfaction with asthma care during the past week, improvement in health status during the past week, and emergency department visits and hospitalizations attributable to asthma during the past year. We dichotomized responses on patient satisfaction into "greater satisfaction (excellent or very good)" vs "less satisfaction (good, poor, or fair)," improvement in health status into "greater improvement (much better or somewhat better)" vs "less improvement (about the same, somewhat worse, or much worse)," and emergency department visit and hospitalization into "no visit" vs "visits 1 or more times" and "no hospitalization" vs "hospitalizations 1 or more times."

Risk Adjustment

Candidate risk-adjustment variables were collected from the patient survey, including patient age, sex, education level, type of health insurance, asthma severity, number of asthma-related comorbidities, and health status (Medical Outcomes Study 36-Item Short-Form Health Survey physical component score and mental component score). Asthma-related comorbidities included rhinitis, sinusitis, chronic bronchitis, heartburn (gastroesophageal reflux), emphysema, and congestive heart failure. The study measured asthma severity using responses to several questions to approximate the NAEPP's 4 severity strata (mild intermittent, mild persistent, moderate persistent, and severe persistent).3 We measured severity using patients' reports of the frequency of symptoms (cough, sputum, wheezing, chest tightness, and shortness of breath), the frequency of nocturnal symptoms, and the chronicity of symptoms between attacks. Severity was determined by the greatest severity in the responses to any of these questions.19,23
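The "greatest severity" rule can be sketched as taking the maximum over an ordered list of strata. The mapping from each survey question to a stratum is not described here, so the sketch assumes each question has already been classified; `SEVERITY_ORDER` and `overall_severity` are hypothetical names.

```python
# The NAEPP's 4 severity strata, ordered from least to most severe.
SEVERITY_ORDER = [
    "mild intermittent",
    "mild persistent",
    "moderate persistent",
    "severe persistent",
]


def overall_severity(item_levels):
    """Return the most severe stratum among the per-question classifications.

    item_levels: non-empty list of stratum names, one per survey question
    (assumed to be classified upstream; that step is not shown).
    """
    return max(item_levels, key=SEVERITY_ORDER.index)
```

For example, a patient classified as "mild persistent" on daytime symptoms but "severe persistent" on nocturnal symptoms would be assigned "severe persistent" overall.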

Statistical Analysis



χ² and t tests were used to identify bivariate relationships between performance indicators and candidate risk-adjustment variables. We selected risk-adjustment variables that were statistically significant (P < .05) for inclusion in multivariate risk-adjustment models. We included all asthma patients to calculate the ICCs of profiling indicators. However, based on recommendations of the 1997 NAEPP guidelines, we only included asthma patients who had moderate persistent and severe persistent severity for the inhaled corticosteroid use indicator.3

We used Bayesian hierarchical modeling (HM) to quantify variations of performance indicators across the 20 physician groups that were attributed to physician groups. The use of Bayesian HM is regarded as a more appropriate approach than conventional approaches, as it takes into account the statistical uncertainty of each group-specific performance and the natural heterogeneity of the true group-specific performances, a key source of uncertainty of these analyses.25 The major advantage of HM is that it allows us to assess physician group performance by quantifying random intercepts of logistic regressions at the patient level.25,26 Most important, HM can appropriately partition variations of performance measures across physician groups into between-physician group variability and within-physician group variability, and the variance estimates can then be used to produce "shrunken" estimates that are better estimates of the group effects.12,27 Estimates for groups with small case numbers are more likely to shrink toward the grand mean than those for groups with large case numbers.
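The shrinkage behavior described above can be illustrated with a deliberately simplified sketch. It is not the Bayesian logistic model used in the study: it assumes a linear random-intercept setting in which the shrinkage weight for a group of size n is n × ICC / (1 + (n − 1) × ICC). The function names and the numbers in the usage note are hypothetical.

```python
def shrinkage_weight(icc: float, n: int) -> float:
    """Weight given to a group's own mean (between 0 and 1).

    Assumes a linear random-intercept model; larger groups and larger
    ICCs give weights closer to 1, which means less shrinkage.
    """
    return n * icc / (1 + (n - 1) * icc)


def shrunken_estimate(raw_mean: float, grand_mean: float, icc: float, n: int) -> float:
    """Pull a group's raw mean toward the grand mean according to its weight."""
    w = shrinkage_weight(icc, n)
    return grand_mean + w * (raw_mean - grand_mean)
```

With an illustrative ICC of 0.05 and a grand mean of 0.60, a raw group mean of 0.80 based on 10 patients is pulled much closer to 0.60 than the same raw mean based on 500 patients.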

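The percentage of variability attributable to physician group effects relative to the overall residual variability can be estimated using the ICC as follows9-12:

$$\mathrm{ICC} = \frac{\sigma^2_{\mathrm{between}}}{\sigma^2_{\mathrm{between}} + \sigma^2_{\mathrm{within}}}$$

where $\sigma^2_{\mathrm{between}}$ is the between-physician group variance and $\sigma^2_{\mathrm{within}}$ is the within-physician group variance.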
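As a minimal sketch, the ICC is the between-group variance divided by the sum of the between-group and within-group variances. The second function below uses the latent-variable convention for logistic random-intercept models, which fixes the within-group variance at π²/3. That convention is an assumption made here for illustration and is not necessarily the calculation the study used for binary indicators. All names are hypothetical.

```python
import math


def icc(between_var: float, within_var: float) -> float:
    """Share of total variance that lies between groups."""
    return between_var / (between_var + within_var)


# Variance of the standard logistic distribution, used as the
# within-group variance under the latent-variable convention.
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def icc_logistic_latent(between_var: float) -> float:
    """ICC for a logistic random-intercept model (latent-variable convention)."""
    return icc(between_var, LOGISTIC_RESIDUAL_VAR)
```

Under this convention, a random-intercept variance of about 0.35 on the logit scale corresponds to an ICC just under 10%.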

We adopted the method by Turner et al28 to calculate ICCs for binary performance indicators. The estimations of ICCs under Bayesian HM were carried out using Markov chain Monte Carlo simulation. We used a uniform prior for the ICC estimation because thus far there is not much information available for the study of provider profiling. Markov chain Monte Carlo simulation comprised a burn-in of 500, followed by a further 5000 iterations, during which the posterior distribution of ICCs was monitored.29

We calculated the reliability by combining the information of the ICC and the mean sample size (n) across the 20 physician groups using the following equation10,11:

$$\mathrm{Reliability} = \frac{n \times \mathrm{ICC}}{1 + (n - 1) \times \mathrm{ICC}}$$
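A short sketch of this calculation follows. It assumes the reliability takes the Spearman-Brown form n × ICC / (1 + (n − 1) × ICC), which is the form consistent with the required sample sizes reported in the Results. The function name is hypothetical.

```python
def reliability(icc: float, n: float) -> float:
    """Reliability of a group-level profile from the ICC and group sample size n.

    Assumes the Spearman-Brown form; icc is a proportion (0.0135, not 1.35).
    """
    return n * icc / (1 + (n - 1) * icc)
```

Reliability equals the ICC when n = 1 and approaches 1 as n grows. Values computed this way from the rounded ICCs and the mean group size of 126 may differ slightly from those in Table 3, for example if the published values account for unequal group sizes.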

Based on the second equation, we can further calculate the required sample size based on the expected reliability of profiling results. To date, there is no agreed standard for judging the reliability of physician group profiling. Most research suggests that the reliability should be 0.80 or better.10 Another source30 suggests that acceptable reliability is at the level of 0.70. Stata 7 (StataCorp LP, College Station, Tex) was used for bivariate analyses, and WinBUGS 1.3 was used for ICC calculation.29


Characteristics of Physician Groups and Respondents

Table 1 gives the characteristics of the participating physician groups. Ten physician groups (50.0%) were classified as medical groups, 7 (35.0%) were independent practice associations, and 3 (15.0%) were foundation or community clinics. All of the 20 were multispecialty groups. Table 2 gives the characteristics of the 2515 asthma patients included in the study. Patients ranged in age from 18 to 56 years (mean ± SD, 39.9 ± 9.5 years), 71.2% were female, 70.3% were white and 5.1% African American, and 81.6% had at least some college education. In terms of clinical characteristics, 14.4% had mild intermittent asthma, 19.2% had mild persistent asthma, 49.3% had moderate persistent asthma, and 17.1% had severe persistent asthma. The mean ± SD number of comorbidities was 2.1 ± 1.4.

Variations in Performance Indicators Attributable to Physician Groups

Table 3 gives the proportion, ICC, and reliability for each performance indicator. The ICCs using Bayesian HM indicated that, for process and outcome indicators, variations attributable to physician groups were small (< 10%). Indicators of guideline consistency demonstrated slightly higher ICCs than outcome indicators. Among indicators of guideline consistency, self-management knowledge about asthma care had the highest ICC (9.83%), while use of inhaled bronchodilators had the lowest ICC (3.08%). Among patient outcome indicators, satisfaction with asthma care had the highest ICC (9.53%), and hospitalization had the lowest ICC (1.35%).

Reliability of Profiling Results

Table 3 gives the reliability of profiling results. In general, the reliability of indicators at the physician group level was acceptable based on the criterion of 0.80 or better. The reliability ranged from 0.60 to 0.92. Indicators of consistency of care with guidelines demonstrated slightly greater reliability than outcome indicators. Self-management knowledge about asthma care had the highest reliability (0.92), and patient satisfaction with asthma care had the second highest reliability (0.91). In contrast, inhaled bronchodilator use and hospitalization were less reliable, at 0.77 and 0.60, respectively.

Sample Size Needed to Achieve Reliable Profiling

The Figure demonstrates the relationship between the reliability of profiling and desired sample size. Given a fixed ICC, the relationship between the reliability of profiling and sample size per physician group was exponential. Indicators with lower ICCs usually require a larger sample size per group than the 126 per group for the present sample to achieve acceptable reliability. If we set the reliability at the levels of 0.70 and 0.80, larger sample sizes (170 and 292, respectively, per group) were needed only for the hospitalization indicator. If we assumed a stringent reliability level of 0.90, larger sample sizes were needed for indicators of accessibility, bronchodilator use, emergency department visits, and hospitalizations (159, 283, 216, and 658, respectively, per group).
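The required sample sizes can be approximated by solving the reliability relationship for n. The sketch below assumes the Spearman-Brown form, reliability = n × ICC / (1 + (n − 1) × ICC), which rearranges to n = R × (1 − ICC) / (ICC × (1 − R)) for a target reliability R. The function name is hypothetical, and the result is left unrounded because the rounding rule behind the published figures is not stated.

```python
def required_sample_size(icc: float, target_reliability: float) -> float:
    """Patients per group needed to reach a target reliability.

    Assumes the Spearman-Brown form of reliability, solved for n.
    Both arguments are proportions strictly between 0 and 1.
    """
    if not 0 < icc < 1:
        raise ValueError("icc must be strictly between 0 and 1")
    if not 0 < target_reliability < 1:
        raise ValueError("target_reliability must be strictly between 0 and 1")
    return target_reliability * (1 - icc) / (icc * (1 - target_reliability))
```

Using the reported ICCs, `required_sample_size(0.0135, 0.80)` for hospitalization and `required_sample_size(0.0308, 0.90)` for inhaled bronchodilator use come out within 1 patient of the 292 and 283 per group reported above.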


Quality-of-care and performance oversight organizations are beginning to use consistency of care with asthma guidelines and patient outcomes as indicators for asthma care. Examining performance indicators of asthma care for 20 California physician groups, we found that variations attributable to physician groups (ie, ICCs) were small (< 10%). Among these indicators, attributable variation was larger for patient self-management of asthma care and satisfaction with asthma care. By contrast, the use of inhaled bronchodilators, emergency department visits, and hospitalizations showed less variation attributable to physician groups. We also demonstrated adequate reliability (≥ 0.80) of profiling indicators for asthma care at the physician group level, except for inhaled bronchodilator use and hospitalization.

Our findings have practical implications for managed care decision makers and organizations engaged in profiling for asthma or other diseases. The results suggest that profiling indicators need to be selected carefully, based not only on clinical significance but also on higher ICCs and greater reliability. Variability in quality indicators is one of the attributes listed as desirable for Health Plan Employer Data and Information Set performance measures. Variability is not the only desirable attribute for performance measures: for example, a measure that elicits uniformly low performance could still be useful to document room for improvement. However, it is important for groups interested in quality assessment to be aware that different indicators have different reliabilities. In considering potential indicators, the reliability can help to determine the feasibility of a given indicator. Our results suggest that self-management knowledge about asthma care and use of inhaled corticosteroids may be good process indicators for physician group profiling, and satisfaction with asthma care and improvement in health status may be useful outcome indicators. Profiles based on inhaled bronchodilator use and hospitalization are less useful.

Our results also reinforce that the reliability of profiling results at the physician group level is much higher than that at the individual physician level because of the larger sample sizes afforded. Using the physician group as the unit of profiling, the present sample size per group (n = 126) provided reliable profiling results (reliability, ≥ 0.80), except for the hospitalization indicator. Because the mean number of patients per individual physician was only 13, the reliability of each profiling indicator would be poor, ranging from 0.14 (hospitalization) to 0.54 (self-management knowledge). Hofer and colleagues10 showed that achieving a reliability of 0.80 for profiling hospitalization and physician visit rates for diabetes care required at least 100 patients per physician group. In this regard, an indicator that requires a sample of approximately 300 patients to be reliable would only be collectable for the largest asthma practices and would generally not be relevant for individual providers.

In interpreting our findings, several potential limitations should be noted. First, there was a low response rate to the patient survey. The response rates, however, were similar across physician groups. The effect of a low response rate on comparisons across physician groups is important if scores used for profiling differ between the respondents and nonrespondents. Although we would have liked to compare the characteristics of respondents and nonrespondents across the 20 physician groups, we did not have patient characteristics for nonrespondents. Second, the variation attributable to physician groups may be underestimated because of unmeasured confounders not included in our models. For example, we did not collect clinical assessments or nonpatient characteristics, such as the supply of physician groups or hospitals in the market. Lack of adjustment for these factors may increase random variation.11,12 On the other hand, the variation attributable to physician groups may be overestimated because we cannot precisely partition overall variations into the physician group level. Based on clustering characteristics among patients, physicians, and physician groups, it is better to partition overall variations into 3 levels. However, the Pacific Business Group on Health project did not collect information for individual physicians, so we could not further partition the variation by individual physician and physician group. Finally, in this study, we intended to identify the differences among groups in which there were important variations attributable to group effects. However, we did not attempt to identify the effect of these variations on outcomes, which is another criterion to judge the usefulness of profiling exercises.

In conclusion, for performance profiling of asthma care across 20 physician groups, the variations attributable to physician groups were small. However, the reliability of profiling results was generally acceptable because of sufficient case numbers at the physician group level. For profiling, we recommend the use of clinically important indicators with high reliability.

From the Departments of Health Policy and Management (I-CH, AWW), Epidemiology (GBD, AWW), and Biostatistics (FD, CF), Bloomberg School of Public Health, and the Department of Medicine, School of Medicine (GBD, AWW), The Johns Hopkins University, Baltimore, Md.

This study was funded by the Pacific Business Group on Health, San Francisco, Calif. This study was presented in part at the 20th Annual Research Meeting of AcademyHealth; Nashville, Tenn; June 27-29, 2003.

Address correspondence to: Albert W. Wu, MD, MPH, Department of Health Policy and Management, Bloomberg School of Public Health, The Johns Hopkins University, 624 North Broadway, Baltimore, MD 21205-1901. E-mail:

1. US Department of Health and Human Services. Action Against Asthma. Bethesda, Md: US Dept of Health and Human Services; 2000.

2. National Heart, Lung, and Blood Institute. National Asthma Education Program: Guidelines for the Diagnosis and Management of Asthma. Bethesda, Md: US Dept of Health and Human Services; 1991.

3. National Heart, Lung, and Blood Institute. Guidelines for the Diagnosis and Management of Asthma: Expert Panel Report No. 2. Bethesda, Md: US Dept of Health and Human Services; 1997.

4. National Heart, Lung, and Blood Institute. Expert Panel Report: Guidelines for the Diagnosis and Management of Asthma: Update on Selected Topics 2002. Bethesda, Md: US Dept of Health and Human Services; 2002.

5. Pacific Business Group on Health Web site. Available at: November 3, 2004.

6. Foundation for Accountability Web site. Available at: Accessed November 3, 2004.

7. National Committee for Quality Assurance. The State of Health Care Quality: 2002. Washington, DC: National Committee for Quality Assurance; 2002.


8. Marshall MN, Shekelle PG, Leatherman S, Brook RH. The public release of performance data: what do we expect to gain? A review of the evidence. JAMA. 2000;283:1866-1874.

9. Gulliford MC, Ukoumunne OC, Chinn S. Components of variance and intraclass correlations for the design of community-based surveys and intervention studies: data from the Health Survey for England 1994. Am J Epidemiol. 1999;149:876-883.


10. Hofer TP, Hayward RA, Greenfield S, Wagner EH, Kaplan SH, Manning WG. The unreliability of individual physician "report cards" for assessing the costs and quality of care of a chronic disease. JAMA. 1999;281:2098-2105.

11. Snijders TAB, Bosker RJ. Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling. London, England: Sage Publications; 1999.

12. Raudenbush SW, Bryk AS. Hierarchical Linear Models: Applications and Data Analysis Methods. Thousand Oaks, Calif: Sage Publications; 2002.

13. Sixma HJ, Spreeuwenberg PM, van der Pasch MA. Patient satisfaction with the general practitioner: a two-level analysis. Med Care. 1998;36:212-229.

14. Davis P, Gribben B, Lay-Yee R, Scott A. How much variation in clinical activity is there between general practitioners? a multi-level analysis of decision-making in primary care. J Health Serv Res Policy. 2002;7:202-208.

15. Greenfield S, Kaplan SH, Kahn R, Ninomiya J, Griffith JL. Profiling care provided by different groups of physicians: effects of patient case-mix (bias) and physician-level clustering on quality assessment results. Ann Intern Med. 2002;136:111-121.

16. Krein SL, Hofer TP, Kerr EA, Hayward RA. Whom should we profile? examining diabetes care practice variation among primary care providers, provider groups, and health care facilities. Health Serv Res. 2002;37:1159-1180.

17. Solomon LS, Zaslavsky AM, Landon BE, Cleary PD. Variation in patient-reported quality among health care organizations. Health Care Financ Rev. 2002;23:85-100.

18. Campbell M, Grimshaw J, Steen N. Sample size calculations for cluster randomised trials: Changing Professional Practice in Europe Group (EU BIOMED II Concerted Action). J Health Serv Res Policy. 2000;5:12-16.


19. Masland M, Wu AW, Diette GB, Dominici F, Skinner EA, Esquivel M. 1998 Asthma Outcomes Survey. San Francisco, Calif: Pacific Business Group on Health; 2000.


20. Epstein AM. Rolling down the runway: the challenges ahead for quality report cards. JAMA. 1998;279:1691-1696.

21. Casalino LP, Devers KJ, Lake TK, Reed M, Stoddard JJ. Benefits of and barriers to large medical group practice in the United States. Arch Intern Med. 2003;163:1958-1964.

22. Steinwachs DM, Wu AW, Skinner EA. How will outcomes management work? Health Aff (Millwood). 1994;13(4):153-162.

23. Diette GB, Wu AW, Skinner EA, et al. Treatment patterns among adult patients with asthma: factors associated with overuse of inhaled β-agonists and underuse of inhaled corticosteroids. Arch Intern Med. 1999;159:2697-2704.

24. Wu AW, Young Y, Skinner EA, et al. Quality of care and outcomes of adults with asthma treated by specialists and generalists in managed care. Arch Intern Med. 2001;161:2554-2560.

25. Christiansen CL, Morris CN. Improving the statistical approach to health care provider profiling. Ann Intern Med. 1997;127:764-768.

26. DeLong ER, Peterson ED, DeLong DM, Muhlbaier LH, Hackett S, Mark DB. Comparing risk-adjustment methods for provider profiling. Stat Med. 1997;16:2645-2664.

27. Cowen ME, Strawderman RL. Quantifying the physician contribution to managed care pharmacy expenses: a random effects approach. Med Care. 2002;40:650-661.

28. Turner RM, Omar RZ, Thompson SG. Bayesian methods of analysis for cluster randomized trials with binary outcome data. Stat Med. 2001;20:453-472.

29. Spiegelhalter D, Thomas A, Best N. WinBUGS 1.3: User Manual. Cambridge, England: MRC Biostatistics Unit, Institute of Public Health; 2000.

30. Nunnally J, Bernstein I. Psychometric Theory. New York, NY: McGraw-Hill Inc; 1994.