Summative Evaluation Results and Lessons Learned From the Aligning Forces for Quality Program

Objective: To report summative evaluation results from the Aligning Forces for Quality (AF4Q) initiative, the Robert Wood Johnson Foundation’s (RWJF’s) signature effort to improve quality of care from 2005 to 2015.

Methods: This was a longitudinal mixed methods program evaluation (ie, multiphase triangulated evaluation) of 16 grantee “alliances” from across the country, funded by RWJF as part of the AF4Q initiative. Grantees were selected in a nonexperimental manner and were charged with deploying interventions in 5 main programmatic areas to improve health and healthcare in their communities.

Results: Except for a small proportion of outcomes, there were no major differences in the rate of longitudinal improvement in AF4Q communities, compared with control communities, on quantitative outcomes related to the Triple Aim. Although the majority of the measures improved in both AF4Q and non-AF4Q communities, there were some exceptions to this improving trend, most noticeably in the cost of care and population health. There was also considerable heterogeneity across communities in terms of programmatic areas and the scale and scope of interventions in these areas. Although a number of AF4Q alliances implemented robust interventions in specific areas, often advancing strategies useful for others in the field, no AF4Q alliance pursued and aligned all 5 AF4Q programmatic areas in a robust way. In addition, whereas all alliances were able to garner the participation of multiple stakeholders initially, sustaining this participation and securing new sources of funding after RWJF support ended proved challenging for many alliances.

Conclusion and Policy and Practice Implications: While the AF4Q program did not attain the ambitious community-level changes predicted by its sponsor at the program’s outset, it did produce pockets of success on some dimensions for particular alliances. A number of factors explain the less-than-expected impact of the AF4Q initiative on community health and the observed variation in alliance sustainability and intervention strength. These include differing acceptance of the AF4Q initiative’s theory of change, variation in the experience and capacity of the alliance communities selected for the program, differences in alliances’ local healthcare market context, and the changing programmatic requirements for alliances participating in the AF4Q initiative. The variation in AF4Q program outcomes offers important lessons for those engaged in regional health improvement work.

Aligning Forces for Quality (AF4Q) was the Robert Wood Johnson Foundation’s (RWJF’s) large and ambitious initiative focused on reforming local health systems in 17 communities by 2015. (RWJF selected 17 communities to participate in the AF4Q initiative over 3 phases of the program. However, one of the grantee communities, Central Indiana, was dropped from the AF4Q program. Thus, we refer to 16 communities for the remainder of the text.) The AF4Q initiative was launched in 2006, before passage of the Affordable Care Act (ACA) in 2010, and was built with community-based multi-stakeholder alliances serving as the “backbone organizations” in what many now describe as “collective-impact” approaches to addressing complex social problems.1 The program included multiple interventions and goals to significantly improve community health, which were developed and revised throughout its nearly decade-long lifespan.

Because of RWJF’s ambitious goals for the AF4Q initiative and the importance of this major community-based health reform effort, there are many stakeholders (eg, policy makers; philanthropic organizations; community-based coalitions, alliances, and multi-stakeholder groups; healthcare providers; healthcare payers; patients and patient advocates; and researchers and evaluators, among others) interested in knowing the impact of this program and whether the initiative as a whole, or specific parts of it, were successful.

Serving as an entryway into the independent, external evaluation team’s detailed findings, this article provides an overview of how we arrived at our approach for studying the impact of the AF4Q initiative (development of the evaluation research design and identification of RWJF’s theory of change for the program), as well as a summary of findings from each component of our assessment of the success of the program. Rather than define the success of the AF4Q initiative based on a single metric, we have approached assessment of the AF4Q program’s success as multidimensional. Findings are presented from multiple levels and perspectives, with the goal of allowing stakeholders to focus on the dimensions of success that are most meaningful to their particular contexts and needs. Barriers to success and key lessons learned from the AF4Q experience are also discussed.

Other articles in this supplement provide greater detail on the background and evolution of the program, the research design of the evaluation, the specific summative assessments of success in each of AF4Q’s 5 main programmatic areas (measurement and reporting of provider performance, consumer engagement [CE], quality improvement [QI], equity, and payment reform), an assessment of the community-level outcomes of the program, and an assessment of how the AF4Q multi-stakeholder alliances were positioned for the future when the program ended in 2015.

AF4Q Evaluation Research Design and Methods

Our evaluation research design was organized within an overarching logic model of the program (described below) and used a multiphase design (formally referred to as a “methodological triangulated design”2), which included sub-projects with independent methodological integrity in each of the AF4Q initiative’s 5 main programmatic areas along with an additional sub-project that focused on the organization and governance of the AF4Q alliances.3 For this nearly decade-long, complex program, we also employed elements of the Realistic Evaluation approach advanced by Pawson and Tilley, which states that “Programs work (have successful ‘outcomes’) only in so far as they introduce the appropriate ideas and opportunities (‘mechanisms’) to groups in the appropriate social and cultural conditions (‘contexts’).”4

Our evaluation, which focused on the development of an empirically based assessment of the final outcomes of the program, consisted of 3 phases. The first was a foundational phase in which the evaluation program logic model was developed, key research questions were identified, and data collection was put into motion. The second phase included systematic monitoring and measurement of program interventions and of program and environmental changes. This work included tracking progress on intermediate outcomes and AF4Q community involvement in the many regionally focused healthcare improvement programs that overlapped with the AF4Q program period (eg, the Chartered Value Exchange project, the Beacon Community Program, the Health Information Technology Extension Program, CMS Innovation Center programs).5 The third was a summative phase in which we, informed by the formative work, moved to answering the following 2 research questions: (1) Was the AF4Q program successful? (2) What lessons were learned from the AF4Q initiative that can inform those interested in improving local healthcare systems and the health of populations residing within these communities?

The AF4Q Initiative’s Theory of Change

The following statements describing the theory of change of the AF4Q initiative were published by RWJF leadership early in the implementation of the program:

“We launched the first phase of Aligning Forces for Quality: The Regional Market Project, a long-term, multi-million dollar commitment, to help a number of test communities re-weave the fabric of their own local health care system into a stronger, more resilient, higher-quality tapestry of care across its fullest continuum. We call it AF4Q. This is not piecemeal, incremental, short-term (and unsuccessful) health system reform as usual. It has no politics or partisanship of its own. If it did, it wouldn’t work and we wouldn’t do it. Rather, it is an unprecedented regionally determined clinical, social and economic market realignment that calls upon enlightened and aspirational local leadership, intentional collaboration, reliance on evidence-based action, public reporting and accountability, and public participation in deciding how quality health care is delivered to the community. AF4Q is a first-of-its-kind effort that is as much a call to community action as it is a potent formula to bring the best possible medical care and peace of mind to as many people and their families as possible.”6

“In June 2008 the Robert Wood Johnson Foundation (RWJF) launched phase II of Aligning Forces for Quality, a long-term, $300 million initial commitment to help up to twenty geographically, economically, and demographically diverse communities reweave the fabric of their health care systems to be stronger, more resilient, and of higher quality across the full continuum of care.”

“The RWJF’s objective is to help the Aligning Forces communities improve the quality of care for everyone in these communities by 2015. If these communities, with widely varying provider and payer systems, racially and ethnically diverse populations, and differing chronic disease rates, can improve care with this concerted focus, then improving quality nationwide is achievable.”7

As the quotes illustrate, the AF4Q initiative was designed to be far-reaching and ambitious, with multiple levels and types of outcomes. In addition, RWJF added components and modified expectations for the established program areas over the course of implementation. Although each of the participating alliances was required to implement activities in all 5 programmatic areas, there was variability in how they approached the work and in how much time and how many resources they dedicated to it.

To better conceptualize the numerous dimensions of success and the mechanisms for achieving those outcomes in RWJF’s theory of change, we developed an AF4Q logic model. We did this through careful review of RWJF documentation about the program and conversations with RWJF leaders. Additionally, the AF4Q logic model was updated over time to reflect changes in the program. It is described in full detail in another article in this supplement which discusses the background, history, and evolution of the AF4Q program.8

The AF4Q logic model can be thought of as containing 4 main components. The first component relates to the creation and development of a functioning multi-stakeholder alliance that identifies priorities for improving health and healthcare in the community and sets strategy for how to accomplish these goals. The second component involves selecting and implementing interventions in the 5 AF4Q programmatic areas and attempting to align these interventions so they complement one another; as specified in the program name, alignment of the various program interventions was originally envisioned to be a key differentiator in the AF4Q initiative. The third component involves measuring the intermediate and long-term community-level outcomes that RWJF hypothesized would stem from the aforementioned interventions, with intermediate outcomes focused more on short-term process changes while long-term outcomes focused on changes in communitywide health and cost outcomes. The fourth component relates to sustaining the capacity for improvement of the health system at the community level and “scaling up” the overall effort to reach the entire population, especially in the context of all of the other changes occurring in the broader healthcare system. Related to this fourth component, the degree to which AF4Q alliances became models for reform for others across the country, an explicit goal of RWJF, is another important dimension to consider when evaluating the success of the AF4Q initiative.

To answer the first research question (Was AF4Q successful?), we begin with the most global components of the logic model: communitywide outcomes, sustainability, and opinions about the degree to which the AF4Q initiative provided models for other communities. We then turn to the implementation and alignment of specific program interventions. Finally, we provide some reasons for the pattern of results that was observed.

Was the AF4Q Program Successful?

Communitywide Outcomes, Sustainability, and Thought Leader Opinions

Intermediate and Long-Term Communitywide Outcomes—Little Impact on Outcomes Relative to Non-AF4Q Communities. The most ambitious aspiration for the AF4Q initiative was that it would result in improvements in community (population) health and healthcare quality measures and yield more value for resources spent on healthcare services within the participating AF4Q communities. To study whether this result was achieved, we selected, early in the evaluation, a broad set of measures (144 in total) that we could follow throughout the life of the AF4Q program, and we grouped them into the 3 Triple Aim categories of better health, better care, and lower costs. We monitored these same measures in both AF4Q and non-AF4Q communities because doing so enabled us to assess whether any improvements (or declines) in measures in AF4Q communities could more assuredly be attributed to participation in the program, as opposed to a more general trend in outcome measure improvement happening across the country. Details about the selection of measures, the data sources for these measures, and our analyses, which utilized a difference-in-differences approach, are outlined in the article by Shi et al in this supplement.9
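The logic of a difference-in-differences comparison can be illustrated with a minimal sketch. This is not the evaluation team's model, which is described by Shi et al; the function name and all measure values below are hypothetical and chosen only to show the arithmetic. The estimate is the change in a measure in AF4Q communities minus the change in comparison communities over the same period.

```python
def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """Return the change in the treated group minus the change in the comparison group."""
    treated_change = treated_post - treated_pre
    control_change = control_post - control_pre
    return treated_change - control_change


# Hypothetical values for a single quality measure (percentage of patients
# receiving a recommended service), before and after the program period.
af4q_pre, af4q_post = 60.0, 68.0
comparison_pre, comparison_post = 58.0, 66.0

estimate = difference_in_differences(af4q_pre, af4q_post, comparison_pre, comparison_post)
print(f"DiD estimate: {estimate:.1f} percentage points")
```

In this made-up example, both groups improve by 8 percentage points, so the estimate is 0: the measure improved, but no faster in AF4Q communities than elsewhere. An actual analysis would also need to account for covariates, multiple time points, and statistical uncertainty.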

Generally speaking, we conclude from our analyses that, for the majority of the measures, there were no major differences in the rate of improvement in the AF4Q communities relative to the rate of improvement in other locations across the United States. It should be noted that the data typically indicated improvement in most (but not all) of these measures over time, with similar improvements occurring over the same period in communities that were not exposed to the AF4Q intervention. The exceptions to this improving trend are found mostly in cost of care and population health. We also examined variation across the 16 AF4Q communities to see if there were any alliances with significant deviations from these average AF4Q effects. Although there were differences in the rates of measure improvement across the 16 AF4Q alliances, for the most part, they did not differ significantly from the trends in comparison communities.

External Thought Leader Perspectives on the Success of the AF4Q Initiative. We asked our national thought leader sample about their perceptions of the impact of the AF4Q initiative. The dominant view was that there was no clear evidence suggesting the large-scale impact RWJF originally envisioned. However, many respondents thought there was some positive impact of the AF4Q initiative, depending on the community and specific programmatic area. A number of respondents also suggested that the lessons generated from the activity and experimentation across the 16 funded alliances were, perhaps, the biggest value of the AF4Q program. The variation in respondents’ views of the impact of the AF4Q initiative may be best illustrated by the following 2 quotes:

“[AF4Q impact is] probably variable, [as] some regions have done very well and probably RWJF support helped catalyze some of that. Other regions were less successful. So, I think perhaps 1 of the contributions is that there was learning about … successful regional collaboration and what some of the challenges could be.”

“I know there are a lot of people who think that Aligning Forces is a waste. But I don’t. I think that this was a worthy…experiment, and I think that…there were lessons learned. I think that there are clearly things that somebody there could probably make a list of what not to do. But I think that’s very important. I really do. I think there is as much value [in] knowing what best practices are and to knowing what you should absolutely avoid. And that certainly could inform what happens elsewhere.”

Each of these respondents emphasized the importance of learning from the AF4Q initiative, both in terms of what “worked” and in terms of challenges encountered and reasons for not achieving success. Learning was, of course, a goal of RWJF, and on one level, there has been quite a bit of information produced and exchanged from this large, almost 10-year AF4Q experiment (eg, materials from the alliances themselves, and publications and documents from the evaluation team, the national program office [NPO], and RWJF’s communications arm). One point raised by some thought leaders and alliance stakeholders was that RWJF’s communications about the AF4Q initiative were focused exclusively on highlighting successes, which seemed to cast the program in only a positive light. Additionally, several respondents felt that the communications efforts for the AF4Q initiative were “too much” and led, in some cases, to the program being “driven by communications.” As a result, some have suggested that RWJF’s communications efforts failed to take full advantage of the lessons learned from less successful or unsuccessful activities, ultimately not capitalizing on the value of learning or achieving the stated goal of informing others across the country about the complexity of executing the AF4Q agenda and its components.

Sustaining Capacity for Community-Level Improvement—Future Uncertain for Nearly Half of AF4Q Alliances. RWJF expected that the AF4Q alliances would become self-sustaining by 2015, presumably because of the clear value their efforts would have created for the communities. All 16 alliances experienced some success in recruiting multiple stakeholders and establishing a structure for planning and decision making for AF4Q initiatives, particularly early in the program. However, many of the alliances were continuing to wrestle with the challenge of sustainability at the end of the program. A substantial number of alliances were at risk for several reasons, including unclear strategic direction after the AF4Q program, inadequate financial support for the alliance’s work, a lack of relevant community leadership, or some combination of these factors. Thus, after more than 8 years of support by RWJF, including significant technical assistance and coaching in sustainability planning, it appears that nearly half of the AF4Q alliances face uncertain futures, and for some, the real possibility of ceasing operations (in fact, 2 of the 16 alliances ceased operating by the end of the program).

The remaining AF4Q alliances will probably continue to operate, although some plan to do so in a form that differs from the “neutral convener,” multi-stakeholder model emphasized during the AF4Q initiative. For example, some alliances are restructuring and shifting their orientation to particular stakeholder groups that control significant resources, such as employers, large healthcare systems, or even the state. Others are attempting to “diversify” by building from their AF4Q experience in areas such as measurement development to offer fee-for-service (FFS) products. The paper by Alexander et al in this supplement discusses alliance sustainability in more detail.10

Implementation and Alignment of AF4Q Programmatic Areas

Few Alliances Address and Align All Programmatic Areas, Resulting in Weaker Than Expected AF4Q Intervention. As the program logic model indicates, AF4Q alliances were asked to take up work in several specific areas and align the work in these areas to have maximum impact. Although 2 alliances seriously pursued all 5 AF4Q programmatic areas, none of the alliances implemented interventions in all 5 of those areas in a robust and integrated manner. Three alliances were able to meaningfully embrace 3 programmatic areas, such as using the results of performance measures to target specific areas for QI interventions and opportunities for reducing healthcare disparities.

The most frequent strategy was for alliances to focus most of their attention and resources on 2 programmatic areas, including the work most closely related to the history and roots of the lead alliance organization, while devoting substantially less effort to other areas. For example, alliances with a history of publicly reporting performance measures tended to continue their emphasis in this area. Alliances disproportionately represented by healthcare providers on their governing boards most often focused on QI work. Few AF4Q alliances had long-standing track records in the areas of CE, equity, and payment reform. As such, these 3 areas received relatively less attention across the full set of AF4Q alliances.

It should be noted that although no alliance implemented the entire AF4Q programmatic package in a robust and scalable way, there were clear exemplars in most programmatic areas, raising the question of how to assign value to alliances that mastered a single area relative to those that tried many areas but mastered none. Because RWJF’s vision, based on its program theory, was to implement and align all programmatic areas, our judgment is based on this expectation; however, as stated above, stakeholders can apply their own interpretation and weighting to these various dimensions. Other articles in this supplement describe the various types of AF4Q interventions within the programmatic areas, the dose of those interventions across the 16 alliances, and the outcomes that these interventions targeted.11-14

Public Reporting: Emphasized the Most, Sustainability Challenged. The AF4Q initiative demonstrated that with financial and technical support, a wide variety of voluntary stakeholder coalitions were able to develop public reports of ambulatory provider quality. Although a few alliances had experience in public reporting prior to the start of the AF4Q initiative, most did not but were still able to produce a public quality report as required by the program. However, the contents of these reports varied considerably in terms of health conditions covered and patient and provider populations included, reflecting differences in local environments and alliance strategies. The degree to which these reports were proactively disseminated and created broad awareness of quality differences among community stakeholders also varied. The most significant contribution of the AF4Q alliances to provider transparency in their communities was in the areas of ambulatory quality and ambulatory patient experience. There was much less contribution in the area of inpatient performance reporting, largely because alliance stakeholders often viewed this as an area already “occupied” by other state or national organizations, including CMS. The article by Christianson et al in this supplement provides more details, but the challenges faced by alliances in maintaining their reporting efforts are substantial.11

At the conclusion of the AF4Q initiative, 6 alliances seemed highly likely to continue their public reporting efforts, 4 appeared less certain, and 6 ceased reporting altogether, primarily because of the inability to develop stable funding sources and stakeholder skepticism about the value of public reporting. RWJF has provided post-AF4Q support to the Network for Regional Health Improvement (NRHI),15 through RWJF’s Collaborative Health Network, to share lessons across multi-stakeholder alliances, including exploring options for how cross-region collaboration might offer economies for sustaining measurement and reporting activities. Given the evolving nature of the healthcare transparency movement and the measurement and reporting landscape, the larger question of which entities (eg, federal government, state governments, provider associations, alliances, or some combination) are best positioned to assume broad responsibility for providing and paying for transparency remained unresolved at the end of the AF4Q initiative.

Quality Improvement (QI): Half of AF4Q Alliances Advanced or Created Community QI Infrastructure. AF4Q alliances were slow to establish QI infrastructure and launch QI activities that served an alliance’s entire community and orchestrated collaboration across all (including competing) delivery systems. This may be attributed to the emphasis on measurement and public reporting early in the AF4Q program. However, as reported by McHugh et al in this supplement,12 all alliances eventually implemented QI activities, either on their own or through delegating the work to partners who had greater QI expertise. There were commonalities across alliances in terms of their foci (eg, most alliances encouraged adoption of patient-centered medical home [PCMH] processes) and the approaches employed (eg, most alliances established learning collaboratives involving community providers). However, the quantity, depth, and overall success of activities varied across the 16 communities.

What we refer to as the AF4Q QI legacy can best be divided into 3 categories. The first category, “new infrastructure,” included communities in which the AF4Q initiative drove the creation of a community-based QI effort that was sustained beyond the end of the program (3 communities). The second category, “further faster,” included 5 communities in which the AF4Q program advanced QI initiatives that were already in place. The remaining 8 alliances made up the “limited QI legacy” category that included communities in which there was limited impact in the area of QI, either because of a strategic decision not to pursue direct QI interventions (often opting to focus on performance measurement and reporting) or due to various challenges that affected implementation.

Consumer Engagement (CE): Involving Consumers in Governance—Embraced More Than Community Health-Level CE Efforts. The AF4Q alliances faced challenges in launching CE efforts: few alliances had staff with CE expertise, most did not have existing consumer constituencies, and others struggled with stakeholder buy-in. However, as detailed by Greene et al in this supplement,13 over the life of the initiative, CE activities were implemented in 3 areas that RWJF encouraged during some or all of the AF4Q program and a fourth area that several alliances voluntarily developed: (1) providing self-management training, (2) increasing consumer awareness and use of public reports of health provider quality, (3) involving consumers in alliance governance, and (4) integrating consumers into ambulatory care QI teams.

Involving consumers in alliance governance was the area most often embraced, with half of alliances either changing organizational bylaws to institutionalize consumer representation on the board of directors or becoming “highly dedicated” to promoting the consumer voice in committee work. Two alliances, which had staff with prior experience with CE work and leadership that strongly believed in CE, embraced self-management programming in the community, and the same 2 alliances integrated patients into ambulatory care QI efforts. Two alliances created highly consumer-friendly public reports of provider quality based upon guidelines derived from the research literature, 5 alliances’ reports were moderately consumer friendly, and 7 were not very consumer friendly (2 alliances were not reporting at the end of the AF4Q initiative).

Overall, 10 alliances embraced at least 1 of the 4 areas of CE (1 alliance embraced 3 areas, and 2 alliances embraced 2 areas). Four other alliances tried at least 1 CE area in a serious but limited manner, and 2 alliances made only minimal effort across the areas of CE. Despite the variation in embracing CE, most alliance leaders reported that the consumer perspective became more important and integrated into many aspects of the alliances’ work, and credited the AF4Q initiative with that change. The AF4Q program also made a substantial contribution to the academic and practitioner literature on CE, helping to define the concept and detail how programs can be implemented.16-21

Advancing Healthcare Equity—Losing the Forest for the Trees. To help drive their efforts to reduce racial and ethnic healthcare disparities, the alliances were initially directed to measure, act, and achieve: first, to measure local disparities, then to address the observed disparities, and finally, to adjust and streamline efforts to achieve their healthcare equity goals. However, most of the alliances’ efforts to address healthcare disparities were either long delayed or completely derailed by the initial focus on measurement. As noted by Jean-Jacques et al in this supplement,14 while the alliances were directed to promote the standardized collection of data on patient race, ethnicity, and primary language spoken (REL); link these data to healthcare system performance measures to allow for quality measures to be stratified by REL; and establish communitywide systems for tracking local healthcare disparities, only 2 alliances made it through all of these steps by the end of the AF4Q initiative.

Even among the 5 alliances that made at least moderate progress toward advancing their community’s capacity to measure and track local healthcare disparities, these efforts did not consistently translate into programs or other interventions to address local disparities. Although all of the alliances participated in at least some small pilot programs that aimed to address disparities, only 3 implemented or contributed to the development of programs on a scale large enough to advance healthcare equity in a way that could be measured at the population level.

The alliances that implemented the most robust programs to address disparities tended to have established relationships with community organizations and institutions with a track record of serving minority, low-income, or otherwise underserved populations. They generally focused on improving access to care and the quality of care provided to uninsured, poor, or minority residents by disseminating the PCMH model across safety-net clinics. Perhaps most important, they shifted quickly toward working to address disparities even if their initial efforts to identify local disparities by stratifying claims-based or health record-based quality measures by REL failed.

Payment Reform Pilots Raise Visibility—Little Sustained Success. An area of focus that was added in the AF4Q initiative’s third funding phase was an emphasis on payment reform, which largely referred to the consideration of alternative payment models, such as bundled payments, episode-based payments, risk-based payments, and fee enhancements for PCMH implementation, among other models. RWJF added this requirement after the ACA was passed and at a time when CMS and others were calling attention to the perverse, and as some have labeled, “toxic” nature of the traditional FFS-based reimbursement model. As this expectation was added, RWJF provided access to some of the leading thinkers on the topic of alternative payment arrangements and convened a number of in-person meetings to discuss options.

The most frequent response across the AF4Q alliances was to host a town hall meeting on the subject where community constituents could hear presentations by national experts and discuss if and how these alternative models might be a priority for the community. More than half of the AF4Q alliances extended this work further by designing payment reform pilots, recruiting participants, and implementing small-scale reforms. Most pilots involved supplemental payments to FFS reimbursement for the implementation of changes, such as the provision of care management services or implementation of PCMHs. Not surprisingly, actual progress in changing the way healthcare is paid for in a community was difficult for almost all alliances, including those engaged in the pilot efforts.

To be fair, because alliances were not payers or providers, there was some skepticism among alliances about what they could actually do beyond convening and educating. Making significant progress in this area was likely an unrealistic expectation of an alliance until the major payer, CMS, was on board. Although CMS recently announced goals to move the majority of its payments away from FFS to value-based payment models, FFS was still the dominant payment method during the AF4Q era.22

As discussed above, the results of the AF4Q initiative are best described as less successful than expected. We were not able to detect measurable and significant differences in the trends on community-level quality measures for AF4Q communities, relative to non-AF4Q communities, although many measures did appear to show a positive trend across AF4Q and non-AF4Q communities alike. In terms of programmatic interventions, whereas no alliance robustly implemented all AF4Q programmatic areas to scale across a community, there was considerable variation in the scope and fidelity of the interventions implemented in each specific program area.

As reported in the other articles in this supplement that focus on programmatic implementation,11-14 there were alliances that were considered exemplars in specific programmatic areas, with most programmatic areas having some exemplars, although no single alliance achieved exemplar status in all programmatic areas. These alliances arguably advanced the thinking and implementation of the work in these areas and created value for others interested in doing similar work. The results also suggest that while all AF4Q alliances were able to recruit and engage various stakeholders initially, the future sustainability of the alliances themselves, and of the programmatic areas, varied significantly, with a substantial number of alliances at risk of not sustaining their work (and themselves as organizations) by the time the program ended in 2015.

Reasons Why the AF4Q Program Did Not Have the Intended Impact

What factors are likely responsible for the variation in the AF4Q initiative’s success across its various dimensions? We highlight some of the key factors below before moving to a discussion of some broader lessons that can be learned from the AF4Q experience.

Inherent Tension in a Community-Focused, Externally Driven Program. At its core, the AF4Q initiative was designed as a program to bring stakeholders at the local level together to address healthcare challenges. From the program’s inception to its end, there was near universal support of that approach among staff and stakeholders in the alliances, and, as discussed above, among the external thought leaders. The AF4Q program’s theory of change was layered on top of the community-based multi-stakeholder approach and required implementation of an externally defined set of programmatic interventions. This approach created tensions around the degree of alliance autonomy versus the level of prescriptiveness of the program requirements, with RWJF providing more flexibility in the later funding periods. Some alliances advocated strongly throughout the program for more local autonomy, while others leaned toward wholesale acceptance of the program. There is evidence that those in the former group experienced less difficulty in the transition from the AF4Q to post-AF4Q period. It could be that those alliances had comparatively stronger local decision-making processes in place, had attracted stakeholders who value local autonomy, or had some combination of those and other factors.

Concerns With RWJF’s Theory of Change and AF4Q Logic Model. With respect to the programmatic areas included in the AF4Q initiative, there was a wide variety of opinions among national thought leaders, alliance leaders, and alliance stakeholders about the value of specific programmatic areas and interventions selected within the programmatic areas, as well as different points of view regarding what could be reasonably expected to result from focusing on these areas. The following quote from 1 of the national thought leaders we interviewed provides an illustration:

“Honestly, I think their [RWJF’s] theory of change was not robust enough. The leverage of consumer engagement and reports cards is a pretty long lever, with a lot of steps in between to get there. And the assumptions about how these measures would change behaviors for all sectors—you know insurance, providers, employers, other stakeholders, and consumers—I felt their theory of change was too confused to actually get them a clear path to results.”

A number of thought leaders shared this view, although there was variation in which programmatic areas they thought should have received more or less attention. Alliance leaders and those doing the work in the communities felt the same way, with the most concern expressed about the questionable impact of steering publicly reported quality measures to consumers. While there was general agreement about the importance of engaging consumers, many noted the variation in alliances’ approaches and capacity in this area and the relatively underdeveloped set of tools available for engaging patients at the community level.

Interestingly, the original plan for the AF4Q initiative was to conduct a pilot program with 4 communities and learn formatively from the experience of those 4 before expanding the program. Although that approach could have helped RWJF tailor the goals and approach of the program, the decision to expand to 10 additional communities happened less than 1 year after the pilot communities were selected.

The Selection of AF4Q Communities. RWJF indicated, on several occasions, that it selected communities that were best aligned with the AF4Q program goals and ultimately were well positioned to succeed in executing the AF4Q theory of change. As it turned out, there was variation in the experience and capacity of the alliances and stakeholders in the communities selected, the contexts in which they operated, and the commitment of stakeholders to all aspects of the AF4Q logic model. For example, we know from our in-depth qualitative and tracking work that many alliances and stakeholders were never committed to particular AF4Q programmatic areas, such as publicly reporting quality data, despite indicating such commitment in their initial funding applications. Thus, the selection process was not successful in identifying participants committed to all aspects of the original AF4Q program.

It seems that whereas RWJF wanted communities that were capable of launching meaningful interventions in all programmatic areas, it also considered other factors when selecting communities, including geographic diversity (eg, different parts of the country, whole states, some rural areas, and large urban centers), reputation for expertise in one of the AF4Q programmatic areas, and perhaps other factors. Additionally, RWJF awarded AF4Q grants to a handful of communities that had to create an alliance from the ground up to implement the program, demonstrating that proven capacity was not a primary selection criterion. Although these decisions made for interesting diversity among alliances, not all of the communities selected were ideally situated to execute the full AF4Q agenda.

RWJF’s Ambitious (and Expanding) Program Expectations. The AF4Q programmatic requirements for participating alliances were multifaceted and ambitious, but also vague enough to allow community leaders to interpret them loosely during their proposal development processes. As one alliance leader stated after reviewing the initial call for proposals, “[T]here was so little money on the table, and it was so comprehensive, you had to do everything.” That alliance chose to apply in the end, with the thinking that the program could not literally hope to tackle all of the program elements deeply, but that it was seeking a bold path forward to stimulate new thinking and approaches. This was not an uncommon situation across the alliances, and alliance staff and participants frequently noted the tension between the number of programmatic requirements and alliance capacity throughout the program years—especially as the alliances were held accountable for their progress, or lack thereof, on all program dimensions.

Even though the initial program was ambitious, alliances were further challenged by the addition of new programmatic areas in which most had little experience (eg, equity, payment reform, inpatient quality measurement and reporting) and by changing requirements for the original programmatic areas. “We can’t boil the ocean” became a common refrain among alliance leaders by the midpoint of the program. Given the variation in alliance capacity and expertise, combined with limited resources and an ambitious program that continued to expand, it is not surprising that alliances tended to fully embrace only a few programmatic areas within their respective comfort zones.

Variation in the Environment in Which AF4Q Was Launched. There was significant variation in important contextual factors across communities that likely led to variation in long-term outcomes, in the implementation of programmatic interventions, and in the coalescence of multi-stakeholder collaboration. For example, the recession of 2008 hit the Detroit area particularly hard and had implications for the degree of stakeholder participation that could be expected, the selection of interventions to be implemented, and the pace at which these were implemented. State policy and the degree of participation of various divisions of state government, especially after passage of the ACA, also were important and varied on multiple dimensions, such as decisions to expand Medicaid or not, decisions to build and operate a state-based insurance exchange or to default to the federal exchange, and others, including the synergy between state-based QI efforts (eg, PCMH initiatives) and transparency efforts (eg, availability and interest of the state in public reporting of provider quality measures).

What Lessons Were Learned From the AF4Q Initiative?

In addition to seeking to transform the healthcare systems in 16 AF4Q communities, RWJF sought to create “models for national reform.” There are many lessons that have been learned as a result of the AF4Q initiative in specific areas, and these are well documented in our publications and documents produced by the AF4Q NPO and RWJF. In this section, we answer our second summative research question: What lessons were learned from the AF4Q initiative that can inform those interested in improving local healthcare systems and the health of populations residing within these communities?

Alliance and Community “Visionary” Leadership Are Key Building Blocks for Meaningful Change

Alliances that facilitate more congruence or complementarity between their goals and the goals of their participants not only provide a foundation for more effective coordination of effort, but also promote internalization of alliance goals by participating organizations. This may engender more deeply rooted, institutional change in the community that extends beyond the specific programmatic work of the alliance and increases the chances of sustained, communitywide efforts to improve health. Obviously, this is a greater challenge if the leaders of key organizations are not involved in alliance decision making, are unaware of the alliance’s initiatives, or see participation in the alliance only as a “community service” obligation rather than an important activity for their home organizations.

Successful adoption, implementation, and sustainability of multi-stakeholder alliance initiatives therefore depend not only on attracting and retaining the right mix of stakeholders, but also on securing the right level of stakeholder participants (those who have the influence and resources to make things happen in the community) and their enthusiasm for alliance goals. Recruiting and retaining such participation is not an easy task. “Visionary” leadership is necessary to provide a compelling call to action for the alliance and the community, and to persuade key members of the community to make the alliance and its programs a priority. This is especially true because these individuals and their organizations are often faced with competing demands for their time and resources, and may fail to recognize how success for the alliance can be a positive for their own organization. Specifically, a visionary leader must be able to convey a core ideology and an envisioned future.23 Core ideology defines what the alliance stands for (ie, core values) and why the alliance exists (ie, core purpose); it provides an identity that transcends the identities of individuals and organizations, the legal form or structures, and the alliance’s activities. An envisioned future defines what the alliance aspires to become, achieve, and create. It specifies a compelling long-term goal that serves as a unifying, focused reason for collective effort, and vividly describes what it will be like to achieve the goal.

Stakeholder Engagement in Alliance Initiatives: Often Unequal—Undermining True Cooperation

Providing a neutral forum and safe space for diverse community stakeholders to discuss and plan initiatives to advance community health is seen as critical to both alliance success and the best interest of the community. However, stakeholder groups tend to differ in their levels of commitment and engagement in these types of initiatives, and for different reasons. Employers frequently fail to see a short-term, direct connection between initiatives and their bottom line. Consumers are rarely integrated into leadership positions and tend to be dominated by those with greater technical expertise on healthcare matters. Providers often are concerned about the impact QI initiatives may have on their autonomy and reimbursement levels. Finally, whereas healthcare plans often agree to contribute data, they are sometimes reluctant to fully engage in initiatives such as communitywide payment reform goals because of the potential negative impact on their competitive advantage.

This suggests that simply adding diverse stakeholders to the alliance or its decision-making bodies may not yield the type of cooperation and engagement needed to “move the needle” on community healthcare quality. Sponsors of these efforts may therefore expect too much from various stakeholders, relative to what the literature and experience suggest actually happens in practice.24-26 For example, employers are often assumed to be the lynchpins of local healthcare reform efforts because they control a significant amount of healthcare purchasing and are greatly affected by increases in costs. In practice, however, this group tends to be less engaged in local reform efforts, in part because the local healthcare delivery system often serves only a fraction of a national or global corporation’s employees.

From a different perspective, the mixed level of engagement on the part of different stakeholders also calls into question whether alliances, such as those in the AF4Q initiative, are truly partnerships among equals or tend to be disproportionately influenced by particular groups of stakeholders (eg, providers who have the most to gain or lose from these efforts). Effective management of stakeholder interests and participation is required to offset the tendencies of a particular group to dominate in what is intended to be a collaborative undertaking.

Untested Assumptions in the AF4Q Logic Model Exposed by Realities of Implementation in Community Contexts

While the AF4Q model of aligning stakeholders and programmatic activities is intuitively appealing, the feasibility of developing and implementing an integrated strategy, as envisioned by RWJF, was undermined by a paucity of evidence in certain programmatic areas, sometimes vague programmatic requirements, and the limited resources and time available to the alliances given the magnitude of the task. Many of these issues stemmed from the “theory of change” that undergirded the AF4Q initiative, which contained several important but untested assumptions about how quickly multi-stakeholder alliances could become operational, the degree to which diverse stakeholders would coalesce around strategies capable of effecting meaningful community change, and the overreliance on specific interventions with little track record in broader community contexts.

The inability of alliances to easily implement prescribed solutions to an inherently complex problem was exacerbated by their limited expertise and capacity to address each and every programmatic area on its own, let alone ensure that these different areas were “aligned.” Given these issues, and to reduce uncertainty and manage scarce resources, many health improvement activities prescribed by RWJF were spun off to partner organizations with more established expertise, or emphasis was given to programmatic areas for which there was already a history of activity and a greater level of internal expertise.

Engaging Patients and Consumers Is Desirable—Evidence and Agreement on Best Approaches Varies

As discussed by Christianson et al in this supplement,11 public reporting of provider quality was a key component of the AF4Q’s strategy to improve health in alliance communities. A central expectation of this approach was that consumers would use this information to choose higher-quality providers and enhance their interaction with providers. However, most of the AF4Q alliances charged with this work were not consumer-facing organizations. Without consumer constituencies of their own and sufficient budgets for marketing and dissemination, they were not equipped to reach a broad range of consumers.

Current limitations of healthcare quality measures (including variation in both construction and types of measures reported) and their dissemination, as well as constraints on individual consumer choice in selecting providers due to provider shortages or limited networks, further impeded the use of public reports by consumers. Thus, the model of individual healthcare consumers researching provider quality and “voting with their feet” was not viewed by many regional stakeholders as a viable primary strategy for advancing QI over the short term. Instead, our findings indicate that public reports were most useful for providers in terms of benchmarking their own performance and using those data to drive improvement.

Private Investment Alone May Not Be Enough to Sustain Initiatives for Public Good

When looking toward sustaining community health improvement initiatives, a major roadblock alliances encountered was finding a funding source to continue the work. The characteristics of a public good often lead to a free-rider problem (ie, all those who benefit from the public good do not pay their fair share to produce or maintain the public good), which, in turn, presents challenges to maintaining funding for programs or services that no one stakeholder group feels obligated to support. Adding to this problem is the fact that many of the local stakeholders involved in alliances represented private organizations that lacked discretionary budgets for work not directly tied to their organization’s bottom line.

Because the government is the primary body that communities look to for funding public goods and regulating the free-rider issue, partnering with state-level stakeholders to identify a long-term public funding stream, or at least a mix of public and private funding, may be critical to moving this type of work forward. Federal resources may also provide funding, and although the CMS Innovation Center has provided funding opportunities synergistic to some of the alliances’ goals, these awards are almost always tied to specific projects and deliverables, rather than being grants that a community can use at its discretion based on stakeholders’ priorities.

Potential models to overcome the public good problem and support this kind of work on an ongoing basis include: (1) creating a wellness trust supported by savings identified through reduced claims costs that occur as a result of program implementation, community stakeholder donation, or some combination of the two; (2) generating a permanent source of funding by instituting a tax on healthcare premiums; and (3) selling products to specific stakeholders (eg, specialized reports for providers), although this approach has the potential of creating tension within alliances if a particular stakeholder group feels they are essentially funding “public” activities for private gain.

Broader State and National Policy Initiatives—Positive and Negative Impacts on Community Health Alliances

Passage of the ACA and the Health Information Technology for Economic and Clinical Health Act after the launch of the AF4Q initiative led to the funding of several large-scale and national initiatives, including many demonstration programs sponsored by the CMS Innovation Center. To capitalize on these funding opportunities and efforts at the state level, some external thought leaders believed that it would have been more powerful for the alliances to be engaged in a state-level, rather than regional-level, service area. According to this opinion, doing so would have allowed alliances to more effectively leverage state government, including enlisting the purchasing power of the state’s Medicaid program and benefits for state employees.

Regardless of service area, some alliances were able to position themselves as partners in state or national efforts by providing expertise, data, or their unique role to these efforts. This enabled these alliances to expand their effort to a larger “stage” and secure needed resources to support and sustain their work. For example, several alliances that are still active have secured or are seeking to be a part of the State Innovation Model initiatives in their respective states. However, some alliances and their stakeholders viewed these policy changes as creating competition, as other organizations in the same market sought resources to pursue similar work, which had the potential to weaken the commitment of local stakeholder groups to the alliance and its programs. In addition, having to juggle multiple projects often stretched alliances beyond their capacity and required staffing support that did not necessarily have sustainable funding streams once projects ended.27,28

Publicly Announcing Ambitious Goals Can Dampen Opportunities for Learning

As the quotes earlier in this article illustrate, RWJF made strong pronouncements about its expectations for the AF4Q initiative. These expectations were not merely stretch goals; they suggested that the AF4Q program would accomplish major community-level health improvement by 2015 by fundamentally redesigning care and delivery systems, engaging all patients in their care, and providing more efficient and high-value care. These goals were bold and probably unrealistic, as discussed by 1 of our thought leader respondents:

“$300 million may seem like a big investment, but it’s a drop in the bucket relative to big changes in the healthcare system. So, making big pronouncements at the outset of what’s going to change is not so useful and can only backfire.”

When bold goals are set and a large investment is made, there can be considerable pressure to highlight only the successes along the way. Unfortunately, that pressure can undermine important critical conversations and learning opportunities that come from challenges and failures. An alternative approach may have been to describe the AF4Q initiative as equivalent to a venture capital program, where RWJF was making investments in many communities, knowing that a few might be successful and some might be unsuccessful, but regardless, important lessons could be learned from all.


To our knowledge, the AF4Q program was the single largest privately funded attempt to improve the quality and value of healthcare services in various regions of the United States. Although a community-based multi-stakeholder approach was employed, the AF4Q program had a very specific theory of change. As a result, participating alliances did not have complete autonomy; they were required to focus their work on very specific programmatic areas, such as increasing transparency through the publication of provider quality measures. Thus, while the lessons from the AF4Q program add tremendously to the literature on the effectiveness of voluntary multi-stakeholder initiatives, it is important for others interested in these lessons to consider their applicability given the AF4Q context.

Although the AF4Q program did not accomplish the myriad improvements in communitywide outcomes envisioned for 2015, it certainly provided a treasure trove of experience and information that could yield valuable lessons for other communities or stakeholders. There are high-level lessons, derived from the experiences of all 16 alliances over a period of almost 10 years, and alliance-specific lessons, including why and how decisions were made and the impact of those decisions. Many of these lessons have been documented in AF4Q publications from the evaluation team and others,15,29 and interested readers are encouraged to consult these resources.

This supplement includes 3 “perspective” articles about the AF4Q experience, written by RWJF leadership, which paid for and sponsored the AF4Q initiative; the AF4Q NPO, which was housed at George Washington University; and the NRHI, which is a network of regional health improvement collaboratives including many of the AF4Q alliances and other regional members. This supplement also includes the perspective of guest editor Donald Berwick, MD, MPP, FRCP, who has followed the AF4Q program throughout its existence and served as the initial chair of the AF4Q’s National Advisory Committee.

In addition, while most of the articles in this supplement are written by the AF4Q evaluation team, which was charged with conducting formal research to evaluate the AF4Q initiative, each of these perspectives, and the guest editorial, provides a unique “grounded” viewpoint on the AF4Q program. Readers are encouraged to consult these articles, which include the authors’ views of the program over its history, including opinions about its success and key lessons learned. These articles also describe how the AF4Q experience has informed future work; for example, the article by Miller and Weiss in this supplement discusses how lessons from the AF4Q initiative have informed RWJF’s current focus of creating a “Culture of Health” across America.30

