The American Journal of Managed Care Special Issue: Health Information Technology

Overcoming Barriers to a Research-Ready National Commercial Claims Database

David Newman, JD, PhD; Carolina-Nicole Herrera, MA; and Stephen T. Parente, PhD
Lessons learned about data governance and distribution from a voluntary healthcare claims repository, the Health Care Cost Institute, a nonprofit research organization
Data Use

Once governance requirements for data holding have been satisfied, data holders need to address data use. License agreements, at a minimum, need to address HIPAA requirements. In addition, data holders often address the rights to publish and the ownership of any intellectual property developed. Finally, the inadvertent or intentional release of company confidential information is of particular concern to data contributors, regardless of contribution model, and may require scrutiny of research and data products.

For most VCMs and MCMs, data are licensed after researchers submit a proposal. Typically, a research committee reviews the proposal and may require the research to be overseen by an institutional review board. At HCCI, a scientific review committee reviews all research proposals, including those funded through peer-reviewed sources such as the National Institutes of Health. This committee is composed solely of academic researchers; data contributors have no representation on it. Furthermore, use of HCCI data is limited to the purposes stated in the proposal.

Unlike most VCMs or MCMs, HCCI does not license data directly to the researcher. Rather, HCCI licenses data to the researcher’s university on behalf of researchers. Licensing data to universities can make the data more widely available to research teams. HCCI’s Academic Research Partnerships allow a university to license multiple projects (including student projects) per year. In addition to saving the time spent negotiating individual data licenses, this allows a university seeking to support or build a research program around healthcare claims data to do so more efficiently. As discussed elsewhere here, it would also allow universities that do not have sufficiently robust research technologies to leverage the technologies HCCI has developed for data distribution and research project management. The result is secure use of the data by multiple research teams.

However, licensing data to universities is not as straightforward as one might think, because university attorneys are inclined to renegotiate standard license terms. Because HCCI is a VCM, certain terms are simply not negotiable, and these constraints must be recognized in the license. Some clauses, such as choice of jurisdiction, are quickly resolved. The issues that tend to slow down the licensing process are confidentiality provisions and intellectual property rights.

To protect against the intentional release of confidential information, MCMs and VCMs have to take steps to reduce potential violations. HCCI developed a set of masking rules generally designed to deal with how prices are publicly reported, as these are the salient pieces of information that give rise to antitrust concerns. These rules do not constrain research but do prohibit reporting of analyses at the data contributor level. As is typical with health data, the rules allow raw reporting of specific service prices within a specific geography when the data meet a critical threshold of observations. When researchers who want to report on a specific service in a specific geography do not meet the thresholds, they are required to either expand the geographic area, select a different geographic area to highlight, or aggregate the service data.
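The threshold logic described above can be illustrated with a minimal sketch. It is not HCCI's actual rule set: the threshold value, the field names (`service`, `geography`, `price`), and the geography mapping are all hypothetical, chosen only to show how a cell is masked when it has too few observations and how expanding the geographic area can make it reportable.

```python
from collections import defaultdict

# Hypothetical threshold: the article does not state the actual value.
MIN_OBSERVATIONS = 30


def reportable_prices(claims, min_obs=MIN_OBSERVATIONS):
    """Return the mean price per (service, geography) cell.

    Cells with fewer than min_obs observations are masked (value None).
    """
    cells = defaultdict(list)
    for claim in claims:
        cells[(claim["service"], claim["geography"])].append(claim["price"])

    report = {}
    for key, prices in cells.items():
        if len(prices) >= min_obs:
            report[key] = sum(prices) / len(prices)
        else:
            report[key] = None  # masked: too few observations to report
    return report


def expand_geography(claims, mapping):
    """Relabel each claim with a broader geographic area.

    mapping is a hypothetical lookup from a narrow area to a wider one;
    areas missing from the mapping are left unchanged.
    """
    return [
        {**claim, "geography": mapping.get(claim["geography"], claim["geography"])}
        for claim in claims
    ]
```

A researcher whose cells are masked at a narrow geography could rerun the report after `expand_geography`, which corresponds to the first of the three remedies listed above.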

Distribution and Accessibility

After governance issues, the data holder faces 2 major technical challenges to distributing data in a world of relatively rich storage options: transport and updating. Data holders whose purpose is to help inform knowledge about healthcare also face another challenge—making data accessible to research teams who lack the current means of processing large data. As HCCI’s public mission is to promote research and reporting of healthcare costs and utilization, it has had to address these challenges and is developing innovative solutions.


After the data are licensed and are ready for distribution, the data holder needs to transport the data to the end user. Depending on the size of the data, the technical capacity of the data holder, and the technical capacity of the data recipient, 2 forms of transport are commonly used: physical transport of a data drive through a courier service or electronic transmission through a secure gateway.

Physical drive transmission is very common with Medicare data, and Buccaneer, the Medicare vendor, physically transports secure files to research teams around the country.21 The Agency for Healthcare Research and Quality also provides physical copies of hospital inpatient data.2

Less common is the use of electronic transfer gateways, although this method is gaining popularity, particularly as costs decline. Gateways offer greater control over data transmission, require a direct link between repository and recipient, and require recipients to provide more detail about their data security. However, there are technological constraints to transmitting large databases over networks. For example, 1 year of employer-sponsored insurance claims data from HCCI are approximately 325 GB in a flat file. If the data are transmitted in a standardized analytic format (such as a *.sas7bdat or *.dta), the base file is much larger, which can make transmission on relatively weak connections impossible. In HCCI’s experience, transmissions to major research universities work well in the range of less than 100 GB, resulting in multiple file segments that must be reassembled at the university. Transmissions to recipients who lack significant bandwidth can be prohibitively slow.
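The segmentation and reassembly step can be sketched as follows. This is an illustrative outline, not a description of HCCI's gateway: the function names, the part-file naming scheme, and the use of a SHA-256 checksum to confirm the reassembled file are assumptions made for the example.

```python
import hashlib
import math
from pathlib import Path

BLOCK = 1024 * 1024  # read and write in 1 MiB blocks to keep memory use low


def segments_needed(total_gb, max_segment_gb):
    """Smallest number of segments that keeps each at or below max_segment_gb."""
    return math.ceil(total_gb / max_segment_gb)


def sha256_of(path):
    """Checksum of a file, computed without loading it into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(BLOCK), b""):
            digest.update(block)
    return digest.hexdigest()


def split_file(src, out_dir, segment_bytes):
    """Split src into numbered part files of at most segment_bytes each."""
    src, out_dir = Path(src), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with src.open("rb") as f:
        while True:
            first = f.read(min(BLOCK, segment_bytes))
            if not first:
                break  # source exhausted
            part = out_dir / f"{src.name}.part{len(parts):04d}"
            with part.open("wb") as out:
                out.write(first)
                written = len(first)
                while written < segment_bytes:
                    block = f.read(min(BLOCK, segment_bytes - written))
                    if not block:
                        break
                    out.write(block)
                    written += len(block)
            parts.append(part)
    return parts


def reassemble(parts, dest, expected_sha256):
    """Concatenate parts in order and verify the result against a checksum."""
    with open(dest, "wb") as out:
        for part in parts:
            with open(part, "rb") as f:
                for block in iter(lambda: f.read(BLOCK), b""):
                    out.write(block)
    if sha256_of(dest) != expected_sha256:
        raise ValueError("checksum mismatch after reassembly")
    return Path(dest)
```

Under these assumptions, the 325 GB flat file mentioned above would need at least `segments_needed(325, 100)` segments, that is, 4, to keep each transmission within the 100 GB range.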


A major benefit of a secure gateway is that it can help data holders simplify the otherwise complex process of data updating. Unlike many other forms of data, health data are not static. Claims data go through 3 stages (filing, processing, and adjudication), with the timing of each dependent on the source of claims, the payer, and the patient. In the case of prescription claims, filing, processing, and adjudication can be accomplished on the same day. In the case of organ transplants, it may take more than 18 months to complete the adjudication.

For claims that are not fully processed, the data holder needs to decide whether it will offer raw, consolidated (detailed transactions with current claim payment statuses), or adjudicated (paid and final) claims.22 Some APCDs update data holdings (using either raw or consolidated data) to include new payment information monthly or quarterly. Thus, a researcher who received data in January likely will have different data from another researcher who received data later in the year. One alternative is to provide only adjudicated claims data, which is the option that HCCI uses. As a result, filed but unpaid claims are not included. All data holders need a policy on run-out. If claims are collected and aggregated by year (be it calendar or fiscal), the data holder needs to decide how many months need to pass before it declares a year complete. In the case of HCCI, data are collected by calendar year with 6 months of run-out. This means that for care provided in 2007, HCCI does not receive data until 2008. HCCI also collects data with 12-, 18-, and 24-month run-out and, therefore, holds at least 99% of adjudicated claims.
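The run-out policy above amounts to simple date arithmetic, sketched below. The function names and the `status` field with its `"adjudicated"` value are hypothetical labels introduced for illustration. The only figures taken from the text are the calendar-year collection and the 6-, 12-, 18-, and 24-month run-out periods.

```python
from datetime import date


def runout_complete_date(service_year, runout_months):
    """First date on which a calendar year of claims has had its full run-out.

    Run-out is counted in whole months after December 31 of the service year.
    """
    year = service_year + 1 + runout_months // 12
    month = runout_months % 12 + 1
    return date(year, month, 1)


def is_year_complete(service_year, runout_months, as_of):
    """True if the run-out period for service_year has elapsed by as_of."""
    return as_of >= runout_complete_date(service_year, runout_months)


def adjudicated_only(claims):
    """Keep only paid-and-final claims; filed but unpaid claims are dropped."""
    return [claim for claim in claims if claim["status"] == "adjudicated"]
```

For care provided in 2007 with 6 months of run-out, `runout_complete_date(2007, 6)` gives July 1, 2008, which is consistent with the statement that 2007 data are not received until 2008.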

Data holders will find the use and distribution of annual health data files complicated by changes in the data contributors. In an MCM, the number of data contributors should not retroactively increase as long as regulations do not change. In a VCM, the number of data contributors may change. HCCI has set as a policy that data should be retrospective from 2007 onward; therefore, a researcher requesting a data update will need file replacement if HCCI acquires more data contributors.
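One way to picture the file-replacement consequence is a comparison of contributor lists by data year. The sketch below is hypothetical: the manifest structure and the contributor labels are invented for illustration and do not describe how HCCI tracks its holdings.

```python
def years_needing_replacement(extract_manifest, current_manifest):
    """List the data years in a researcher's extract that are now out of date.

    Each manifest maps a data year to the set of contributors whose claims
    are included for that year. A year needs replacement when the current
    holdings include a contributor that the extract lacks.
    """
    stale = []
    for year, held in sorted(extract_manifest.items()):
        current = current_manifest.get(year, set())
        if set(current) - set(held):
            stale.append(year)
    return stale
```

Because a new contributor supplies data retrospectively, every year already delivered would appear in the result, which is why an incremental update is not enough in that case.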


Most academic research teams do not have the dedicated resources needed to store, process, and analyze large claims files. As of today, many claims data holders, including HCCI, cut customized files for researchers. Although some researchers may need only aggregated data, even highly aggregated databases can overwhelm the most advanced desktops. Successful users of “big health data” will need to invest in technology if other solutions are not available.

The researchers’ challenges are also the data holders’ challenges, particularly if the data holder is committed to supporting research. One solution is to push academic researchers to better leverage their existing infrastructures. As noted previously, HCCI is licensing data to universities and research institutions for use by multiple research teams. This partnership approach allows universities with data centers to use their processing assets for multiple projects over time and with a standard update schedule. Another solution is outsourcing the hosting for individual projects. HCCI may provide research teams with a set of suggested vendors whose security and technical requirements meet HCCI standards.

Alternatively, data holders who wish to promote research may need to invest in information technologies to make their data more widely available. This is the approach that HCCI is considering as a mechanism for advancing research and collaboration on healthcare data.23 A robust virtual data research center could provide collaborative research teams with access to data within a secure environment. Such an environment would include 1) a query-ready database with limited data-merge capacity, 2) a secure portal by which authorized users can access the data, 3) secure and private storage for researchers, 4) isolated silos to keep research teams segregated and separated, 5) monitoring capacity, and 6) analytic tools. At this time, few vendors have both the processing power and analytic prowess to support research infrastructure.


Advances in information technology have made it possible for healthcare researchers and other stakeholders to have access to more healthcare data. Problems exist with the scale of the data, standards for data holding, governance, reporting and privacy, and rights to ownership. However, the greatest challenge, making “big data” accessible to research teams with great ideas but limited resources, remains unsolved. Future advances may help to eliminate some of these concerns, but healthcare leaders should not expect an explosion of new data insights without investment in basic health services research infrastructure and technology.

Author Affiliations: The Health Care Cost Institute (DN, C-NH), Washington, DC; Department of Finance, Carlson School of Management, University of Minnesota (STP), Minneapolis, MN.

Source of Funding: None.

Author Disclosures: Dr Newman and Ms Herrera are employed by the Health Care Cost Institute. Dr Parente reports no relationship or financial interest with any entity that would pose a conflict of interest with the subject matter of this article.

Authorship Information: Concept and design (DN, C-NH, STP); analysis and interpretation of data (C-NH); drafting of the manuscript (DN, C-NH); critical revision of the manuscript for important intellectual content (DN, C-NH); obtaining funding (STP); administrative, technical, or logistic support (DN, C-NH); supervision (DN, STP).

Address correspondence to: Carolina-Nicole Herrera, MA, Director of Research, Health Care Cost Institute, Inc, 1310 G St NW, Suite 720, Washington, DC 20005. E-mail:
1. About HCCI. Health Care Cost Institute, Inc, website. Published 2013. Accessed May 16, 2013.

2. Overview of HCUP: Healthcare Cost and Utilization Project. HCUP/Agency for Healthcare Research and Quality website. Published November 2009. Accessed May 16, 2013.

3. Miller PB, Love D, Sullivan E, Porter J, Costello A; for State Coverage Initiatives, Robert Wood Johnson Foundation. All-payer claims databases: an overview for policymakers. Published May 2010. Accessed March 3, 2014.

4. Hansen L, Chang S. Health Research Data for the Real World: The MarketScan Databases. Ann Arbor, MI: Thomson Reuters; 2012.

5. Navathe AS, Conway PH. Optimizing health information technology’s role in enabling comparative effectiveness research. Am J Manag Care. 2010;16(12 suppl HIT):SP44-SP47.

6. Qualified Entity Program. CMS website. Published October 2013. Accessed March 4, 2014.

7. Health reform implementation timeline. The Henry J. Kaiser Family Foundation website. Published 2013. Accessed September 19, 2013.

8. Blumenthal D. Launching HITECH. N Engl J Med. 2010; 362(5):382-385.

9. Awards and requests for proposals. All-Payer Claims Database Council website. Accessed March 3, 2014.

10. Our data. Midwest Health Initiative website. Published 2013. Accessed May 15, 2013.

11. Multi-payer claims database (MPCD). California Healthcare Performance Information System website. Published 2014. Accessed December 15, 2014.

12. About us. FAIR Health, Inc website. Published 2013. Accessed December 15, 2013.

13. Solutions. Castlight Health, Inc website. Published 2013. Accessed May 16, 2013.

14. Interactive State Report Map. All-Payer Claims Database (APCD) Council website. Accessed December 15, 2014.

15. Costello A, Taylor M. Standardization of data collection in all-payer claims databases: fact sheet. APCD Council website. Published January 2011. Accessed May 16, 2013.

16. Love D, Sullivan E. Cost and funding considerations for a statewide all-payer claims database: fact sheet. APCD Council website. Published March 2011. Accessed May 16, 2013.

17. Human Services Research Institute (HSRI) Health Data Warehouse proposal prepared for the State of Maine, RFP #201207352. Maine Health Data Organization website. Published August 27, 2012. Accessed March 3, 2014.

18. WHIO fact sheet 2013. Wisconsin Health Information Organization website. Published 2013. Accessed June 2013.

19. Health care prices. Virginia Health Information website. Accessed December 15, 2014.

20. The Center for Consumer Information & Insurance Oversight: grants to states to support health insurance rate review and increase transparency in health care pricing. CMS/CCIIO website. Published 2013. Accessed March 3, 2014.

21. Buccaneer for CMS. Chronic conditions data warehouse: Medicare administrative data user guide, version 2.0, 2013. Accessed May 16, 2013.

22. Understanding the process (part 3 - implementation). OnpointCDM: Claims Data Manager website. Published 2009. Accessed March 3, 2014.

23. Newman D, Frost A, Herrera C, Parente S. The need for a smart approach to big health care data. HealthAffairs Blog website. Published January 27, 2014. Accessed March 3, 2014.