Innovations in Chronic Care Delivery Using Data-Driven Clinical Pathways

This paper demonstrates that data-driven clinical pathways can be developed using electronic health record data to facilitate innovations in practice-based care delivery for chronic disease management.
Published Online: December 23, 2015
Yiye Zhang, MS; and Rema Padman, PhD


Objectives: Chronic diseases are common, complex, and expensive health conditions that can benefit from innovations in healthcare service delivery enabled by information technology and advanced analytic methods. This paper proposes a data-driven approach, illustrated in the context of chronic kidney disease (CKD), to develop clinical pathways of care delivery from electronic health record (EHR) data.

Study Design: We analyzed structured and de-identified EHR data from 2009 to 2013 of 664 CKD patients with multiple chronic conditions.

Methods: Machine learning algorithms were used to learn data-driven and practice-based clinical pathways that cluster patients into subgroups and model the co-progression of their encounter types, diagnoses, medications, and biochemical measurements. Given a pattern of biochemical measurements, our algorithm identifies the most probable clinical pathways, and makes predictions regarding future states, with and without temporal information. CKD stages, their complications, and common medications are included in the clinical pathways.

Results: Using the EHR data of 664 patients who were initially in CKD stage 3 and hypertensive, we identified 7 patient subgroups—each distinguished primarily by the type of complications suffered by the patients. Our algorithm demonstrates fair accuracy (up to 44% and 75%, respectively) in learning the most probable clinical pathways and predicting future states associated with temporal patterns of biochemical measurements and patient subgroups.

Conclusions: Data-driven clinical pathway learning summarizes multidimensional and longitudinal information from EHRs into clusters of common sequences of patient visits that may assist in efficiently reviewing current practices and identifying potential innovations in the care delivery process.

Am J Manag Care. 2015;21(12):e661-e668

Take-Away Points
  • The availability of high-volume, time-stamped, and individual-level health data is beginning to facilitate clinical interventions that are personalized and predictive. 
  • Healthcare service delivery can benefit greatly from advanced statistical and machine learning models and algorithms that can learn personalized insights from electronic health record (EHR) data. 
  • Data-driven clinical pathways that describe the co-progression of encounter types, diagnoses, medications, and individual biochemical measurements can be learned from EHR data, using statistical and machine learning methods to support the review of current practices and innovate healthcare delivery approaches. 
  • Our proposed methodology is generalizable to other clinical conditions and can accommodate varying numbers of clinical and other relevant factors.

According to the World Health Organization, 60% of all deaths worldwide can be attributed to chronic diseases such as diabetes, heart disease, stroke, and cancer; they are also a major cause of poverty and lack of economic development.1 As part of a multi-pronged effort to address this challenge, innovations in chronic care delivery are beginning to leverage advanced statistical and machine learning models and algorithms to obtain new insights into care quality, outcomes, and cost.2-4 Machine learning is the science of constructing algorithms that learn from large volumes of data in order to facilitate decision making by generating potentially new insights; it has gained widespread implementation across many industries today.5 Just a few examples of machine learning applications are speech recognition, self-driving cars, and personalized online experiences.6-8

Innovations driven by machine learning have seen tremendous success,9,10 resulting in improved service performance, productivity, and growth.11-13 For a variety of reasons, however, the healthcare industry has been relatively slow to incorporate these techniques into decision-support applications and to adapt to the resulting changes.14-16 For instance, in making treatment decisions, many clinicians may prefer clinical practice guidelines (CPGs) to predictions generated by machine learning algorithms, which may seem like a “black box” with little relevance to actual clinical decision making.17 Yet many current clinical decision support capabilities, whether CPG-embedded electronic health record (EHR) interactivity or computerized provider order entry (CPOE) applications, are designed by humans and target the “average patient.”

As the Precision Medicine initiative states,18 we are now in an era in which clinical interventions need to be personalized and predictive, and so should decision support recommendations. To meet this objective, it is no longer sufficient to rely on CPGs, which are often based on consensus opinions or on randomized clinical trials with strict enrollment criteria. Rather, with the tremendous amount of data being accumulated in EHRs following the enactment of the Health Information Technology for Economic and Clinical Health (HITECH) Act as part of the American Recovery and Reinvestment Act,19 healthcare service delivery can also benefit greatly from advanced statistical and machine learning models and algorithms that can learn potentially useful insights from large amounts of highly detailed data collected daily as part of routine care delivered in multiple, diverse settings.

Traditional topics in machine learning include classification and unsupervised learning.5 Classification refers to assigning unlabeled data to target classes by training a classification model on labeled data. Logistic regression and naïve Bayes are examples of classification algorithms.5 For example, Lee et al used logistic regression to predict 7-day mortality from heart failure in emergent care using initial vital signs, clinical and presentation features, and laboratory tests.20 Unsupervised learning refers to the identification of latent groups in the data. Unlike classification, which is also called “supervised learning,” unsupervised learning typically does not have true labels, and users need to predefine the number of latent groups. K-means and hierarchical clustering are 2 of the most common unsupervised learning algorithms.5
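
To make the unsupervised case concrete, the following is a minimal sketch of K-means in plain Python. The patient values, the choice of 2 clusters, and the function name `kmeans` are illustrative assumptions, not data or code from any of the cited studies.

```python
# Toy illustration of K-means (unsupervised learning); all data are made up.
from math import dist

def kmeans(points, centroids, n_iter=20):
    """Cluster points around k centroids; k is fixed by the initial centroids."""
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical (eGFR, systolic blood pressure) pairs for six patients.
patients = [(55.0, 130.0), (52.0, 135.0), (58.0, 128.0),
            (35.0, 160.0), (32.0, 165.0), (38.0, 158.0)]

# The user predefines the number of groups by supplying two starting centroids.
labels, centers = kmeans(patients, centroids=[patients[0], patients[3]])
print(labels)
print(centers)
```

No true labels are supplied here; the grouping comes only from distances between patients, and the number of groups is set in advance by the caller.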

Zhang et al used a variant of the K-means clustering algorithm to design more efficient order sets from historical order data in a pediatric inpatient setting.21 Order sets are groups of relevant orders traditionally clustered together by clinical experts and used within CPOE; this is an example of a manually designed healthcare information technology application that requires significant labor- and knowledge-intensive effort for maintenance and update. In the same study, Zhang et al demonstrated that order sets can be created using machine learning algorithms, with the resulting data-driven order sets requiring less physical and cognitive workload in usage because the methods were trained to find the optimal combinations of orders that matched order data generated from actual workflow. In addition to these classical approaches, many advanced machine learning algorithms have been developed and applied over the years to facilitate a more efficient, safer healthcare system.22-25

In this paper, we present a machine learning approach for learning the most probable, data-driven clinical pathways from the EHR data of patients with chronic kidney disease (CKD), and predicting the most probable upcoming interventions at any stage, given recent history. CKD is a chronic condition that currently affects more than 26 million US adults, with an additional 73 million at increased risk for the disease.26 It is also associated with increased risk for cardiovascular disease and acute kidney injury (AKI), and the majority of the patients also suffer from comorbidities such as hypertension and diabetes.26 Consequently, CKD management is complex and expensive, and a large proportion of the US Medicare budget every year is allocated for the treatment of CKD.27 Specifically, the per person per year average cost of treating CKD was $23,128 in 2011—more than twice the average cost of treating non-CKD conditions in the Medicare population ($11,103).27 With the cost increasing and quality of life decreasing as the disease progresses to end-stage renal disease (ESRD),27 there is a growing imperative to pursue innovations in service delivery and management of CKD and other chronic conditions that may generate improved health outcomes, cost savings, and patient satisfaction.4

Additionally, generating the highest quality scientific evidence and associated practice recommendations for chronic conditions such as CKD is a continuing challenge for the healthcare field.3 One of the most recent CPGs for CKD, an update of the 2007 guideline, was published by the National Kidney Foundation’s Kidney Disease Outcomes Quality Initiative in 2012. However, of its 7 key recommendations, only 2 received the highest grade from the Evidence Review Team of the guideline Work Group for both strength of recommendation (“recommend” vs “suggest”) and quality of evidence (“high” vs “moderate,” “low,” or “insufficient”); the other recommendations received lower grades for strength of recommendation and for quality of evidence.28

In this paper, we propose that evidence from actual practices, particularly those that include large numbers of patients in local treatment settings over reasonable durations, may be used to assist guideline development. We present methods for knowledge extraction from data using machine learning algorithms, and demonstrate that such knowledge can be regarded as practice-based, data-driven clinical pathways. Clinical pathways translate CPG recommendations into an actionable plan such as flow charts, and are used by more than 80% of US hospitals for at least 1 intervention.29 This research aims to develop clinical pathways based not strictly on CPGs but on practice-based evidence learned from data. An overall framework of our approach that supports a learning healthcare system is presented in Figure 1.


Prior Work

Data-driven clinical pathway learning has garnered research interest since the 1990s,30-38 but there is limited research on machine learning approaches to the problem. Recently, Lakshmanan et al used a clustering algorithm called DBScan to cluster patients’ histories prior to pathway learning, and applied SPAM, an algorithm for finding frequent patterns in pathways, to associate patterns with patient outcomes.33 Huang et al used a topic model, a recently developed probabilistic method for learning latent topics from documents, to discover clinical pathway patterns from EHR event logs.38 Zhang et al modeled clinical pathways as Markov chains that included the co-progression of multiple interventions and diagnoses, and visualized them to allow identification of variations in care and outcomes across latent patient subgroups.39
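
The idea of modeling pathways as a Markov chain can be sketched by counting how often one visit state follows another. This is a minimal illustration rather than the implementation from the cited work: the state labels, the visit sequences, and the function name `fit_markov_chain` are all hypothetical.

```python
from collections import Counter, defaultdict

def fit_markov_chain(sequences):
    """Estimate first-order transition probabilities by counting state pairs."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair each visit state with the state of the following visit.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    # Normalize each row of counts into probabilities.
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }

# Hypothetical visit states; each label bundles a diagnosis and a medication class
# to mimic the co-progression of several dimensions in a single state.
visits = [
    ["CKD3+ACEi", "CKD3+ACEi", "CKD4+ACEi"],
    ["CKD3+ACEi", "CKD4+ACEi", "CKD4+diuretic"],
    ["CKD3+ACEi", "CKD3+ACEi", "CKD3+ACEi"],
]

transitions = fit_markov_chain(visits)
for state, row in transitions.items():
    print(state, row)
```

Fitting one such chain per patient subgroup would allow the transition tables to be compared across subgroups, which is one way variations in care could be surfaced.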

In this paper, we combine clustering and temporal modeling to elicit common clinical pathways from the data. Specifically, given patient characteristics and a sequence of laboratory observations from multiple laboratory tests, we illustrate methods to learn the most probable sequence of clinical interventions that are associated with the laboratory observations, and to make predictions about patients’ impending conditions as a result of the interventions. This approach allows us to link patients’ biochemical responses with clinical interventions and with specific outcomes, thus providing a novel methodology for data-driven clinical pathway learning.
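
One way to frame the task of recovering the most probable intervention sequence from laboratory observations is as a hidden Markov model decoded with the Viterbi algorithm. This section does not specify the model, so the framing, the state names, and every probability below are assumptions chosen for illustration only.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence and its probability."""
    # best[t][s] = (probability of the best path ending in s at step t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        row = {}
        for s in states:
            # Choose the predecessor that maximizes the path probability into s.
            prob, prev = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
            row[s] = (prob, prev)
        best.append(row)
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for row in reversed(best[1:]):
        path.append(row[path[-1]][1])
    path.reverse()
    return path, prob

# Hypothetical hidden interventions and observed laboratory trends.
states = ["monitor", "intensify"]
start_p = {"monitor": 0.7, "intensify": 0.3}
trans_p = {
    "monitor": {"monitor": 0.8, "intensify": 0.2},
    "intensify": {"monitor": 0.3, "intensify": 0.7},
}
emit_p = {
    "monitor": {"stable": 0.9, "declining": 0.1},
    "intensify": {"stable": 0.2, "declining": 0.8},
}

labs = ["stable", "declining", "declining"]
path, prob = viterbi(labs, states, start_p, trans_p, emit_p)
print(path, prob)
```

In practice the start, transition, and emission probabilities would be estimated from EHR data, possibly separately for each patient subgroup, rather than set by hand as they are here.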

Clustering of Patients
