Innovations in Chronic Care Delivery Using Data-Driven Clinical Pathways
Yiye Zhang, MS; and Rema Padman, PhD
According to the World Health Organization, 60% of all deaths, worldwide, can be attributed to chronic diseases such as diabetes, heart disease, stroke, and cancer; they are also a major cause of poverty and lack of economic development.1 As part of a multi-pronged effort to address this challenge, innovations in chronic care delivery are beginning to leverage advanced statistical and machine learning models and algorithms to obtain new insights into care quality, outcomes, and cost.2-4 Machine learning is the science of constructing algorithms that learn from large volumes of data in order to facilitate decision making by generating potentially new insights; it has gained widespread implementation across many industries today.5 Just a few examples of machine learning applications are speech recognition, self-driving cars, and personalized online experiences.6-8
Although innovations driven by machine learning have seen tremendous success,9,10 subsequently resulting in improved service performance, productivity, and growth,11-13 for a variety of reasons, the healthcare industry has been relatively slow to incorporate these techniques into decision-support applications and to adapt to resulting changes.14-16 For instance, in making treatment decisions, many clinicians may prefer to use clinical practice guidelines (CPGs) over predictions generated by machine learning algorithms—algorithms which may seem like a “black box” with little relevance to actual clinical decision making.17 However, many of the current clinical decision support capabilities, whether CPG-embedded electronic health record (EHR) interactivity or computerized provider order entry (CPOE) application, are designed by humans and target the “average patient.”
As the Precision Medicine initiative states,18 we are now in an era in which clinical interventions need to be personalized and predictive, and so should decision support recommendations. To meet this objective, it is no longer sufficient to rely on CPGs, often created based on consensus opinions or randomized clinical trials that have strict enrollment criteria. Rather, with the tremendous amount of data being accumulated in EHRs from the enactment of the Health Information Technology for Economic and Clinical Health (HITECH) Act as part of the American Recovery and Reinvestment Act,19 healthcare service delivery can also benefit greatly from advanced statistical and machine learning models and algorithms that can learn potentially useful insights from large amounts of highly detailed data collected daily, as part of routine care delivered in multiple, diverse settings.
Traditional topics in machine learning include classification and unsupervised learning.5 Classification refers to assigning labels (values of a target variable) to unlabeled data using a model trained on labeled data. Logistic regression and naïve Bayes are examples of classification algorithms.5 For example, Lee et al used logistic regression to predict 7-day mortality from heart failure in emergent care using initial vital signs, clinical and presentation features, and laboratory tests.20 Unsupervised learning refers to the identification of latent groups in the data. Unlike classification, which is a form of “supervised learning,” unsupervised learning typically does not have true labels, and users often need to predefine the number of latent groups. K-means and hierarchical clustering are 2 of the most common unsupervised learning algorithms.5
Zhang et al used a variant of the K-means clustering algorithm to design more efficient order sets from historical order data in a pediatric inpatient setting.21 Order sets are groups of relevant orders traditionally clustered together by clinical experts and used within CPOE; this is an example of a manually designed healthcare information technology application that requires significant labor- and knowledge-intensive effort for maintenance and update. In the same study, Zhang et al demonstrated that order sets can be created using machine learning algorithms, with the resulting data-driven order sets requiring less physical and cognitive workload in usage because the methods were trained to find the optimal combinations of orders that matched order data generated from actual work flow. In addition to these classical approaches, many advanced machine learning algorithms have been developed and applied over the years to facilitate a more efficient, safer healthcare system.22-25
In this paper, we present a machine learning approach for learning the most probable, data-driven clinical pathways from the EHR data of patients with chronic kidney disease (CKD), and predicting the most probable upcoming interventions at any stage, given recent history. CKD is a chronic condition that currently affects more than 26 million US adults, with an additional 73 million at increased risk for the disease.26 It is also associated with increased risk for cardiovascular disease and acute kidney injury (AKI), and the majority of the patients also suffer from comorbidities such as hypertension and diabetes.26 Consequently, CKD management is complex and expensive, and a large proportion of the US Medicare budget every year is allocated for the treatment of CKD.27 Specifically, the per person per year average cost of treating CKD was $23,128 in 2011—more than twice the average cost of treating non-CKD conditions in the Medicare population ($11,103).27 With the cost increasing and quality of life decreasing as the disease progresses to end-stage renal disease (ESRD),27 there is a growing imperative to pursue innovations in service delivery and management of CKD and other chronic conditions that may generate improved health outcomes, cost savings, and patient satisfaction.4
Additionally, generating the highest quality scientific evidence and associated practice recommendations for chronic conditions such as CKD is a continuing challenge for the healthcare field.3 One of the most recent CPGs for CKD was published by the National Kidney Foundation’s Kidney Disease Outcomes Quality Initiative in 2012, which is an update of its 2007 guideline. However, of its 7 key recommendations, only 2 recommendations received the highest grade from the Evidence Review Team of the guideline Work Group for strength of recommendation (“recommend” vs “suggest”), and the highest grade for quality of evidence (“high” vs “moderate,” “low,” “insufficient”), while other recommendations received lower grades for strength of recommendations and for the quality of evidence.28
In this paper, we propose that evidence from actual practices, particularly those that include large numbers of patients in local treatment settings over reasonable durations, may be used to assist guideline development. We present methods for knowledge extraction from data using machine learning algorithms, and demonstrate that such knowledge can be regarded as practice-based, data-driven clinical pathways. Clinical pathways translate CPG recommendations into actionable plans, such as flow charts, and are used by more than 80% of US hospitals for at least 1 intervention.29 This research aims to develop clinical pathways based not strictly on CPGs, but on practice-based evidence learned from data. An overall framework of our approach that supports a learning healthcare system is presented in Figure 1.
METHODS
Prior Work
Data-driven clinical pathway learning has garnered research interest since the 1990s,30-38 but there is limited research on machine learning approaches for the problem. Recently, Lakshmanan et al used a type of clustering algorithm, called DBScan, to cluster patients’ history prior to pathway learning, and applied SPAM, an algorithm to find frequent patterns in pathways, to associate patterns with patient outcomes.33 Huang et al used a topic model, a recently developed probabilistic method for learning latent topics from documents, to discover clinical pathway patterns from EHR event logs.38 Zhang et al modeled clinical pathways as Markov chains that included the co-progression of multiple interventions and diagnoses, and visualized them to allow identification of variations in care and outcomes across latent patient subgroups.39
In this paper, we combine clustering and temporal modeling to elicit common clinical pathways from the data. Specifically, given patient characteristics and a sequence of laboratory observations from multiple laboratory tests, we illustrate methods to learn the most probable sequence of clinical interventions that are associated with the laboratory observations, and to make predictions about patients’ impending conditions as a result of the interventions. This approach allows us to link patients’ biochemical responses with clinical interventions and with specific outcomes, thus providing a novel methodology for data-driven clinical pathway learning.
Clustering of Patients
To accommodate the heterogeneity in the patient population and improve model accuracy, we group patients according to similarity of their clinical history prior to pathway learning and prediction. We expect patients’ pathways to branch out as their health conditions and corresponding treatments evolve in different ways. Therefore, prior to pathway learning and prediction, we use hierarchical clustering to cluster patients’ pathways into subgroups according to the longest common subsequence (LCS) distance measure.40 The LCS of 2 sequences is the longest sequence of items that appears in both in the same order of occurrence, although the items need not be adjacent in either sequence. LCS has been widely applied in biomedical research as a similarity measure used in trajectory analysis and protein sequence analysis.40 The distance measure, dLCS, is then computed as the difference between the sum of the lengths of the 2 sequences and twice the length of their LCS. (Details are in the eAppendix, available at www.ajmc.com.) Hence, dLCS is affected by the length of the identified subsequence and by the lengths of both sequences; for example, given the same length of LCS, dLCS is bigger for 2 long sequences than for 2 short sequences. Therefore, clustering using dLCS allows us to group patients who not only share similarity in clinical interventions, but also have similar durations of treatment. The optimal number of clusters is determined using the silhouette value, a measure commonly used in cluster analysis.41 In this study, we consider clusters that have 10 or fewer patients as outliers, and plan to evaluate rare events and exceptions in future research.
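As an illustration of the distance measure described above, the following is a minimal sketch of dLCS in Python, assuming each pathway is a list of state labels. The function names and the labels "A" through "Q" are hypothetical and are not taken from the study; the eAppendix remains the authoritative description.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence (standard dynamic program)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the best subsequence that ends before both items.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop the last item of one sequence or the other.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def d_lcs(a: Sequence[str], b: Sequence[str]) -> int:
    """dLCS = len(a) + len(b) - 2 * |LCS(a, b)|, as defined in the text."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


# Hypothetical pathways: both pairs share the subsequence A, C, D.
short_1, short_2 = ["A", "B", "C", "D"], ["A", "C", "D", "E"]
long_1 = ["A", "B", "C", "D", "X", "Y"]
long_2 = ["A", "C", "D", "E", "P", "Q"]

print(d_lcs(short_1, short_2))  # 2
print(d_lcs(long_1, long_2))    # 6
```

With the same LCS length of 3, the longer pair is farther apart, which is the property that lets dLCS separate patients by duration of treatment as well as by shared interventions.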
Model
Figure 2 illustrates our modeling scheme for learning the clinical pathways. Given the time stamps associated with intervention data recorded in the EHR, we assume that each state in the data-driven clinical pathway is separated by at least 1 time unit (eg, day, week, month), and that each state may contain more than 1 type of intervention. For example, it is typical for a CKD patient to have a follow-up visit in the clinician’s office, receive medication prescriptions, and have diagnostic codes assigned to the visit. Our data encoding anticipates such multidimensional and longitudinal features in the data. We assign a unique label for each unique combination of interventions occurring from a visit on the same day, such that patients’ clinical interventions that span multiple categories, such as diagnosis, medication prescription, and encounter type, can be transformed into 1-dimensional pathways, as shown in the top row in Figure 2. Naturally, these interventions are related to one another over time in varying degrees. For instance, interventions that occurred within 6 months of each other may be more strongly correlated than those that occurred within 2 years of each other.
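The encoding step can be sketched as follows. This is an illustrative example only: the record layout (patient, ISO date, category, item), the label format "S1", "S2", and the sample events are assumptions made for the sketch, not the study's actual data schema.

```python
from collections import defaultdict


def encode_pathways(events):
    """Collapse same-day interventions into one label per visit.

    events: iterable of (patient_id, iso_date, category, item) tuples
    (a hypothetical layout). Returns (pathways, codebook), where pathways
    maps each patient to a chronological list of labels and codebook maps
    each unique combination of interventions to its label.
    """
    visits = defaultdict(set)
    for patient, day, category, item in events:
        visits[(patient, day)].add((category, item))

    codebook = {}
    pathways = defaultdict(list)
    # ISO dates sort chronologically as plain strings.
    for patient, day in sorted(visits):
        combo = frozenset(visits[(patient, day)])
        label = codebook.setdefault(combo, f"S{len(codebook) + 1}")
        pathways[patient].append(label)
    return dict(pathways), codebook


# Hypothetical records for two patients.
events = [
    ("p1", "2010-01-05", "encounter", "office visit"),
    ("p1", "2010-01-05", "diagnosis", "CKD stage 3"),
    ("p1", "2010-04-02", "encounter", "office visit"),
    ("p1", "2010-04-02", "medication", "diuretic"),
    ("p2", "2011-02-01", "diagnosis", "CKD stage 3"),
    ("p2", "2011-02-01", "encounter", "office visit"),
]
pathways, codebook = encode_pathways(events)
print(pathways)  # {'p1': ['S1', 'S2'], 'p2': ['S1']}
```

Because the combination is stored as an unordered set, two visits with the same interventions receive the same label regardless of the order in which the records were entered.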
In the context of CKD management, we assume that interventions at visit t+2 are dependent on activities at visit t+1 and t, as shown in the middle row in Figure 2. For analytical tractability, and reflecting actual practice in the management of many health conditions, the time intervals between 2 consecutive visits are categorized as: 1) less than 3 months, 2) greater than 3 but fewer than 6 months, or 3) at least 6 months. These assumptions are practice- and condition-specific,3 but can be readily modified for different settings. Patients’ biochemical conditions, as reflected by their laboratory observations, are assumed to be influenced by the interventions, as shown in the bottom row in Figure 2. For the problem of clinical pathway learning described in this study, our goal is to learn the most probable sequence of clinical interventions given to patients with a particular trajectory of biochemical responses. Similarly, the prediction problem is to infer the most probable imminent interventions in the next state—most importantly, diagnostic codes—for these patients.
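A minimal sketch of the interval categorization is given below. The function name and the use of whole elapsed calendar months are assumptions; in particular, the text does not say where a gap of exactly 3 months falls, and the sketch places it in the middle category.

```python
from datetime import date


def gap_category(prev_visit: date, next_visit: date) -> int:
    """Return 1, 2, or 3 for the time between 2 consecutive visits.

    1: fewer than 3 months; 2: 3 to fewer than 6 months; 3: at least 6 months.
    Assumption: a gap of exactly 3 months is assigned to category 2.
    """
    months = (next_visit.year - prev_visit.year) * 12 + (
        next_visit.month - prev_visit.month
    )
    if next_visit.day < prev_visit.day:
        months -= 1  # the final calendar month has not fully elapsed
    if months < 3:
        return 1
    if months < 6:
        return 2
    return 3


print(gap_category(date(2010, 1, 5), date(2010, 3, 1)))  # 1
print(gap_category(date(2010, 1, 5), date(2010, 4, 5)))  # 2
print(gap_category(date(2010, 1, 5), date(2010, 7, 5)))  # 3
```

The thresholds are ordinary constants, so a practice with a different follow-up schedule could change them without touching the rest of the encoding.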
We model this treatment process as a hidden Markov model (HMM). An HMM is a statistical model with a wide range of applications, such as speech recognition and RNA sequence analysis.42 It is defined by 5 elements: sequence of hidden states, sequence of observations, state transition probability distribution, observation probability distribution, and initial state distribution.43 An HMM is used to represent a process in which a sequence of observations is generated, and each observation is triggered by an underlying process that is hidden to us. For example, given a sequence of a patient’s body temperatures, we may assume that the patient’s health condition is affecting his or her body temperature. Therefore, the sequence of body temperatures forms the observations in the HMM, and health conditions represent the HMM’s sequence of hidden states.
The sequence of hidden states in an HMM has a first-order Markov property, which states that the current state only depends on the previous state.44 Therefore, we regard the middle row in Figure 2 as the sequence of hidden states and the bottom row as the sequence of observations. Parameters of the HMM, such as transition probabilities of hidden states in the Markov chain, are learned from the data using the expectation-maximization (EM) algorithm.43 Given HMM parameters, we can perform both the clinical pathway learning and prediction tasks through HMM decoding, which calculates the sequence of hidden states with the highest probability given the sequence of observations and the probability distribution of the model. Details of the model and algorithm are described further in the eAppendix and prior studies.39
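HMM decoding of this kind is commonly carried out with the Viterbi algorithm. The sketch below is a generic first-order implementation with hand-set toy parameters, not the study's fitted model: the state names ("monitor", "treat"), the observation names ("stable", "declining"), and all probabilities are hypothetical, and in practice the parameters would first be estimated with EM.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most probable hidden-state sequence for the given observations."""

    def log(p):
        # Log space avoids numerical underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor of state s at this step.
            prev = max(states, key=lambda r: best[-1][r] + log(trans_p[r][s]))
            scores[s] = best[-1][prev] + log(trans_p[prev][s]) + log(emit_p[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, math.exp(best[-1][last])


# Toy parameters (hypothetical): hidden states stand in for intervention
# labels, observations stand in for laboratory patterns.
states = ["monitor", "treat"]
start_p = {"monitor": 0.6, "treat": 0.4}
trans_p = {
    "monitor": {"monitor": 0.7, "treat": 0.3},
    "treat": {"monitor": 0.4, "treat": 0.6},
}
emit_p = {
    "monitor": {"stable": 0.8, "declining": 0.2},
    "treat": {"stable": 0.3, "declining": 0.7},
}

path, prob = viterbi(["stable", "declining", "declining"],
                     states, start_p, trans_p, emit_p)
print(path)  # ['monitor', 'treat', 'treat']
```

In this toy run, the decoded path is the sequence of interventions that best explains the three laboratory observations, which mirrors the pathway learning task described in the text.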
RESULTS
Descriptive Statistics
We demonstrate the methodology using a real-world data set, extracted from the EHR, of 664 patients who suffered from CKD and associated complications and had visits from 2009 to 2013. The gender ratio is nearly equal. Over 67% of the patients are aged at least 70 years, and nearly 95% are Caucasian. Components considered as part of clinical pathways and the number of unique patients who had each component in their EHR are listed in Table 1. These components were selected for their relevance in CKD management, per consultation with clinicians, but can be extended to include additional details. All 664 patients had initial diagnoses of CKD stage 3 and hypertension, but not diabetes, and none of the patients had anemia or hyperparathyroidism initially. These patients either progressed to advanced CKD stages and ESRD, or improved to CKD stages 1 and 2. Most of them subsequently developed some of the complications listed in Table 1.
Clustering of Patients
The number of clusters, k, was determined to be 7 using the highest silhouette value (0.189) from hierarchical clustering. Table 2 describes the characteristics of each group in detail, indicating that hierarchical clustering using dLCS was able to divide patients into subgroups that differ on treatment frequency, duration, and outcome at the end of the study period. For example, 95% of the patients in subgroup 5 showed improvement in their conditions at the end of the study period, while none worsened, after being in the clinic for an average of 26.9 months. Subgroup 3 is the largest subgroup, and it also has the smallest average dLCS, suggesting that patients are more similar to one another compared with other subgroups. Subgroup 2, which needs to be investigated further, had a mixture, with 14% of patients who improved and 20% who worsened. The final column in Table 2 lists complications of CKD that the majority of patients suffer from in each group.
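For readers unfamiliar with the silhouette value, the following sketch computes the mean silhouette from a precomputed distance matrix, such as a matrix of pairwise dLCS values, so that candidate cluster assignments can be compared. The 4-point distance matrix and both labelings are made up for illustration, and the function assumes at least 2 clusters.

```python
def silhouette_score(dist, labels):
    """Mean silhouette over all points, given pairwise distances.

    dist: square matrix (list of lists) of pairwise distances.
    labels: cluster label for each point; at least 2 clusters are assumed.
    """
    clusters = {}
    for i, c in enumerate(labels):
        clusters.setdefault(c, []).append(i)

    total = 0.0
    for i, own_label in enumerate(labels):
        own = clusters[own_label]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for c, members in clusters.items()
            if c != own_label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)


# Hypothetical distances: points 0 and 1 are close, as are points 2 and 3.
dist = [
    [0, 1, 5, 5],
    [1, 0, 5, 5],
    [5, 5, 0, 1],
    [5, 5, 1, 0],
]
good = silhouette_score(dist, [0, 0, 1, 1])
bad = silhouette_score(dist, [0, 1, 0, 1])
print(round(good, 3), round(bad, 3))  # 0.8 -0.4
```

Values near 1 indicate compact, well-separated clusters, and values near 0 indicate overlapping clusters, so the reported maximum of 0.189 suggests that the subgroups overlap considerably.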
Clinical Pathway Learning and Prediction
Table 3 summarizes the accuracies associated with predicting the imminent interventions and diagnoses, such as prescription of diuretics and episodes of AKI, and learning the most probable pathways for sample subgroups 3, 4, and 5. We chose these 3 subgroups because of their larger subgroup sizes, and interesting final outcomes at the end of the study. We tested the accuracies using the most common sequence of laboratory observations (LOs) from 3 consecutive visits, and the number of patients who experienced such patterns is listed under the column, “Number of patients who had LOs.” Training and testing were performed through a variant of the leave-one-out cross-validation method.45 Learning and prediction were done with respect to the most common sequence of LOs in each subgroup. It is interesting to note that the common biochemical patterns in subgroups 3 and 4 are the same, but the model identified different clinical pathways for these 2 groups, which require further examination. “Pathway with time”/“Pathway without time” measure accuracy of learning the entire pathway, including/not including the actual time duration between 2 visits, respectively. Similarly, “Future visit with time”/“Future visit without time” measure the prediction accuracy for patient’s future interventions, with/without time durations between visits. Each state variable contains information on the presence or absence of 3 encounter types, 19 diagnoses, and 4 drug classes, in addition to 3 different durations between visits. Therefore, the probability of accurate learning and prediction, on a random try, is extremely low compared with the results from our algorithm.
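The size of the state space can be made concrete with a rough calculation. It assumes, for illustration only, that each of the 26 components is an independent present/absent indicator, that every combination is possible, and that a random guess is uniform over all combinations; the symbols below are introduced here and do not appear in the study.

```latex
\[
N_{\max} = 2^{3+19+4} \times 3 = 2^{26} \times 3 = 201{,}326{,}592,
\qquad
p_{\text{guess}} = \frac{1}{N_{\max}} \approx 5.0 \times 10^{-9}.
\]
```

Many of these combinations would never occur in practice, so the figure is an upper bound on the number of states rather than an estimate of it, but it indicates why chance agreement on even a single visit is negligible.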
We also examined the false negative and false positive rates in the prediction of an imminent condition such as AKI. We define a false negative to be a case where patients’ CKD stages are worse than predicted, or patients developed AKI, which our methods failed to predict. A false positive is defined as patients’ CKD stages being better than predicted, or prediction of AKI when no AKI developed in reality. We include AKI in this analysis because it is a serious adverse outcome: it often requires hospitalization and can be fatal.46 We were able to obtain false-positive and false-negative rates that are as low as 0%, although this result needs to be validated using a much larger sample. Nevertheless, the learning and prediction algorithms show promise in identifying common pathways of treatments, but these need to be analyzed further to better delineate effective interventions in the various subgroups.
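The two definitions above can be expressed as a short sketch. The `Outcome` record, the integer encoding of CKD stage (a higher number is worse), and the choice of all predictions as the denominator are assumptions made for illustration; the text does not specify the denominator.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    ckd_stage: int  # hypothetical encoding: higher number = worse stage
    aki: bool       # whether an AKI episode occurred (or was predicted)


def error_rates(predicted, actual):
    """Return (false_negative_rate, false_positive_rate) over all cases."""
    fn = fp = 0
    for p, a in zip(predicted, actual):
        # False negative: stage worse than predicted, or an unpredicted AKI.
        if a.ckd_stage > p.ckd_stage or (a.aki and not p.aki):
            fn += 1
        # False positive: stage better than predicted, or AKI predicted
        # when none developed.
        if a.ckd_stage < p.ckd_stage or (p.aki and not a.aki):
            fp += 1
    n = len(predicted)
    return fn / n, fp / n


# Hypothetical predictions and observed outcomes for 4 patients.
predicted = [Outcome(3, False), Outcome(3, True), Outcome(4, False), Outcome(3, False)]
actual = [Outcome(3, False), Outcome(3, False), Outcome(5, False), Outcome(3, True)]
fn_rate, fp_rate = error_rates(predicted, actual)
print(fn_rate, fp_rate)  # 0.5 0.25
```

Under these definitions, a single case can count as both a false negative and a false positive, for example when the CKD stage is worse than predicted but a predicted AKI does not occur.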
This paper provides a brief overview of machine learning approaches to assist medical decision making, and introduces a methodology, as well as an application that illustrates the development of data-driven clinical pathways through mining of EHR data. This approach may facilitate timely extraction of potential new evidence that could become the basis for new clinical trials, and may also serve as “shared baselines” to be used within a local practice for work flow and population health management.47 Patient-focused applications derived from our research, particularly those that visualize the clinical pathway and provide related patient-oriented recommendations and educational resources, may enhance patients’ understanding of their diseases and treatments, thus facilitating shared decision making.
An important ongoing study is to develop prediction models for other significant outcomes of interest in the management of CKD and its complications. Also, we need to evaluate these data-driven clinical pathways, especially their divergence and rare events, and their predictions with input from clinical professionals. As a growing number of healthcare organizations pilot new care delivery and payment models, such as the accountable care organizations,48 exploring disease trajectories that incorporate the interactions of clinical interventions and their associated outcomes may also provide useful insights on the cost effectiveness of treatments, which organizations can leverage for implementing innovative care delivery practices.
A crucial prerequisite for success in the application of advanced machine learning methods to healthcare delivery is data quality. It is not uncommon for computational scientists to spend significant effort in cleaning EHR data before analysis. In addition, even after months of processing, there are often still missing data and errors, some arising from the mismatch between actual work flows and process assumptions, subjecting the analytical results to bias. Such inefficiency can be minimized by careful observation and understanding of the care delivery context, and planning of the data storage with a range of options available depending on the data size.49 At the same time, methods have been developed, such as imputation and approximate inference algorithms, that can accommodate missing data. For example, in this paper, we used the EM algorithm to infer the parameters of HMM. Furthermore, diversity is innate to most healthcare data, and we found it to be one of the biggest challenges in accurately inferring clinical pathways, requiring large amounts of data and robust methods for analysis and inference. In this paper, we examined encounter type, diagnosis, medication prescriptions, and biochemical measurements, but our data representation is flexible with regard to the number of clinical factors of interest. Therefore, when sufficient curated data becomes available, factors such as medical expenses and behavioral information can also be incorporated to enrich the learned pathways and personalized predictions of health and cost outcomes.
This paper presents additional promising evidence of the potential of machine learning applications for clinical decision making. We develop and demonstrate a methodology to facilitate more targeted management of patients with complex chronic conditions using data-driven clinical pathways. Clinical pathways are learned from a healthcare organization’s EHR data by summarizing multidimensional clinical history as chronologically organized sequences, capturing information on the co-progression of encounter types, diagnoses, medications, and biochemical measurements. Further, we link clinical pathways to a few outcomes within subgroups of patients with reasonable accuracy using hierarchical clustering and HMM. Applying our methodology to relevant EHR data on 664 patients with CKD stage 3 and hypertension, we identify clinical pathways that may be compared with current CPG recommendations in future studies, and contribute to the development of shared baselines within hospitals. These methods and broad findings from EHR data are generalizable and can be adapted to other clinical conditions to support efficient review of treatments and outcomes and to aid clinical professionals and patients in making more informed treatment and management decisions.
The authors are very grateful to the forward-thinking physicians and staff of the community nephrology practice, Teredesai, McCann & Associates, PC, in Western Pennsylvania, who generously provided detailed, de-identified data from their 20-year electronic health record for this study. We particularly thank Pradip Teredesai, MD, FACP; Qizhi Xie, MD, PhD; Nirav Patel, MD; and staff members Linda Smith and Audra Barletta, who gave us important clinical and technical information about the data and the key characteristics of CKD, AKI, and their treatments. This study was designated as Exempt by the Institutional Review Board at Carnegie Mellon University.