NLP Can Extract First Seizure Onset Information From Patient Discharge Summaries


A novel natural language processing (NLP) pipeline may help expedite the time-consuming process of gleaning this information from summaries.

Researchers have developed a rule-based natural language processing (NLP) pipeline that can automatically extract the temporal information of a patient’s first seizure onset from epilepsy monitoring unit (EMU) discharge summaries.

Early onset of seizure activity can be a risk factor for sudden unexpected death in epilepsy (SUDEP). However, this information is often documented as clinical narrative in EMU summaries, and extracting it manually is time-consuming and labor-intensive, the researchers explain in AMIA Joint Summits on Translational Science.

Around 65 million people worldwide live with epilepsy, including 3.4 million in the United States. SUDEP is the most common cause of seizure-related death in these patients.

To better elucidate the mechanism of SUDEP, the National Institute of Neurological Disorders and Stroke funded the Center for SUDEP Research (CSR), which had enrolled 2739 patients from 7 sites as of March 2022.

“Around 10% (313 out of 3128) of the CSR EMU discharge summaries do not have first seizure onset date explicitly documented using a standard date format. Instead, such temporal information is embedded in clinical narratives,” the authors said.

To speed up the extraction of first seizure onset information from these summaries, the investigators developed their NLP pipeline by randomly selecting 300 discharge summaries to form a development set. Based on this set, they constructed a collection of 4 rules to “extract temporal expressions related to the first seizure onset from clinical free text.”

They then applied these rules to another 200 unseen summaries to evaluate the efficacy of the approach and compared the results with manual evaluations of the summaries carried out by a domain expert.
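To make the idea of a rule-based extractor concrete, here is a minimal Python sketch. It is not the authors' pipeline: the article does not list the 4 rules, so the two patterns below, along with the names `SEIZURE_EVENT`, `TEMPORAL`, `RULES`, and `extract_first_seizure_onset`, are hypothetical illustrations of how a seizure-related event might be paired with a nearby temporal expression inside a single sentence.

```python
import re

# Hypothetical patterns for illustration only; these are NOT the rules from the study.
SEIZURE_EVENT = (
    r"(?:first\s+seizure|seizures?\s+(?:began|started)"
    r"|seizure\s+onset|onset\s+of\s+seizures?)"
)
TEMPORAL = (
    r"(?:at\s+(?:the\s+)?age\s+(?:of\s+)?\d{1,2}"
    r"|at\s+\d{1,2}\s+(?:years?|months?)\s+of\s+age"
    r"|\d{1,2}\s+(?:years?|months?)\s+ago"
    r"|in\s+(?:19|20)\d{2})"
)

# Each rule links a seizure event to a temporal expression that appears
# within 60 non-period characters of it, in either order.
RULES = [
    re.compile(rf"{SEIZURE_EVENT}[^.]{{0,60}}?(?P<when>{TEMPORAL})", re.IGNORECASE),
    re.compile(rf"(?P<when>{TEMPORAL})[^.]{{0,60}}?{SEIZURE_EVENT}", re.IGNORECASE),
]


def split_sentences(text):
    """Naive sentence splitter: break on whitespace that follows ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_first_seizure_onset(text):
    """Return the first temporal expression tied to a seizure event, or None."""
    for sentence in split_sentences(text):
        for rule in RULES:
            match = rule.search(sentence)
            if match:
                return match.group("when")
    return None


print(extract_first_seizure_onset(
    "The patient had her first seizure at age 12 after a febrile illness."
))
print(extract_first_seizure_onset("She takes levetiracetam twice daily."))
```

Because this sketch matches one sentence at a time, it cannot link an event in one sentence to a date in the next. A real clinical system would also need a far richer inventory of temporal expressions and event terms than these few patterns.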

Researchers found their extraction pipeline had a precision of 0.75, recall of 0.651, and an F1-score of 0.697.

“This is an encouraging initial result, which will allow us to gain insights into potentially better-performing approaches,” they wrote.
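For readers less familiar with these metrics, the reported F1-score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short Python sketch below uses the standard definition; the function name `f1_score` is chosen here for illustration and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the article: precision 0.75, recall 0.651.
print(round(f1_score(0.75, 0.651), 3))  # 0.697
```

The result agrees with the published F1-score of 0.697, so the 3 reported figures are internally consistent.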

Machine learning and deep learning have been used for temporal relation extraction, while hybrid approaches have combined rule-based and machine learning components.

In the current analysis, the investigators applied their approach to 2 test sets. Although the method generally performed well, its performance differed between the sets. They hypothesized this could be due in part to “the difference between composition of the 2 sets in terms of the types of first seizure onset temporal expressions.”

During an analysis of failure cases, they found the rule-based approach could not identify correct seizure onset temporal expressions when the model did not recognize certain temporal expressions written in the summaries. In other cases, it did not recognize seizure-related events.

The authors also note that their rules can only be applied within sentences, and they add that an interesting future direction for the work could be to extract first seizure onset by using information spread across multiple sentences.

In the current study, researchers focused only on extracting the first seizure onset date. However, other temporal factors, like seizure duration and frequency, could be important when assessing SUDEP risk. In the future, they hope to perform a comprehensive study to extract all these temporal expressions from clinical narratives. They also plan to investigate automatically constructing patient seizure timelines based on discharge summaries and other data.

“To the best of our knowledge, this is the first instance where NLP has been applied to extract first seizure onset–related temporal expressions from clinical narratives,” the researchers concluded. “This work is also a first step towards automatically constructing patient timelines for seizure-related events from patient discharge summaries.”


Tao S, Abeysinghe R, De La Esperanza BT, Lhatoo S, Zhang G-Q, Cui L. Extracting temporal expressions of first seizure onset from epilepsy patient discharge summaries. AMIA Jt Summits Transl Sci Proc. Published online June 16, 2023.

© 2023 MJH Life Sciences
All rights reserved.