
Simple Retrieval Step Boosts AI Accuracy in Assigning ICD Codes
Researchers found that even smaller open artificial intelligence (AI) models outperformed clinicians, supporting automation of International Classification of Diseases (ICD) coding.
A small adjustment to how
The study, published in
“Our previous study showed that even the most advanced AI could produce the wrong codes, sometimes nonsensical ones, when left to guess,” Eyal Klang, MD, study author, associate professor of medicine, and chief of generative AI at Mount Sinai’s Icahn School of Medicine, said in a
Testing AI Against Physicians
The team analyzed 500 emergency department (ED) visits across Mount Sinai hospitals, with each physician note processed by 9 different AI models.1
The models first generated a plain-language description of the diagnosis. Then, a retrieval system matched each description to 10 similar ICD entries drawn from a database of more than 1 million hospital records, including information on how often those diagnoses occur. The model then used that retrieved information to select the most accurate ICD code.
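The retrieve-then-select flow described above can be illustrated with a minimal sketch. It is not the study's implementation: the token-overlap similarity stands in for whatever retrieval system the researchers used, the `select_code` heuristic is a placeholder for the language model's final choice, and the 4-entry index with its frequency counts is invented for illustration. All names (`IcdEntry`, `TOY_INDEX`, `retrieve`, `select_code`) are hypothetical.

```python
import math
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class IcdEntry:
    code: str         # ICD-10-CM code
    description: str  # code description text
    frequency: int    # made-up count of how often the code appears in past records


# Hypothetical toy index; the study drew on more than 1 million hospital records.
TOY_INDEX = [
    IcdEntry("J18.9", "pneumonia, unspecified organism", 1200),
    IcdEntry("J20.9", "acute bronchitis, unspecified", 800),
    IcdEntry("R07.9", "chest pain, unspecified", 3000),
    IcdEntry("I21.9", "acute myocardial infarction, unspecified", 400),
]


def tokens(text: str) -> set[str]:
    """Lowercase word tokens of a description."""
    return set(re.findall(r"[a-z]+", text.lower()))


def similarity(a: str, b: str) -> float:
    """Jaccard token overlap, an assumed stand-in for the real retrieval score."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve(description: str, index: list[IcdEntry], k: int = 10) -> list[IcdEntry]:
    """Return up to k index entries most similar to the plain-language description."""
    ranked = sorted(
        index,
        key=lambda entry: similarity(description, entry.description),
        reverse=True,
    )
    return ranked[:k]


def select_code(description: str, candidates: list[IcdEntry]) -> str:
    """Placeholder for the model's final choice among the retrieved candidates.

    Assumption: similarity is weighted by log frequency so that common
    diagnoses win close calls. In the study, an AI model makes this choice.
    """
    def score(entry: IcdEntry) -> float:
        return similarity(description, entry.description) * math.log1p(entry.frequency)

    return max(candidates, key=score).code


if __name__ == "__main__":
    # Step 1 (not shown): a model turns the physician note into a description.
    description = "pneumonia with unspecified organism"
    # Step 2: look up similar ICD entries. Step 3: choose among them.
    candidates = retrieve(description, TOY_INDEX, k=10)
    print(select_code(description, candidates))
```

The point of the design is that the final choice is restricted to codes that actually exist in the index, which is one plausible reason a lookup step would reduce nonsensical codes.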
To validate performance, emergency physicians and 2 independent AI systems reviewed the coding results without knowing whether they came from clinicians or AI. Across the board, retrieval-enhanced models performed better than those without the extra step; in many cases, they surpassed physician-assigned codes.2 Notably, smaller open-source models performed nearly as well as larger commercial systems when given access to the lookup feature, a finding that could have important implications for affordability and scalability.
Cutting Administrative Burden
In the US, physicians spend hours each week entering ICD codes into electronic health records (EHRs) for purposes ranging from clinical documentation to reimbursement. According to one study, this can take up
“This is about smarter support, not automation for automation’s sake,” said Girish N. Nadkarni, MD, MPH, study author and professor of medicine at Mount Sinai’s Icahn School of Medicine. Nadkarni also serves as chair of the Windreich Department of Artificial Intelligence and Human Health, director of the Hasso Plattner Institute for Digital Health, and chief AI officer at Mount Sinai. “If we can cut the time our physicians spend on coding, reduce billing errors, and improve the quality of our data, all with an affordable and transparent system, that’s a big win for patients and providers alike.”
The authors emphasized that the retrieval-enhanced AI is intended to support human oversight, not replace it. The study tested only primary diagnosis codes for patients discharged from the ED, and the system is not yet approved for billing. However, the researchers noted that immediate applications could include suggesting codes within EHRs or flagging potential mistakes before bills are submitted.
Mount Sinai is currently piloting the retrieval method within its EHR system, with plans to expand beyond primary codes to include secondary and procedural coding. According to David L. Reich, MD, chief clinical officer of the Mount Sinai Health System and president of The Mount Sinai Hospital, the “big picture” is how this AI strategy can transform patient care by reducing administrative burden.
“Using AI in this way improves our ability to provide attentive and compassionate care by spending more time with patients,” he said. “This strengthens the foundation of hospitals and health systems everywhere.”
References
- Klang E, Tessler I, Apakama DU, et al. Assessing retrieval-augmented large language models for medical coding. NEJM AI. 2025;2(10). doi:10.1056/AIcs2401161
- Adding a lookup step makes AI better at assigning medical diagnosis codes. News release. Mount Sinai Health System. September 25, 2025. Accessed September 25, 2025. https://www.newswise.com/articles/adding-a-lookup-step-makes-ai-better-at-assigning-medical-diagnosis-codes
- Payerchin R. Physicians spend 4.5 hours a day on electronic health records. Medical Economics®. April 21, 2022. Accessed September 25, 2025. https://www.medicaleconomics.com/view/physicians-spend-4-5-hours-a-day-on-electronic-health-records