
Challenges in Implementing AI in Sleep Medicine Draw Focus at Sleep Conference
Key Takeaways
- Risk is lowest with manual PSG scoring and clinician adjudication, increases with AI-scored PSG plus clinician oversight, and peaks when AI independently diagnoses and treats sleep disorders.
- Dataset shift and uncertain PSG ground truth complicate validation, and models trained in high-income settings may underperform in low-resource environments, amplifying misdiagnosis and care gaps.
The use of artificial intelligence (AI) in sleep medicine carries practical and training pitfalls that could affect patient care.
The use of artificial intelligence (AI) in medicine has been increasing, as clinicians are able to use it to identify patterns in testing or to identify anomalies. However, as use of AI increases, clinicians should be wary of potential pitfalls that come with its use. A session held during the
Ethical, Practical Consequences of Deployment of AI
Aatif Husain, MD, MBA, director of the neurodiagnostic lab/EP-IOM lab in the Department of Neurology at Duke University School of Medicine, spoke with the gathered audience about the safe implementation of AI in sleep medicine. When it comes to implementing AI in low-resource settings in particular, there are challenges that come with its use.
“When new AI instruments come out, there’s always a sense that this is really valuable, this is fantastic, it saves us so much time,” said Husain. “That’s what mindset I want you to have as we talk about the issues and the pitfalls that we’re going to encounter.”
Although it is tempting to democratize these tools so that services can reach places that do not have them now, Husain warned against this line of thinking, as experts still need to evaluate the models used and confirm their findings based on their own knowledge. Having AI score polysomnography (PSG) can be both a dream and a nightmare, he said.
Risk for low- and middle-income countries is lowest when no AI is involved, PSG testing is scored manually, and humans make the final decision. Risk is moderate when AI scores the PSG and a human makes the final decision, and it is highest when AI independently diagnoses and treats sleep disorders.
Infrastructure is also a challenge to implementing these AI models, with Husain pointing out that data centers that power these AI models are becoming issues in the US and could be an even bigger problem in low- and middle-income countries that have fewer electric resources. The computer hardware may also not be available in these communities. “You can start this as a project, as a grant, to bring this in; the background is going to run out. Do the governments in these countries have the sustainability? Do they have the finances to sustain these projects long term?” Husain questioned.
Lastly, Husain pointed to an erosion of empathy, as care is more than simply giving a diagnosis. Having AI as a middleman in the diagnostic process could erode the patient's trust in the provider, especially if the primary care provider cannot interpret the AI results accurately. Husain also noted the potential shrinking of the workforce, as marketing AI as a means of solving specialist shortages can lead to overreliance on the tool and the
“What we must worry about is adding lack of data from [low- and middle-income countries] that make up the algorithms that help teach the algorithms. Because, if that happens, and we have got inaccurate AI algorithms that will lead to misdiagnosis in these areas, the perceived AI value will lead to false efficiencies and fewer jobs,” Husain concluded. “…And in fact may worsen health outcomes and worsen the care gaps in low- and middle-income countries rather than bridging those gaps with AI.”
Deskilling Due to AI Could Be a Challenge Moving Forward
Margarita Oks, MD, FAACP, FAASM, an associate professor of medicine at the Donald and Barbara Zucker School of Medicine at Hofstra Northwell, also warned against overreliance on AI, specifically focusing on trainees as they are educated in fellowship and residency.
“Deskilling is probably one of the main issues that we have, and it’s the gradual erosion of critical thinking over time, because we’re relying on AI,” she explained. “This leads to a huge detriment in our ability to diagnose, to problem solve, to recognize patterns that are essential for the appropriate patient care, and also for us to comply with known standards of care.”
Reliance on AI can also keep trainees from ever developing skills, and an inability to assess AI output can lead to eventual acceptance of AI hallucinations as fact. Oks presented a study that found that gastroenterologists who started performing colonoscopies with AI assistance had a
AI, Oks said, is not perfect. It has several limitations when implemented in clinical care, including handling edge cases and poor-quality signals. The biggest threat to sleep medicine is residents and other trainees relying on AI instead of learning, for example by having AI score the 25 PSGs required of all doctors entering sleep medicine over the 1-year sleep fellowship. Relying on AI can also deskill those who have already learned, as their skills are no longer being built on and misdiagnoses become possible.
“Just like we ourselves have to learn the language of AI…it is on us to ensure that our graduates, the people that we’re having in front of us, are able to use this technology in a responsible way,” said Oks.
Teaching residents how to critically appraise AI makes that appraisal second nature and makes it easier to maintain a healthy relationship with the technology, said Oks. She also suggested adding AI to the curriculum so that residents and trainees train with it, just as they would with any new guidelines that come out.
“We have to start looking at AI as not an answer but as our copilot…as an assistant, as an entity that can help us but will not lead us,” Oks concluded.
References
- Cestonaro C, Delicati A, Marcante B, Caenazzo L, Tozzo P. Defining medical liability when artificial intelligence is applied on diagnostic algorithms: a systematic review. Front Med (Lausanne). 2023;10:1305756. doi:10.3389/fmed.2023.1305756
- Musacchio N, Giancaterini A, Guaita G, et al. Artificial intelligence and big data in diabetes care: a position statement of the Italian Association of Medical Diabetologists. J Med Internet Res. 2020;22(6):e16922. doi:10.2196/16922
- Budzyń K, Romańczyk M, Kitala D, et al. Endoscopist deskilling risk after exposure to artificial intelligence in colonoscopy: a multicentre, observational study. Lancet Gastroenterol Hepatol. 2025;10(10):896-903. doi:10.1016/s2468-1253(25)00133-5




