ChatGPT Effectively Answers Patient Questions on CRC Screening

Key Takeaways

  • ChatGPT-4o effectively answered CRC screening questions, with high ratings for accuracy, completeness, and comprehensibility from experts and patients.
  • The AI demonstrated consistent performance, with an 86.8% similarity in responses across sessions, indicating reliability in addressing patient queries.

Study findings show ChatGPT-4o provides accurate, complete, and trustworthy responses to patient-generated colorectal cancer (CRC) screening questions.

Artificial intelligence (AI) can help patients better understand cancer screening, as ChatGPT-4o consistently delivered accurate and understandable responses to real questions posed by individuals eligible for colorectal cancer (CRC) screening, according to one study.1 By evaluating AI-generated answers through both expert review and patient feedback, researchers found strong support for ChatGPT’s role in promoting informed participation in CRC screening programs.

The cohort study was published in Endoscopy International Open.

“In our study, despite a slight variability in responses, we found that ChatGPT-4o is still effective in providing accurate, complete, and understandable answers,” the researchers wrote. “Moreover, patients provided positive feedback about the completeness, comprehensibility, and trustworthiness of the responses, indicating their favorable perception of tool performance.”

AI tools have shown promise in reshaping how CRC is detected and understood, offering new avenues to improve early diagnosis and patient engagement.2 As highlighted in recent research presented at the ASCO Gastrointestinal Cancers Symposium, the C the Signs model identified individuals at high risk of CRC up to 5 years earlier than traditional methods by analyzing symptom patterns in electronic medical records. These findings demonstrate the growing potential of AI not just in predicting cancer risk, but also in empowering patients with accessible, timely, and trustworthy information to encourage earlier detection and intervention.

In this study, 10 consecutive individuals aged 50 to 69 years who were eligible for the Italian national CRC screening program but not currently participating were recruited.1 Each participant was presented with 4 standardized scenarios reflecting common concerns about CRC screening and asked to generate 1 question per scenario to seek further information. These patient-generated questions were submitted to ChatGPT-4o in 2 separate sessions to assess consistency and response quality. A panel of 5 senior experts in CRC screening independently evaluated each AI-generated response using a 5-point Likert scale across 3 domains: accuracy, completeness, and comprehensibility.
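
To illustrate the 2-session design, the sketch below shows how patient-generated questions might be submitted to the model in independent sessions, assuming the OpenAI Python SDK; the study does not describe its interface, so the model identifier, example question, and prompt handling here are illustrative only.

    # Minimal sketch of the 2-session design (assumed interface; the study's
    # actual setup is not described). Requires OPENAI_API_KEY in the environment.
    from openai import OpenAI

    client = OpenAI()

    def ask_in_fresh_session(question: str) -> str:
        """Send one question with no shared conversation history,
        so each call behaves like an independent session."""
        response = client.chat.completions.create(
            model="gpt-4o",  # assumed identifier for ChatGPT-4o
            messages=[{"role": "user", "content": question}],
        )
        return response.choices[0].message.content

    # Hypothetical patient question; the study's questions were posed in Italian
    questions = ["How reliable is the stool test used in CRC screening?"]
    session_1 = [ask_in_fresh_session(q) for q in questions]
    session_2 = [ask_in_fresh_session(q) for q in questions]  # repeated to test consistency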

Additionally, the same 10 participants who created the questions reviewed the responses and rated whether each was complete, understandable, and trustworthy using a dichotomous (yes/no) scale. All evaluations were conducted independently, with raters blinded to each other's assessments.

The expert panel rated ChatGPT-4o’s responses with mean (SD) scores of 4.1 (1.0) for accuracy, 4.2 (1.0) for completeness, and 4.3 (1.0) for comprehensibility, indicating overall high-quality performance across key evaluation domains. Patient assessments were similarly positive, with 97.5% of responses rated as complete, 95% as understandable, and 100% as trustworthy. Notably, the consistency of ChatGPT’s answers over time was confirmed by an 86.8% similarity between the 2 response sessions, suggesting reliable and reproducible performance when addressing patient-generated CRC screening questions.
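
The article does not specify how the 86.8% cross-session similarity was calculated. As one illustrative approach, the sketch below scores paired responses with Python's difflib.SequenceMatcher, which returns a 0-to-1 ratio of matching text; the paired responses shown are hypothetical stand-ins for the 2 ChatGPT-4o sessions.

    # Illustrative similarity scoring between two response sessions; the study's
    # actual metric is not reported in this article.
    from difflib import SequenceMatcher

    def response_similarity(a: str, b: str) -> float:
        """Proportion of matching text between two responses (0 to 1)."""
        return SequenceMatcher(None, a, b).ratio()

    # Hypothetical paired answers to the same question from 2 separate sessions
    session_1 = ["A stool test every 2 years is recommended for people aged 50 to 69."]
    session_2 = ["For people aged 50 to 69, a stool test is recommended every 2 years."]

    scores = [response_similarity(a, b) for a, b in zip(session_1, session_2)]
    print(f"Mean cross-session similarity: {sum(scores) / len(scores):.1%}")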

However, the researchers noted some limitations. First, the study evaluated only 1 large language model to maintain consistency with prior research, which limited comparisons with other tools. Second, the small sample size may have affected the strength of the findings. Finally, as the study was conducted entirely in Italian, results may not generalize to other languages or cultural contexts. Therefore, the researchers believe that broader studies are needed to confirm these findings across platforms and populations.

Despite these limitations, the researchers believe the findings suggest that ChatGPT performs well in answering CRC screening questions, even when used directly by patients.

“Nevertheless, it is important to emphasize that this technology is not intended to replace professional medical advice,” wrote the researchers. “Most patients require face-to-face interactions with health care providers to discuss their concerns and receive necessary explanations. In addition, consulting a doctor is always needed to address complex issues involving health conditions and medication management and to provide personalized health care solutions.”

References

1. Maida M, Mori Y, Fuccio L, et al. Exploring ChatGPT effectiveness in addressing direct patient queries on colorectal cancer screening. Endosc Int Open. 2025;13:a25689416. doi:10.1055/a-2568-9416

2. Steinzor P. Unlocking early colorectal cancer detection with artificial intelligence. AJMC®. January 23, 2025. Accessed May 16, 2025. https://www.ajmc.com/view/unlocking-early-colorectal-cancer-detection-with-artificial-intelligence
