Public Health

Comprehensive Summary

This study investigates the use of large language model (LLM)-based clinical decision support systems (CDSS) to detect prescribing errors and improve medication safety in acute care settings. The researchers created forty complex fictional patient case vignettes containing 91 prescribing errors and drug-related problems (DRPs). Pharmacists of varying experience levels reviewed these cases under three conditions: pharmacist alone, pharmacist with LLM support in a co-pilot configuration, and LLM alone; the researchers then assessed the LLMs' accuracy, precision, recall, and reproducibility. The Claude 3.5 Sonnet model outperformed the other LLMs tested (Gemini Flash, OpenAI’s GPT-4, Mistral, and Meta’s LLaMa) in recall, precision, F1 score, and overall accuracy, and was therefore chosen for implementation in the LLM-CDSS. In co-pilot mode, pharmacists supported by the LLM-CDSS showed a 32.6% improvement in accuracy compared to pharmacists working alone, along with higher recall, precision, and F1 scores. The co-pilot setup also identified 66% of DRPs with potential for serious harm, compared to 52% by LLMs alone. However, the model’s performance declined when detecting inappropriate dosage regimens, suggesting that current LLMs may struggle with region-specific prescribing nuances. The study highlights that LLM-based CDSSs can be effectively integrated into human clinical decision-making, especially when used collaboratively with healthcare professionals. Challenges still remain, however, around explainability, prompt design, workflow integration, and ensuring model reliability across diverse clinical scenarios.
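The comparison above rests on standard detection metrics: precision, recall, and F1 score. As a minimal sketch of how such metrics are computed for error detection, here is a small Python example; the 91 seeded errors/DRPs come from the study, but the flagged counts below are hypothetical, purely for illustration:

```python
# Sketch of the standard detection metrics used to compare reviewers/LLMs.
# The counts in the example are hypothetical, not the study's data.

def detection_metrics(true_positives: int, false_positives: int, false_negatives: int):
    """Precision, recall, and F1 for prescribing-error / DRP detection.

    true_positives:  seeded errors correctly flagged
    false_positives: flagged issues that were not seeded errors
    false_negatives: seeded errors that were missed
    """
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical example: a reviewer flags 70 issues, 60 of which are among
# the 91 seeded errors/DRPs (so 31 are missed).
p, r, f1 = detection_metrics(true_positives=60, false_positives=10, false_negatives=31)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

F1 is the harmonic mean of precision and recall, which is why a model (or pharmacist–LLM pair) must do well on both to score highly overall.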

Outcomes and Implications

Medication errors are a leading cause of preventable harm in healthcare, and current rule-based clinical decision support systems often suffer from alert fatigue and limited effectiveness. This study addresses a critical gap by exploring how LLMs, with their ability to process complex clinical information, could enhance prescription safety and reduce drug-related problems. It also demonstrates that integrating LLMs in a co-pilot mode, where clinicians and AI models work side by side, can significantly improve the identification of serious prescribing errors, suggesting clear potential for real-world use in clinical pharmacy practice. While the results are promising, the authors emphasize that further validation with larger, more diverse datasets is necessary before widespread clinical deployment. They also note the need for tailored implementation strategies to address trust, explainability, and workflow integration, indicating that clinical adoption is still in an early, exploratory phase. Given current limitations, such as inconsistent performance in complex dosing scenarios and the need for better integration into clinical workflows, LLM-based CDSS tools are not yet ready for immediate widespread implementation, but they offer a promising path forward. Future research should focus on refining knowledge retrieval techniques, improving model transparency, and designing effective clinician-AI interfaces to ensure safe and scalable adoption of these LLMs into clinical practice.

Our mission is to

Connect medicine with AI innovation.

No spam. Only the latest AI breakthroughs, simplified and relevant to your field.

AIIM Research

Articles

© 2025 AIIM. Created by AIIM IT Team