Public Health

Comprehensive Summary

This retrospective study evaluated how effectively ChatGPT could generate multidisciplinary treatment (MDT) recommendations for ICU patients compared to physician-led MDT discussions. Researchers collected 64 anonymized cases from Shanghai East Hospital, translated the clinical information into English, and used these as prompts for ChatGPT. Two senior physicians then blindly scored both ChatGPT’s responses and the original physician MDT notes across five categories: comprehensiveness, accuracy, feasibility, safety, and efficiency. The results showed that ChatGPT achieved a median score of 41.0/50, which was significantly lower than the physicians’ 43.5/50. While ChatGPT outperformed physicians in comprehensiveness by covering more disciplines and suggesting a broader range of considerations, it consistently lagged behind in accuracy, feasibility, and efficiency, meaning its suggestions were often less precise, less practical, and more time-consuming. Safety scores showed no significant difference between ChatGPT and the physicians. Overall, the findings suggest that ChatGPT can provide broad, structured insights but remains limited in clinical reliability without human oversight.

Outcomes and Implications

The findings of this study have important implications for public health and clinical practice. Although ChatGPT demonstrates potential as a supportive tool for generating comprehensive and structured treatment discussions, its deficits in accuracy and feasibility highlight the risks of relying on AI for complex decision-making in healthcare. For clinicians, ChatGPT could serve as a supplementary resource, particularly for ensuring no critical specialty is overlooked, but its recommendations require careful human review to avoid errors that could compromise patient safety. For health systems and policymakers, the results underscore the importance of keeping AI use strictly within supportive, rather than autonomous, roles until significant improvements are made. From a research and development perspective, these findings point to the need for fine-tuning large language models for medical applications, incorporating multimodal clinical data, and conducting prospective studies with standardized evaluation frameworks. Ultimately, while ChatGPT may enhance efficiency and comprehensiveness in MDT workflows, its integration must be accompanied by rigorous safeguards and human oversight to ensure safe and effective patient care.

Our mission is to

Connect medicine with AI innovation.

No spam. Only the latest AI breakthroughs, simplified and relevant to your field.

AIIM Research

Articles

© 2025 AIIM. Created by AIIM IT Team
