OpenAI's GPT-5.4 Outperforms Physicians in Medical Diagnosis Benchmark

April 29, 2026 · 2:24 PM

GPT-5.4, the latest version of OpenAI's large language model, has scored 59.0 on the HealthBench Pro medical diagnostic benchmark, surpassing the average performance of human physicians. The model, fine-tuned for clinical use in a new tool called ChatGPT for Clinicians, demonstrates the potential of AI to assist in medical diagnosis.

'This marks a significant milestone in AI-assisted healthcare,' said an OpenAI spokesperson.

The tool is currently available for free to verified U.S. healthcare workers, giving them a resource intended to improve diagnostic accuracy and efficiency. While the AI excels on standardized tests, experts caution that real-world clinical use requires careful validation and integration into existing workflows.

The achievement has sparked discussions about the role of AI in medicine, with some praising its potential to reduce diagnostic errors and others emphasizing the need for human oversight. As AI continues to advance, the medical community is closely watching how these tools will be adopted and regulated.
