Google DeepMind has developed an AI "co-clinician" designed to assist doctors in patient care. Although the system achieved impressive results in blind tests against leading AI tools like GPT-5.4, it still falls short of experienced physicians in simulated real-world consultations.
The AI co-clinician is built around a "triadic care" model, where AI agents help guide patients through treatment under a doctor's supervision. Researchers evaluated the system using the NOHARM framework, checking for errors of commission and omission.
In a blind comparison of 98 realistic primary care queries, physicians preferred the AI co-clinician's responses over an existing clinical AI system (67 to 26) and GPT-5.4-thinking-with-search (63 to 30). The system made a critical error in only one case.
On medication-related questions from the RxQA benchmark, the AI co-clinician scored 73.3%, slightly ahead of GPT-5.4 at 72.7%. When questions were open-ended—mimicking real-world doctor queries—the AI co-clinician achieved a 95% quality score versus GPT-5.4's 90.9%.
The team also tested multimodal capabilities for telemedicine, using audio and video in simulated visits. The AI co-clinician demonstrated skills beyond text, such as correcting inhaler technique and guiding shoulder exams. It runs on a dual-agent setup: a "Planner" monitors a "Talker" agent to ensure safe clinical boundaries.
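The Planner-supervises-Talker pattern can be illustrated with a minimal sketch. Everything below is hypothetical: the function names, the keyword-based rules, and the escalation messages are illustrative stand-ins, not DeepMind's actual implementation, which would use learned models rather than string matching.

```python
from dataclasses import dataclass

# Illustrative red-flag and out-of-scope triggers (assumptions, not the
# system's real safety policy).
RED_FLAG_TERMS = {"chest pain", "shortness of breath", "fainting"}
OUT_OF_SCOPE_PHRASES = {"increase your dose", "stop taking"}

@dataclass
class Turn:
    patient_message: str
    draft_reply: str

def talker(patient_message: str) -> str:
    """Toy Talker: drafts a conversational reply (stand-in for an LLM call)."""
    return f"Thanks for sharing. Regarding '{patient_message}', here is some general guidance."

def planner(turn: Turn) -> str:
    """Toy Planner: reviews the Talker's draft and enforces clinical boundaries."""
    msg = turn.patient_message.lower()
    if any(term in msg for term in RED_FLAG_TERMS):
        # Escalate possible emergencies instead of answering conversationally.
        return "This may be urgent. Please contact emergency services or your doctor now."
    if any(phrase in turn.draft_reply.lower() for phrase in OUT_OF_SCOPE_PHRASES):
        # Block advice that requires a physician's sign-off.
        return "A medication change needs your doctor's approval; I've flagged this for review."
    return turn.draft_reply

def respond(patient_message: str) -> str:
    # The Talker drafts; the Planner gates every reply before it reaches the patient.
    return planner(Turn(patient_message, talker(patient_message)))
```

The key design point the article describes is that the Talker never speaks to the patient directly: every utterance passes through the supervising Planner first.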
However, in a full consultation quality assessment across 140 aspects, experienced physicians outperformed the AI overall, especially in catching red flags and conducting physical exams. The AI matched or beat primary care doctors in 68 of the 140 areas, while OpenAI's GPT-realtime lagged behind in all seven domains.
Researchers emphasize that the AI is best used as a support tool, not a replacement. DeepMind researcher Alan Karthikesalingam noted, "While it's early days, the promise is clear." It remains uncertain if the project will become a product.