NOW LET US – AI RAG SaaS Studio TP.HCM

Improving health intelligence in ChatGPT

OpenAI has introduced significant improvements to ChatGPT's health intelligence, powered by GPT-5.5 Instant. The work was developed in collaboration with a global network of physicians. On production health traffic, the rate of responses with at least one flagged factuality issue has fallen by 71% over the last two months, and the updated model is intended to offer safer, more reliable health guidance for users worldwide.

Health is one of the most meaningful ways people use ChatGPT. Every week, more than 230 million people turn to ChatGPT for help with health and wellness questions: making sense of health information, understanding lab results, preparing for appointments, navigating insurance, building healthier habits, and figuring out what to ask next.

With GPT‑5.5 Instant, we’re seeing a substantial step forward in health, with improvements in recognizing when urgent care may be needed, asking for relevant context, explaining uncertainty, and making complex information easier to understand. On our most challenging health evaluations, GPT‑5.5 Instant now performs at a level comparable to our frontier Thinking models. Because it is available to all free users in ChatGPT, more people can benefit from these improvements.

That progress reflects both advances in model capabilities and the physician-led work behind our health evaluations. Across our efforts, a global network of physicians helps define what “good” looks like in real-world health situations by reviewing example model responses, describing ideal behavior, and identifying failure modes. Working with physicians gives us a way to measure progress in health and improve how ChatGPT responds over time.

In health, progress means delivering responses that are accurate, understandable, and grounded in good judgment: recognizing when more context is needed, explaining uncertainty without overstating confidence, and helping people understand when to seek care.

To measure that progress, we use health-specific evaluations, including HealthBench and HealthBench Professional. These evaluations use realistic health conversations and physician-written rubrics to assess qualities like accuracy, safety, communication, context awareness, completeness, and appropriate escalation.
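To make the idea of a physician-written rubric concrete, here is a minimal sketch of how a points-based rubric could be scored. It assumes each criterion carries a point value (negative for undesirable behavior) and that a response's score is the points earned divided by the maximum achievable positive points, clipped to the range 0 to 1. The `Criterion` class, the `rubric_score` function, and the example criteria are hypothetical illustrations, not OpenAI's actual rubrics or grading code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One rubric item; negative points penalize an undesirable behavior."""
    description: str
    points: int


def rubric_score(criteria: list[Criterion], met: list[bool]) -> float:
    """Return earned points over maximum positive points, clipped to [0, 1]."""
    if len(criteria) != len(met):
        raise ValueError("criteria and met must have the same length")
    max_points = sum(c.points for c in criteria if c.points > 0)
    if max_points == 0:
        raise ValueError("rubric needs at least one positively weighted criterion")
    earned = sum(c.points for c, ok in zip(criteria, met) if ok)
    return min(1.0, max(0.0, earned / max_points))


# Hypothetical rubric for a conversation about chest pain (illustrative only).
example_rubric = [
    Criterion("Advises seeking emergency care for red-flag symptoms", 10),
    Criterion("Asks about symptom duration and severity", 5),
    Criterion("Explains uncertainty in plain language", 5),
    Criterion("States a definitive diagnosis without enough information", -8),
]

# Which criteria a grader judged the response to have met (also hypothetical).
example_met = [True, False, True, False]
example_score = rubric_score(example_rubric, example_met)
```

In this sketch, a response that escalates appropriately and explains uncertainty but never asks for context earns 15 of 20 possible points. Penalized behaviors can pull a score down, but the clipping keeps it from going below zero.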

As another comparison, we asked physicians to write responses for representative health conversations, with unlimited time and access to the internet (but not AI). A separate panel of physicians then compared these physician responses with model responses over time, reviewing qualities that matter in real interactions, including accuracy, communication, completeness, instruction following, and health decision helpfulness, across 3,500 reviewed responses.

Physicians rated GPT‑5.5 Instant responses as having fewer failure modes than responses from older models and from physicians. For example, compared with both older models and physicians, GPT‑5.5 Instant had fewer instances of not tailoring to local healthcare context, missing red flags or referral to care, or failing to seek additional context from the user when needed.

Given the scale of usage of our models in health, another way to understand recent model improvements is to measure production traffic. We use privacy-preserving monitors on production traffic to track possible factuality issues in health responses. Based on a comparison of recent production traffic in health—billions of messages a week—the rate of responses with at least one flagged factuality issue has fallen by 71% in the last two months.
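The 71% figure is easiest to read as a relative drop in a rate, not a drop in percentage points. The sketch below shows that arithmetic under stated assumptions: the source does not say exactly how the comparison is computed, and the function names and message counts here are hypothetical, chosen only so the numbers work out to roughly 71%.

```python
def flagged_rate(flagged: int, total: int) -> float:
    """Fraction of responses with at least one flagged factuality issue."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= flagged <= total:
        raise ValueError("flagged must be between 0 and total")
    return flagged / total


def relative_reduction(rate_before: float, rate_after: float) -> float:
    """Relative change between two rates: (before - after) / before."""
    if rate_before <= 0:
        raise ValueError("rate_before must be positive")
    return (rate_before - rate_after) / rate_before


# Hypothetical counts, not real production data.
rate_before = flagged_rate(flagged=200, total=10_000)  # 2.00% of responses
rate_after = flagged_rate(flagged=58, total=10_000)    # 0.58% of responses
reduction = relative_reduction(rate_before, rate_after)
print(f"Relative reduction: {reduction:.0%}")
```

With these made-up counts, the flagged rate falls from 2.00% to 0.58% of responses. That is a relative reduction of about 71%, even though the absolute change is only 1.42 percentage points.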

Comparing responses from models on real-world health questions over time shows how ChatGPT has improved in ways that matter for health: recognizing when a situation may need urgent attention, handling uncertainty with better judgment, and giving people clearer, more useful guidance about what to do next.

This progress is shaped by physicians who help us define, measure, and improve health responses in ChatGPT.

OpenAI works with a global network of more than 260 physicians across 60 countries, 49 languages, and 26 medical specialties. Their feedback informs how ChatGPT responds to health questions across a wide range of scenarios, from everyday wellness questions to more complex clinical situations.

Physicians review example model responses and assess whether they are accurate, clear, complete, appropriately cautious, and useful. They help identify where a response may miss important context, where it may sound too confident, where it should be clearer about next steps, and where it should more directly encourage someone to seek medical care.

To date, physicians have reviewed more than 700,000 example model responses that reflect how patients and clinicians use ChatGPT in the real world. Every few minutes, a physician reviews a new response. Their feedback becomes rubrics and evaluation criteria that help researchers measure whether responses are accurate, safe, clear, complete, appropriately cautious, and useful in real-world health situations. This gives us a clearer way to see where models are getting better and where they still need work.

This physician review also supports OpenAI’s broader efforts in health, including tools built for healthcare, such as ChatGPT for Clinicians and OpenAI for Healthcare, which support medical professionals with tasks like documentation, research, and care delivery.

Improving human health will be one of the most personal, tangible impacts of AGI. As our models continue to improve, our goal is to make ChatGPT more accurate, more useful, and more impactful when people turn to it with health questions, and to keep bringing that progress to more people.

© 2026 Now Let Us. All rights reserved.

Source: OpenAI News
