Medical AI changes its mind if you ask the same question in a different tone

What happened

Large language models used for medical questions can give different answers depending on how a question is phrased, even when they are given the same source material. A patient asking for medical advice could therefore get contradictory information from the same AI simply by changing a few words.

Why it matters

People are already using AI for health questions. This paper shows that the AI's answer can flip from "yes, it works" to "no, it doesn't" based on subtle changes in wording. An AI that has not been robustly tested for phrasing sensitivity cannot be trusted for critical medical advice.

The signal

Watch for medical AI developers to publish new evaluation metrics that specifically test for phrasing sensitivity and consistency.