The world is being quietly rearranged by people who write very long documents.


The title they went with: Using LLM-as-a-Judge/Jury to Advance Scalable, Clinically-Validated Safety Evaluations of Model Responses to Users Demonstrating Psychosis

Noisy translates that to:

AI can now test mental health chatbots for safety, using AI.


Researchers have built a way to use AI to check whether other AIs are safe for people with psychosis. AI judges evaluate a chatbot's responses, and the approach works almost as well as human experts.
For years, the worry has been that AI chatbots used for mental health support could harm vulnerable users, especially those with psychosis, by reinforcing delusions. Existing safety tests were slow and expensive. Handing the checks to AI judges makes it faster and cheaper to test these tools at scale, so companies can more easily check whether their mental health AIs are safe before releasing them.
Watch whether mental health AI companies start publishing their safety test results using this AI-judge method.
