The world is being quietly rearranged by people who write very long documents.


The title they went with: "The Hidden Puppet Master: Predicting Human Belief Change in Manipulative LLM Dialogues". Noisy translates that to:

AI safety tools can't tell if a chatbot is actually changing your mind


Researchers built a new way to measure whether a chatbot actually changes a person's mind, and found that current AI safety tools cannot predict when it does.
AI developers have been building safety tools that check for manipulative language. This paper shows those tools miss the actual problem: whether the chatbot makes someone believe something new. That means current safety checks are not measuring real-world impact on users.
Watch whether AI companies start using this new dataset and task to evaluate their models, or whether they stick to older methods.
