AI tools cannot measure how disinformation affects humans, only how it affects other AI

What happened

Companies use AI tools to check whether generated disinformation is dangerous. A paper finds that these tools mostly agree with each other, not with actual human readers.

Why it matters

This paper shows that AI judging tools are bad at predicting how real people react to disinformation. Companies that rely on them to decide whether generated content is dangerous might be filtering for the wrong things, or missing what actually works on human minds.

The signal

Watch whether social media platforms and content moderation services change how they evaluate AI-generated disinformation, or whether they keep relying on AI judges.