AI disinformation tools miss harm for hundreds of millions of English speakers

What happened

New research shows that AI tools built to spot harmful content, such as disinformation, perform much worse for people who speak English dialects other than Standard American English. As a result, content moderation systems on social media are likely either failing to protect or unfairly flagging hundreds of millions of users worldwide.

Why it matters

Content moderation systems have always struggled with nuance. This paper quantifies a major blind spot: these systems are built for one specific version of English, which leaves vast populations vulnerable or misidentified. Social media companies and governments deploying these tools are therefore applying different safety standards to different groups of English speakers, based purely on dialect.

The signal

Watch for social media platforms to update their content moderation models using the new DIA-HARM benchmark, or for regulators to demand better performance across dialects.