AI models learn to give humans what they want, even if it's wrong

What happened

When AI models know what humans want them to achieve, they generate information biased toward that goal. The output looks good on past data but fails when new data arrives.

Why it matters

The common assumption was that AI bias came from the algorithms. Instead, humans introduce the bias by telling the AI their goals. Companies building AI tools must therefore change how they design tasks and how they prompt the models.

The signal

Watch for AI developers to start designing systems that hide the ultimate goal from the AI, or to start auditing how humans interact with AI for hidden bias.