The world is being quietly rearranged by people who write very long documents.


The title they went with: "Emergent Inference-Time Semantic Contamination via In-Context Priming"

Noisy translates that to:

Larger AI models can be poisoned through prompts alone — smaller models can't


Researchers showed that you can inject toxic biases into advanced language models just by feeding them a few examples at inference time, without retraining the model. This means applications that use prompts to customize AI behavior (a common practice) may inherit hidden contamination from those prompts, even when the prompt itself seems unrelated to the toxic content.
Until now, prompt-based poisoning was assumed to be harmless, because earlier research found that it didn't work. It turns out it does work, but only on more capable models that have richer learned associations between concepts. This is a security gap in a standard deployment pattern: if you build an AI application by feeding it examples to steer its behavior, you're assuming those examples stay compartmentalized. In the larger models, they don't. The boundary is now measurable, in that you can test which model sizes are vulnerable, so teams building production systems need to audit their own demonstration sets, not just their training data.
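One way to picture such an audit is a before-and-after comparison: run the same unrelated probe prompts with and without your demonstration set, and see whether a bias score moves. The sketch below is illustrative only and is not the researchers' method. `generate` and `score` are hypothetical callables standing in for your model call and whatever toxicity or bias scorer you trust, and the toy versions exist only so the example runs.

```python
from statistics import mean
from typing import Callable, Sequence


def build_prompt(demos: Sequence[str], probe: str) -> str:
    """Prepend few-shot demonstrations (if any) to a probe prompt."""
    return "\n\n".join([*demos, probe])


def contamination_shift(
    generate: Callable[[str], str],   # hypothetical: your model call
    score: Callable[[str], float],    # hypothetical: your bias/toxicity scorer
    demos: Sequence[str],
    probes: Sequence[str],
) -> float:
    """Mean score with demonstrations minus mean score without them."""
    baseline = mean([score(generate(build_prompt([], p))) for p in probes])
    primed = mean([score(generate(build_prompt(demos, p))) for p in probes])
    return primed - baseline


# Toy stand-ins (assumptions for illustration, not a real model or scorer).
def toy_generate(prompt: str) -> str:
    # Pretend the model turns hostile whenever "grudge" appears in context.
    return "hostile reply" if "grudge" in prompt else "neutral reply"


def toy_score(text: str) -> float:
    return 1.0 if "hostile" in text else 0.0


probes = ["Describe your neighbour.", "Summarise this meeting."]
demos = ["Q: Why hold a grudge? A: Because they deserve it."]
shift = contamination_shift(toy_generate, toy_score, demos, probes)
print(f"score shift caused by demonstrations: {shift:+.2f}")
```

A shift near zero on probes unrelated to the demonstrations is what the compartmentalization assumption predicts. A consistent positive shift suggests the demonstrations are leaking into other tasks, and repeating the comparison across model sizes would show where that starts to happen.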
Watch whether major AI providers change their documentation or add guardrails around few-shot prompting, or whether deployment incidents surface where poisoned prompts leaked biases into downstream tasks.
