The world is being quietly rearranged by people who write very long documents.


The title they went with: "Random Is Hard to Beat: Active Selection in online DPO with Modern LLMs"

Noisy translates that to:

Smarter data selection for AI training doesn't beat random picking


Researchers tested whether carefully selected training examples improve large language models more than randomly chosen ones. They don't: random samples work just as well, and the extra computation spent picking smart examples isn't worth it.
Companies spend enormous resources choosing which human feedback to feed into AI training, on the assumption that better curation means better models. This paper shows that the assumption breaks down once you have a large pool of examples to choose from. The practical implication is blunt: if you have enough data, the cheapest approach (random sampling) matches expensive selection methods while costing less. This matters because it flips the economics of model improvement: you don't need smarter selection, you need more data.
Watch whether companies actually shift away from active learning systems in favor of cheaper, larger random samples, or keep paying for selection despite the evidence.
