The world is being quietly rearranged by people who write very long documents.


The title they went with
PRAISE: Prefix-Based Rollout Reuse in Agentic Search Training
Noisy translates that to

AI researchers find a way to train search-based language models using 40% fewer examples


Researchers built a training method that reuses failed search attempts to teach AI systems to answer complex questions better. Instead of throwing away bad attempts, the system extracts useful intermediate states from failed searches and uses them to generate new training examples with step-by-step rewards, cutting the number of examples needed to train these systems by about 40%.
Training these search-based systems is expensive because they need many long attempts, most of which fail, before they learn anything useful. This method extracts more signal from the same amount of failed work, which means better question-answering AI costs less to build and trains faster. The practical effect is tighter feedback during training: the system can learn which intermediate decisions were good or bad, not just whether the final answer was right.
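For readers who want to see the idea in code, here is a toy sketch of prefix reuse with step-level rewards. It is not the paper's implementation. The names `prefixes`, `step_rewards`, and `toy_score` are hypothetical, and the reward rule (each step is credited with the change in score between one prefix and the next) is an assumption chosen for illustration.

```python
from typing import Callable, List, Tuple

Step = str  # one search action or retrieved result, simplified to a string


def prefixes(trajectory: List[Step]) -> List[List[Step]]:
    """Return every non-empty prefix of a trajectory, shortest first."""
    return [trajectory[:k] for k in range(1, len(trajectory) + 1)]


def step_rewards(
    trajectory: List[Step],
    score_prefix: Callable[[List[Step]], float],
) -> List[Tuple[Step, float]]:
    """Credit each step with the change in prefix score it caused.

    Assumption (not from the paper): reward for step k equals
    score(prefix k) - score(prefix k-1), with the empty prefix scored 0.0.
    """
    rewards = []
    previous = 0.0
    for prefix in prefixes(trajectory):
        current = score_prefix(prefix)
        rewards.append((prefix[-1], current - previous))
        previous = current
    return rewards


def toy_score(prefix: List[Step]) -> float:
    """Hypothetical scorer: 1.0 once the key fact has been retrieved."""
    return 1.0 if "found: capital is Canberra" in prefix else 0.0


# A rollout whose final answer was wrong, but which contains one useful step.
failed_rollout = [
    "search: largest city in Australia",
    "found: capital is Canberra",
    "search: Sydney population",
]

for step, reward in step_rewards(failed_rollout, toy_score):
    print(f"{reward:+.1f}  {step}")
```

In this sketch the failed rollout still yields one positively rewarded step, which is the kind of signal a final-answer-only reward would discard. In a real system the scorer would presumably be another model call or a rollout continued from the prefix, not a string match.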
Check whether commercial AI search products (like Perplexity or the research assistants in Claude/GPT) get noticeably better at multi-step questions without a proportional increase in training compute. That would signal this method is being deployed at scale.

If you insist
Read the original →