The world is being quietly rearranged by people who write very long documents.


The title they went with: "Instructing LLMs to Negotiate using Reinforcement Learning with Verifiable Rewards"

Noisy translates that to:

Small AI agents can now out-negotiate larger models by learning from verifiable outcomes


Researchers taught a small AI agent to negotiate prices by rewarding it for maximizing economic gain and sticking to budgets. This training method allowed the smaller agent to consistently beat much larger, more advanced AI models in negotiations.
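To make "verifiable" concrete, here is a minimal sketch of what a reward of this kind could look like for a buyer agent. Everything in it is an assumption made for illustration: the function name `buyer_reward`, the fixed penalty for overspending, and the savings normalization are not taken from the paper. The point is only that the score can be computed mechanically from the final deal, with no human or model judging the conversation.

```python
from typing import Optional


def buyer_reward(deal_price: Optional[float], budget: float, list_price: float) -> float:
    """Hypothetical verifiable reward for a buyer agent (not the paper's formula)."""
    if list_price <= 0:
        raise ValueError("list_price must be positive")
    if deal_price is None:
        # No agreement reached: nothing gained, nothing lost.
        return 0.0
    if deal_price > budget:
        # Budget violated: fixed penalty, regardless of how close the deal was.
        return -1.0
    # Economic gain: savings relative to the listed price, clipped to [0, 1].
    savings = (list_price - deal_price) / list_price
    return max(0.0, min(1.0, savings))
```

Under these assumptions, a deal at 80 against a list price of 100 and a budget of 100 scores about 0.2, while any deal above the budget scores -1.0 no matter how persuasive the negotiation sounded.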
For years, the assumption has been that bigger AI models are always better, especially for complex tasks like negotiation. This paper shows that targeted training with clear, verifiable goals can make smaller, cheaper models more effective than their massive counterparts. This means companies might not need to rely on the most expensive, largest AI models for specific business tasks, potentially lowering the cost of deploying advanced AI.
Watch for companies to start deploying smaller, specialized AI models for tasks like procurement or sales, rather than defaulting to general-purpose large language models.
