The world is being quietly rearranged by people who write very long documents.


The title they went with: Haiku to Opus in Just 10 bits: LLMs Unlock Massive Compression Gains

Noisy translates that to:

A small AI model can recover 70% of a large model's ability by asking 10 yes-or-no questions


Researchers found that a weak AI model can extract most of what a stronger model knows through an interactive question-and-answer process, using 100 times less data than previous methods. This means smaller models could get smarter by talking to bigger ones instead of processing entire responses, which is useful for phones, robots, or anywhere bandwidth or computing power is tight.
For years, the only way to move knowledge from a large model to a small one was to have the large model generate full text, then compress it. This paper shows that a small model asking binary questions (yes or no, one bit per answer) transfers knowledge far more efficiently. The practical consequence: you could run a capable AI on a phone or embedded device by letting it query a remote model through a thin pipe of binary questions, rather than downloading massive responses or pre-training on huge datasets. This breaks the constraint that capability requires local compute.
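To make "one bit per answer" concrete, here is a toy sketch in Python. It is not the paper's method. It assumes the small model has already drafted 1,024 candidate answers and that the remote model can answer yes or no to "is the best candidate in this half?". The names `select_with_bits` and `remote_oracle` are made up for illustration, and the oracle here is a stand-in that simply knows the target.

```python
from typing import Callable, Sequence, Tuple


def select_with_bits(
    candidates: Sequence[str],
    oracle: Callable[[Sequence[str]], bool],
    max_bits: int = 10,
) -> Tuple[str, int]:
    """Narrow a candidate list with yes/no questions; each answer costs one bit."""
    lo, hi = 0, len(candidates)
    bits_used = 0
    while hi - lo > 1 and bits_used < max_bits:
        mid = (lo + hi) // 2
        # Ask the remote model: "Is the best candidate in this first half?"
        if oracle(candidates[lo:mid]):
            hi = mid
        else:
            lo = mid
        bits_used += 1
    return candidates[lo], bits_used


# Hypothetical setup: 1,024 local drafts, and a stand-in oracle that
# plays the role of the remote model (an assumption, not a real API).
candidates = [f"draft-{i}" for i in range(1024)]
target = "draft-700"


def remote_oracle(subset: Sequence[str]) -> bool:
    return target in subset


choice, bits_used = select_with_bits(candidates, remote_oracle)
print(choice, bits_used)
```

Ten answers are enough here because each one halves the remaining candidates, and 2 to the power of 10 is 1,024. The only traffic coming back over the "thin pipe" is those ten bits, not the text of the chosen answer.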
The question is whether anyone actually deploys this in production: whether a phone app, robot controller, or edge device starts using interactive question-asking to access remote models, and whether it beats the latency and bandwidth of current compression methods.
