What happened
Researchers built a method that teaches smaller AI vision models to reason step by step through long documents. It generates synthetic training examples in which pages are scored for relevance and evidence is ranked. A 32-billion-parameter model trained this way outperforms a 235-billion-parameter model on document understanding benchmarks while producing 12 times fewer output tokens.
Why it matters
Document processing is expensive at scale: larger models cost more to run, and longer outputs mean slower, pricier inference. This work suggests that document reasoning can be compressed into smaller, faster models without losing accuracy. That changes the economics of enterprise document systems. Legal firms, insurance companies, and research organizations could run document AI on cheaper hardware, process documents faster, and spend less on inference, since both model size and output length drive cost.