The world is being quietly rearranged by people who write very long documents.


The title they went with: "BioAlchemy: Distilling Biological Literature into Reasoning-Ready Reinforcement Learning Training Data"

Noisy translates that to:

Biology AI trains on real research papers instead of textbook problems — performance jumps 9%


Researchers built a dataset of 345,000 question-and-answer pairs extracted directly from biology research papers, then used it to train a reasoning model. The model now performs 9% better on biology tasks than models trained on standard academic benchmarks that don't match what modern biology research actually looks like.
Biology reasoning models have lagged behind AI systems trained for math and coding because the training datasets don't reflect what biologists actually do — they're built from textbook problems, not real research. This signals that the bottleneck for AI reasoning in science isn't the model architecture; it's the training data. If this pattern holds across biology and scales to other sciences, it means future AI systems trained on real domain work will outperform general models, and the researchers or institutions that control access to good training data will control the capability frontier.
Watch whether the 345K dataset gets adopted by other teams building biology reasoning models, and whether similar dataset-extraction pipelines emerge for chemistry, materials science, or drug discovery — that's the signal this approach is becoming standard rather than a one-off.
