The world is being quietly rearranged by people who write very long documents.


The title they went with: "Efficient Continual Learning in Language Models via Thalamically Routed Cortical Columns"

Noisy translates that to:

New architecture lets language models learn continuously without forgetting old skills


Researchers designed a new internal structure for large language models that lets them learn from new data without erasing what they already know, a problem that has plagued these systems in practice. This matters because deployed AI systems need to improve over time as user behavior and tasks change, but current methods either lose old capabilities or require expensive external workarounds.
If this architecture works as described, it removes a fundamental trade-off: right now, teaching a language model new things almost always breaks its old skills, which is why deployed systems either stop learning or gradually degrade. A model that could continuously absorb new information without decay would stay useful longer and require fewer expensive retraining cycles.
