The world is being quietly rearranged by people who write very long documents.


The title they went with How iteration order influences convergence and stability in deep learning Noisy translates that to

Reordering gradient updates could cut neural network training time


Researchers found that the sequence in which a neural network processes batches of data during training affects how stable and fast it converges — and reversing the usual order can actually improve both. This matters because training large neural networks is expensive and unstable; even small improvements in stability or speed could reduce computational costs and make AI systems more reliable to train.
For decades, neural network training has followed a fixed procedural order because no one had reason to think order mattered; this paper shows the order itself is a lever you can pull to improve training without expensive learning rate tuning, which means the same training could happen faster or with smaller machines.

If you insist
Read the original →