Reordering gradient updates could cut neural network training time

What happened

Researchers found that the order in which a neural network processes batches of data during training affects how stably and how quickly it converges, and that reversing the usual order can improve both. Training large neural networks is expensive and unstable, so even small gains in stability or speed could reduce computational costs and make AI systems more reliable to train.

Why it matters

For decades, neural network training has followed a fixed procedural order because no one had reason to think the order mattered. This paper shows that the order itself is a lever for improving training without expensive learning rate tuning, which means the same training run could finish faster or run on smaller machines.