The world is being quietly rearranged by people who write very long documents.


The title they went with: "Scalable Pretraining of Large Mixture of Experts Language Models on Aurora Super Computer". Noisy translates that to:

AI models can now train on thousands of computer chips simultaneously


Researchers have developed a new system that allows artificial intelligence models to train using thousands of specialized computer chips at once. This means AI can be built faster and potentially at a lower cost, making it easier to create more powerful AI systems.
Building large AI models requires immense computing power. This work demonstrates a way to use a supercomputer with over 100,000 processing units to train these models efficiently. It shows that scaling up AI training across thousands of chips is possible with good software and hardware coordination. This could accelerate the development of more capable AI, potentially lowering the cost of training future models.
Watch whether other major AI labs adopt similar training methods, and whether the cost of training the largest AI models falls measurably over the next two years.

If you insist
Read the original →