The world is being quietly rearranged by people who write very long documents.


The title they went with: "Feature Repulsion and Spectral Lock-in: An Empirical Study of Two-Layer Network Grokking"

Noisy translates that to:

AI models learn by pushing features apart, but only some of them show it


Researchers found that AI models learn by actively pushing similar features away from each other. This "feature repulsion" happens inside the model, but only some types of AI models show a clear, measurable change in their internal structure when it happens.
This paper studies "grokking," a delayed form of generalization in which a model that has already memorized its training data later starts performing well on data it has not seen. The key turns out to be how the model handles similar data points internally. Understanding how different activation functions affect this process could help build more reliable and predictable AI.
Watch for future research that links specific activation functions to the predictability and robustness of AI models in real-world applications.
