AI models learn by pushing features apart, but only some of them show it

What happened

Researchers found that AI models learn by actively pushing similar features, the internal representations they build for related inputs, away from each other. This "feature repulsion" happens inside the model, but only some types of models show a clear, measurable change in their internal structure when it occurs.
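To make "pushing features apart" concrete, here is a minimal toy sketch, not the paper's method. It assumes features are plain vectors and uses mean pairwise cosine similarity as a stand-in for the "measurable change." The helper names (`cosine`, `mean_pairwise_cosine`, `repel`), the update rule, and the sample vectors are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(features):
    """Average cosine similarity over all distinct pairs of feature vectors."""
    pairs = [
        (i, j)
        for i in range(len(features))
        for j in range(i + 1, len(features))
    ]
    return sum(cosine(features[i], features[j]) for i, j in pairs) / len(pairs)


def repel(features, step=0.5):
    """Hypothetical repulsion step: move each vector away from the
    centroid of all the other vectors. An assumption for illustration,
    not the update rule from the paper."""
    updated = []
    for i, f in enumerate(features):
        others = [g for j, g in enumerate(features) if j != i]
        centroid = [sum(col) / len(others) for col in zip(*others)]
        updated.append([a - step * c for a, c in zip(f, centroid)])
    return updated


# Three made-up feature vectors that start out nearly parallel.
features = [[1.0, 0.1, 0.0], [1.0, 0.0, 0.1], [1.0, -0.1, 0.0]]

before = mean_pairwise_cosine(features)
after = mean_pairwise_cosine(repel(features))

print(f"mean pairwise cosine before: {before:.3f}")
print(f"mean pairwise cosine after:  {after:.3f}")
```

In this toy setup the average similarity drops after one repulsion step, which is the kind of structural change a measurement could pick up. Whether a real model shows such a drop is exactly what, per the summary, varies between model types.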

Why it matters

The paper studies "grokking," in which a model generalizes to unseen data only after a long stretch of training during which it appears to have merely memorized its training set. How the model handles similar data points internally turns out to be key to that transition. Understanding how different activation functions affect this process could therefore help build more reliable and predictable AI.

The signal

Watch for future research that links specific activation functions to the predictability and robustness of AI models in real-world applications.