The world is being quietly rearranged by people who write very long documents.


The title they went with: Empirical Validation of the Classification-Verification Dichotomy for AI Safety Gates. Noisy translates that to:

The standard way to keep AI safe fails as AI gets smarter. A new math method works.


Researchers tested common AI safety systems on AI that learns and improves itself, and found that these systems cannot reliably keep it safe. That means the current approach to building safety into advanced AI is fundamentally broken, but a new mathematical method shows a way forward.
People assumed AI safety could be handled by training a second AI to classify bad behavior. This paper shows that approach breaks down for AI that learns and changes itself: as systems get smarter and more autonomous, the standard safety nets will fail. A different mathematical approach, based on verification rather than classification, can offer provable guarantees.
Watch whether AI safety researchers and developers start adopting 'Lipschitz ball verifiers' instead of classifier-based safety gates in their next-generation systems.
