The world is being quietly rearranged by people who write very long documents.


The title they went with:

VeriAct: Beyond Verifiability -- Agentic Synthesis of Correct and Complete Formal Specifications

Noisy translates that to:

AI can now catch its own mistakes in formal specifications — but only when told to check twice


Researchers found that AI systems generating formal specifications (precise rules stating what code must do) get their output past the verifier far more often than they get it right: many specifications that pass are broken in ways the verifier can't see. They built a new system that catches these hidden errors by having the AI repeatedly check and fix its own work instead of trusting a single pass.
For decades, software engineers have manually written formal specifications because automation produces garbage that looks correct on the surface. This work reveals the core problem: AI systems are good at gaming verification tests, not at understanding what the code actually needs to do. The practical shift is from asking 'did the verifier accept this?' to 'does the specification actually match what the code should do?' — and that requires the AI to actively catch its own contradictions rather than assume one pass is enough. If this pattern holds, it means deployed AI-assisted code verification is likely hiding real specification errors right now.
Still open: whether VeriAct's iterative self-correction produces measurably fewer bugs in real deployed code than single-pass AI specification generation, and whether practitioners will actually use the self-correction loop or revert to trusting verifier pass rates.
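
To make the gap between "the verifier accepted it" and "the specification is right" concrete, here is a toy sketch in Python. It is not VeriAct's method or the paper's benchmark. The function names, the absolute-value example, and the use of a small input range in place of a real verifier are all assumptions made for illustration.

```python
from typing import Callable, Iterable

# Hypothetical stand-in for a formal postcondition:
# given an input and the returned value, is the result acceptable?
Spec = Callable[[int, int], bool]


def weak_spec(x: int, result: int) -> bool:
    # True of every correct abs(), but also of many wrong implementations.
    return result >= 0


def complete_spec(x: int, result: int) -> bool:
    # Also pins the result to the input, so wrong implementations are rejected.
    return result >= 0 and (result == x or result == -x)


def correct_abs(x: int) -> int:
    return x if x >= 0 else -x


def buggy_abs(x: int) -> int:
    # Obviously wrong, yet never negative.
    return 0


def satisfies(impl: Callable[[int], int], spec: Spec, inputs: Iterable[int]) -> bool:
    # Bounded check over sample inputs; a real verifier would prove this for all inputs.
    return all(spec(x, impl(x)) for x in inputs)


INPUTS = range(-5, 6)

if __name__ == "__main__":
    for spec in (weak_spec, complete_spec):
        for impl in (correct_abs, buggy_abs):
            print(spec.__name__, impl.__name__, satisfies(impl, spec, INPUTS))
```

Both specifications accept the correct implementation, so both would count as "verified" against it. Only the complete one rejects the buggy implementation, which is the kind of difference a pass rate alone cannot show.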
