The world is being quietly rearranged by people who write very long documents.


The title they went with: "AutoReSpec: A Framework for Generating Specification using Large Language Models"

Noisy translates that to:

AI can now write software specs that actually work — when it fails, another AI fixes it


Researchers built a system in which one AI writes formal specifications for code and, if a specification fails verification, a second AI uses the error message to fix it. In tests on 72 real programs, this two-stage approach succeeded 93% of the time and ran 27% faster than earlier methods.
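The generate, verify, repair loop described above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `generate_and_repair`, the `VerifyResult` type, the three callables, and the retry budget are all hypothetical, and the toy stubs stand in for the real models and verifier, which the summary does not name.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VerifyResult:
    """Outcome of one verification run (hypothetical shape)."""
    ok: bool
    error: str = ""


def generate_and_repair(
    program: str,
    generate: Callable[[str], str],          # first model: program -> spec
    repair: Callable[[str, str, str], str],  # second model: program, spec, error -> new spec
    verify: Callable[[str, str], VerifyResult],
    max_repairs: int = 3,                    # assumed budget; the source gives no number
) -> Optional[str]:
    """Return a spec that verifies, or None if the repair budget runs out."""
    spec = generate(program)
    for attempt in range(max_repairs + 1):
        result = verify(program, spec)
        if result.ok:
            return spec
        if attempt == max_repairs:
            break
        # Feed the verifier's error message back to the second model.
        spec = repair(program, spec, result.error)
    return None


# Toy stand-ins so the sketch runs end to end; none of this is real model output.
def toy_generate(program: str) -> str:
    return "ensures result > 0"  # deliberately too strong for abs(0)


def toy_verify(program: str, spec: str) -> VerifyResult:
    if spec == "ensures result >= 0":
        return VerifyResult(ok=True)
    return VerifyResult(ok=False, error="postcondition may not hold for input 0")


def toy_repair(program: str, spec: str, error: str) -> str:
    return "ensures result >= 0"


print(generate_and_repair("abs(x)", toy_generate, toy_repair, toy_verify))
# prints: ensures result >= 0
```

The design point is that the verifier, not either model, decides success: a spec is only returned once verification passes, and the error message is the only signal passed to the repair step.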
Writing formal specifications (precise mathematical descriptions of what code should do) is a bottleneck in software correctness. It's tedious, requires expertise, and most teams skip it entirely. If AI can generate specs that actually verify, and repair its own failures without human intervention, the cost of formal methods drops from 'hire a specialist' to 'run the pipeline.' The catch is that this has so far been shown only on 72 benchmark programs in a lab setting. The real test is whether the pattern holds on production codebases at actual companies.
Watch whether software teams begin using AI-generated specs in their development pipelines, and whether the failure rate stays low when programs are bigger and messier than the benchmark.
