The Limits of Formal Verification: Discovering Bugs in Verified Code
By Kiran Gopinathan

AI Summary
I embarked on a journey to test the robustness of a formally verified implementation of zlib, known as lean-zip, using advanced fuzzing techniques. Despite the assurances of correctness provided by Lean, a formal verification tool, I discovered a significant vulnerability in the Lean runtime itself and a denial-of-service issue in the lean-zip's archive parser. This experiment highlights the growing capabilities of AI in uncovering vulnerabilities and the pressing need for software systems to withstand such scrutiny.
## The Power and Limits of Formal Verification
Formal verification promises to create robust and secure software by proving properties about code. Lean, a formal verification tool, was used to develop lean-zip, a verified implementation of zlib. Theoretically, this should ensure the code is free from bugs. However, my fuzzing experiment revealed otherwise, finding a heap buffer overflow in the Lean runtime and a denial-of-service vulnerability in the archive parser of lean-zip.
## The Experiment Setup
To test lean-zip, I stripped down the codebase, removing all theorems, specifications, and documentation to prevent biasing the AI agent, Claude. This setup allowed Claude to operate without preconceived notions of the code's correctness. Using tools like AFL++, AddressSanitizer, and Valgrind, I launched a massive fuzzing campaign, executing over 105 million tests across multiple attack surfaces.
## Discovering Vulnerabilities
The most significant finding was a heap buffer overflow in the Lean runtime's function, lean_alloc_sarray. This vulnerability affects every Lean 4 program that allocates a ByteArray, highlighting a critical flaw in the trusted computing base. Additionally, a denial-of-service issue was found in lean-zip's archive parser, where unverified code allowed for unchecked memory allocation, leading to potential crashes.
## Why Verification Missed These Bugs
The denial-of-service issue was straightforward: the archive parser was never verified. Lean-zip's proofs focused on the compression and decompression pipeline, leaving the parser unchecked. The heap overflow was more fundamental, rooted in the C++ runtime assumed to be correct by all Lean proofs. This underscores the importance of verifying the entire stack, not just the application code.
## Conclusion
Despite these findings, the verified lean-zip codebase was remarkably robust, with no memory vulnerabilities in the application code itself. This experiment demonstrates that while formal verification significantly enhances code safety, it is not infallible. Verification is only as strong as the assumptions and questions posed during its application. As AI continues to advance, the need for comprehensive verification and scrutiny of software systems becomes increasingly critical.
Key Concepts
Formal verification is a process that uses mathematical methods to prove or disprove the correctness of a system with respect to a certain formal specification or property.
Fuzzing is a software testing technique that involves providing invalid, unexpected, or random data as inputs to a computer program to find security vulnerabilities and bugs.
Category
ProgrammingMore on Discover
Summarized by Mente
Save any article, video, or tweet. AI summarizes it, finds connections, and creates your to-do list.
Start free, no credit card