An automated security experiment using Claude AI agents has uncovered a heap buffer overflow within the Lean 4 runtime, exposing a vulnerability that affects all versions of the language to date.
Researcher Kiran Anand used the AI agent to conduct over 105 million fuzzing executions against `lean-zip`, a native implementation of zlib that was previously declared mathematically correct through formal verification.
While the verified `lean-zip` application code showed zero memory vulnerabilities, the experiment identified a critical flaw in the underlying `lean_alloc_sarray` function in the Lean runtime.
A failure in the runtime
The vulnerability occurs during the allocation of scalar arrays, such as `ByteArray`. When an allocation size approaches the maximum possible value for a 64-bit system, an integer overflow causes the runtime to allocate a tiny buffer while the system attempts to write a massive amount of data into it.
Anand stripped the `lean-zip` codebase of all theorems, specifications, and documentation before the test. This setup ensured the Claude agent operated without knowing the code was formally verified, preventing any algorithmic bias.
To find the bug, the AI agent launched 16 parallel fuzzers across six different attack surfaces, including ZIP extraction and gzip decompression. The agent utilized a suite of security tools including AddressSanitizer, Valgrind, and various static analysis engines.
In addition to the runtime overflow, the experiment identified a denial-of-service vulnerability in the `lean-zip` archive parser. This specific component was not covered by the formal verification process.
A bug report has been filed with the Lean developers, and a fix is currently pending in the Lean 4 repository.