May 14, 2026 · Updated 08:52 AM UTC
Cybersecurity

New AI Agent Eliminates False Positives in Automated Pentesting

A new verification layer called EVA uses independent re-exploitation to filter out unreliable vulnerability reports generated by AI security tools.

Ryan Torres

2 min read

Digital visualization of an automated cybersecurity pentesting system.

Security researchers have unveiled an automated verification system designed to solve the persistent problem of 'hallucinated' vulnerabilities in AI-driven penetration testing. The new tool, known as the Exploitation Verification Agent (EVA), acts as a secondary auditor that independently attempts to confirm any security flaws identified by primary testing agents.

AI agents are highly efficient at probing application surfaces, but they frequently flag suspicious signals that do not correspond to exploitable flaws. Common false positives include SQL injection alerts on parameterized endpoints, cross-site scripting (XSS) reports blocked by strict content security policies, and server-side request forgery (SSRF) claims against servers with no outbound connectivity. These errors often force human analysts to spend hours triaging fabricated findings.
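
For illustration only (this sketch is not taken from the EVA research), the kind of parameterized endpoint that pattern-matching scanners routinely mistake for an injectable one can be as simple as the following Python snippet; the schema and function name are hypothetical.

    import sqlite3

    def find_user(conn: sqlite3.Connection, username: str):
        # The user-supplied value is bound as a query parameter rather than spliced
        # into the SQL text, so a payload like "' OR 1=1--" is treated as literal data.
        cur = conn.execute("SELECT id, email FROM users WHERE username = ?", (username,))
        return cur.fetchall()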

Establishing a Proof-First Standard

Under the new architecture, every testing agent is paired with a dedicated EVA instance. Rather than replaying a recorded script, EVA functions as an intelligent agent that selects specific verification strategies based on the vulnerability class. If the system cannot replicate an exploit, the finding is discarded.
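
A rough sketch of that class-based dispatch might look like the Python below; the strategy names and vulnerability labels are illustrative stand-ins, not EVA's published interface.

    # Stub verifiers for illustration; each would attempt a real end-to-end proof.
    def verify_xss(finding: dict) -> bool: ...          # headless-browser execution proof
    def verify_blind_sqli(finding: dict) -> bool: ...   # statistical timing comparison
    def verify_ssrf(finding: dict) -> bool: ...          # out-of-band callback check

    STRATEGIES = {
        "xss": verify_xss,
        "blind_sqli": verify_blind_sqli,
        "ssrf": verify_ssrf,
    }

    def replicate(finding: dict) -> bool:
        # Choose a strategy by vulnerability class; a finding that cannot be
        # replicated by its strategy is discarded from the report.
        strategy = STRATEGIES.get(finding["class"])
        return strategy is not None and strategy(finding)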

"We refuse to ship findings we cannot prove," the developers stated, characterizing the approach as an engineering constraint rather than a feature.

EVA categorizes results into three tiers: VERIFIED, POTENTIAL, and FALSE_POSITIVE. A finding is only marked as VERIFIED if the agent achieves end-to-end exploitation, such as successfully exfiltrating data or executing code in a browser. For browser-based XSS, the agent uses a headless Chromium browser via Playwright to confirm JavaScript execution, moving beyond simple string-matching techniques that often trigger false alarms.
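
A minimal sketch of that kind of execution check, using Playwright's Python bindings, is shown below; the payload marker, wait times, and console-based detection are assumptions for illustration rather than EVA's actual code.

    from playwright.sync_api import sync_playwright

    def confirm_xss(url_with_payload: str, marker: str = "XSS_PROOF_123") -> bool:
        # Returns True only if the injected JavaScript actually executes,
        # e.g. a payload of <script>console.log("XSS_PROOF_123")</script>.
        executed = False
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=True)
            page = browser.new_page()
            def on_console(msg):
                nonlocal executed
                if marker in msg.text:
                    executed = True
            page.on("console", on_console)
            page.goto(url_with_payload, wait_until="networkidle")
            page.wait_for_timeout(2000)  # allow deferred or event-driven scripts to fire
            browser.close()
        return executed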

For blind injection vulnerabilities, which are prone to false positives caused by network jitter, EVA employs statistical analysis. The agent establishes a baseline timing profile for the connection and compares it against the response time of the injected payload, ensuring that only statistically significant delays are flagged.
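
The sketch below illustrates one way such a baseline comparison could work; the request shape, sample count, and three-sigma cutoff are assumptions, since the exact statistical test EVA uses has not been detailed.

    import statistics, time, requests

    def significant_delay(url: str, benign: str, payload: str,
                          samples: int = 10, sleep_seconds: float = 5.0) -> bool:
        def timed(value: str) -> float:
            start = time.monotonic()
            requests.get(url, params={"q": value}, timeout=30)
            return time.monotonic() - start
        # Baseline timing profile of the connection, measured with a benign value.
        baseline = [timed(benign) for _ in range(samples)]
        mean, stdev = statistics.mean(baseline), statistics.stdev(baseline)
        # The injected request must be slower than normal jitter can explain
        # and roughly as slow as the sleep the payload requested.
        delta = timed(payload) - mean
        return delta > 3 * stdev and delta >= 0.8 * sleep_seconds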

In cases where an initial verification attempt fails, the agent does not immediately label the finding a false positive. Instead, it initiates a retry protocol, cycling through various encodings and payload variants to account for input filtering. A finding is only removed if all attempts at reproduction fail.
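
In code, such a retry loop might resemble the following sketch; the attempt_exploit callable and the specific encodings are hypothetical stand-ins for whatever variants the agent actually cycles through.

    import base64, urllib.parse
    from typing import Callable, Optional

    def retry_variants(payload: str, attempt_exploit: Callable[[str], bool]) -> Optional[str]:
        # Try the raw payload first, then common encodings that slip past naive filters.
        variants = [
            payload,
            urllib.parse.quote(payload),                      # URL-encoded
            urllib.parse.quote(urllib.parse.quote(payload)),  # double URL-encoded
            base64.b64encode(payload.encode()).decode(),      # base64
            payload.upper(),                                  # case variation
        ]
        for variant in variants:
            if attempt_exploit(variant):
                return variant   # reproduced: keep the finding with this working variant
        return None              # every attempt failed: the finding is removed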

When a flaw cannot be fully confirmed but displays strong indicators of risk, it is labeled as POTENTIAL. This classification includes documentation of the evidence gaps—such as a timing anomaly that failed to meet the statistical threshold—providing human analysts with a transparent view of why the system could not fully validate the threat. By forcing AI to prove its work, the developers aim to restore trust in automated security reporting.
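
A report entry for such a finding might look something like the record below; the field names are purely illustrative of the idea that evidence gaps are documented alongside the verdict.

    potential_finding = {
        "class": "blind_sqli",
        "verdict": "POTENTIAL",
        "evidence": "injected request ran 1.8s slower than the baseline mean",
        "gap": "delay did not clear the significance threshold across repeated trials",
        "suggested_action": "manual review by an analyst",
    }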
