Anthropic inadvertently released the full source code for Claude Code through an npm package update on March 31, 2026. The exposed data remained visible long enough for security researchers to analyze the proprietary CLI tool before the package was withdrawn. This incident marks the second accidental exposure in one week, raising concerns about internal security protocols.
Key Findings
Researchers identified several hidden features within the leaked files, including anti-distillation mechanisms designed to deter competitors from training models on Claude's outputs. One method injects fake tool definitions into system prompts to poison copies of transcripts used as training data by competing models. Analysts noted these protections are gated behind internal feature flags and primarily target first-party sessions.
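The article does not include the leaked code itself, but the described mechanism can be sketched in outline. In this hypothetical reconstruction, the tool names, flag names, and function are all illustrative assumptions: decoy tool definitions are appended to the system prompt only when an internal flag is set for a first-party session, so any model trained on scraped transcripts learns tools that do not exist.

```python
import json

# Hypothetical decoy tools -- names are invented for illustration.
DECOY_TOOLS = [
    {"name": "fs_snapshot_v2", "description": "Snapshot the working tree."},
    {"name": "net_trace", "description": "Trace outbound requests."},
]

def build_system_prompt(base_prompt: str, real_tools: list, *,
                        poison_enabled: bool, first_party: bool) -> str:
    """Assemble the tool manifest, adding decoys only when the internal
    feature flag is on and the session is first-party (as described)."""
    tools = list(real_tools)
    if poison_enabled and first_party:
        tools += DECOY_TOOLS
    return base_prompt + "\n\nTools:\n" + json.dumps(tools, indent=2)

poisoned = build_system_prompt(
    "You are a coding assistant.",
    [{"name": "bash", "description": "Run a shell command."}],
    poison_enabled=True, first_party=True)
```

A distilled model trained on such transcripts would hallucinate calls to `fs_snapshot_v2`, making the copying detectable.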
A controversial feature called "undercover mode" instructs the AI to conceal its identity as a language model in specific contexts. The mode prevents the system from mentioning internal codenames such as Capybara, or even the phrase "Claude Code" itself, during development. The code indicates the setting can be forced on, but offers users no mechanism to force it off.
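As a rough sketch of the gating logic described above, the asymmetry matters: a server-side flag can activate the mode, but no user preference can deactivate it once forced. The term list and function names here are assumptions, not the leaked implementation.

```python
import re

# Terms named in the reporting; the redaction approach is assumed.
FORBIDDEN_TERMS = ["Capybara", "Claude Code"]

def apply_undercover(text: str, *, forced_on: bool, user_pref: bool) -> str:
    """Redact identity-revealing terms. Note the one-way gate:
    forced_on can enable the mode, but nothing can force it off."""
    active = forced_on or user_pref  # no code path disables a forced-on state
    if not active:
        return text
    for term in FORBIDDEN_TERMS:
        text = re.sub(re.escape(term), "[redacted]", text, flags=re.IGNORECASE)
    return text
```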
Security protocols for shell commands included 23 numbered checks designed to prevent injection attacks and command execution vulnerabilities. The analysis highlighted a frustration detection system that relies on regular expressions to identify user anger during interactions. This approach prioritizes speed and cost efficiency over more complex sentiment analysis models.
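A regex-based frustration detector of the kind described might look like the following sketch. The patterns are invented for illustration; the point is the trade-off the analysis highlighted: a handful of compiled regexes run in microseconds at zero marginal cost, where a sentiment model would add latency and inference spend.

```python
import re

# Hypothetical frustration heuristics -- cheap regexes in place of
# a sentiment model, trading recall for speed and cost.
FRUSTRATION_PATTERNS = [
    re.compile(r"\b(wtf|ugh|seriously)\b", re.IGNORECASE),
    re.compile(r"\b(not|doesn'?t|won'?t)\s+work(ing)?\b", re.IGNORECASE),
    re.compile(r"!{2,}"),          # repeated exclamation marks
    re.compile(r"\b[A-Z]{4,}\b"),  # sustained all-caps ("shouting")
]

def looks_frustrated(message: str) -> bool:
    """Return True if any heuristic pattern matches the user message."""
    return any(p.search(message) for p in FRUSTRATION_PATTERNS)
```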
Security Implications
The leak also revealed cryptographic attestation methods used to verify the authenticity of client applications. Anthropic reportedly replaced the standard HTTP stack with a native implementation that attaches a request hash, ensuring requests originate from the official binary rather than spoofed tools. This technical enforcement underpins recent legal threats sent to third-party developers like OpenCode.
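The attestation scheme is not detailed in the reporting, but a common pattern it resembles is an HMAC over the request body using a key embedded in the official binary; spoofed clients lacking the key cannot produce a valid header. The key, header format, and function names below are assumptions for illustration.

```python
import hashlib
import hmac

# Illustrative only: a real client would obfuscate or derive this key.
EMBEDDED_KEY = b"key-baked-into-official-binary"

def attestation_header(body: bytes) -> str:
    """Client side: sign the request body with the embedded key."""
    return hmac.new(EMBEDDED_KEY, body, hashlib.sha256).hexdigest()

def server_verify(body: bytes, header: str) -> bool:
    """Server side: recompute and compare in constant time."""
    expected = hmac.new(EMBEDDED_KEY, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, header)
```

Any proxy or third-party tool that modifies the body, or lacks the key, fails verification, which is why the enforcement point sits in the HTTP stack itself.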
Unreleased documentation for an autonomous agent mode named KAIROS appeared within the codebase. This feature includes nightly memory distillation and background daemon workers scheduled for five-minute refresh cycles. While heavily gated, the presence of this scaffolding suggests Anthropic is actively developing always-on agent capabilities.
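The scheduling described, five-minute worker refreshes plus a nightly distillation pass, can be sketched as two timers. The 3 a.m. window and function names are assumptions; only the five-minute cadence and the nightly distillation come from the reporting.

```python
from datetime import datetime, timedelta

REFRESH_INTERVAL = timedelta(minutes=5)  # daemon worker cadence per the leak
DISTILL_HOUR = 3                         # assumed nightly window

def next_refresh(now: datetime) -> datetime:
    """Background workers poll on a fixed five-minute cycle."""
    return now + REFRESH_INTERVAL

def next_distillation(now: datetime) -> datetime:
    """Memory distillation runs once nightly; roll to tomorrow if
    today's window has already passed."""
    run = now.replace(hour=DISTILL_HOUR, minute=0, second=0, microsecond=0)
    if run <= now:
        run += timedelta(days=1)
    return run
```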
The timing of the leak coincides with active legal disputes over API usage and authentication bypasses. Just 10 days prior, Anthropic issued legal threats requiring third-party tools to remove built-in authentication for Claude access. The exposed source code provides technical evidence supporting the company's claims regarding unauthorized API usage.
"The code comment references a server-side _parse_cc_header function that tolerates unknown extra fields," the analyst noted. This suggests validation might be more forgiving than expected for a system designed to prevent third-party access. Researchers found that setting environment variables could disable the attestation header entirely.
What Comes Next
Industry observers question whether the repeated leaks indicate internal negligence or potential insider activity. Security experts suggest the technical barriers are not impenetrable and that legal frameworks remain the primary defense against distillation. The incident highlights the growing tension between open source transparency and proprietary AI protection.
Future developments will likely involve stricter release pipelines and enhanced monitoring of build artifacts. The technology sector must balance the need for accessibility with the necessity of protecting intellectual property. This event serves as a case study in how AI tools manage their own distribution security.