Major news organizations are actively blocking the Internet Archive’s Wayback Machine to prevent their content from being used to train artificial intelligence models.
An analysis by Originality.ai reveals that 23 major news sites now block 'ia_archiverbot,' the Wayback Machine's web crawler. The crackdown includes high-profile sites such as The New York Times and Reddit.
USA Today Co. has implemented blocks across its network of more than 200 media outlets. The Guardian is also restricting access, allowing crawling but filtering archived content from public view, which creates digital dead ends for researchers.
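Blocks like these are typically implemented through the Robots Exclusion Protocol, a plain-text robots.txt file at a site's root that names crawlers and the paths they may not fetch. A minimal sketch is below; the user-agent token follows the name cited in the analysis, though the exact token each publisher targets is an assumption (the Internet Archive's crawler has historically identified itself as "ia_archiver").

```text
# Hypothetical robots.txt entry asking the Wayback Machine's crawler
# to skip the entire site. The token matches the name cited in the
# Originality.ai analysis; individual publishers may target others.
User-agent: ia_archiverbot
Disallow: /
```

A `Disallow: /` rule covers every path on the site, but compliance with robots.txt is voluntary on the crawler's part, which is one reason some publishers layer additional restrictions on top of it.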
Publishers cite copyright and competition concerns
Publishers claim these measures are necessary to stop AI companies from scraping archives to build competing products. The New York Times stated that archived content is being used "to directly compete with us," though the company did not provide specific evidence of copyright violations.
USA Today Co. describes its actions as routine bot prevention. However, the move strips journalists and researchers of a primary tool for verifying historical accuracy and tracking editorial changes.
For three decades, the Wayback Machine has preserved over a trillion web pages. The current wave of blocking threatens the long-term accessibility of the public web as publishers prioritize protecting intellectual property from large language model developers.