Cloudflare accuses Perplexity of using stealth crawling techniques to evade network blocks

In a blog post, Cloudflare alleged that Perplexity initially uses a declared user agent for its bots, and later switches to undeclared, generic browser signatures and rotates IP addresses when it finds network blocks or robots.txt disallow directi...

By ETtech | Aug 05, 2025, 10.55 AM IST

Cybersecurity company Cloudflare has accused artificial intelligence (AI) company Perplexity, led by IIT Madras alumnus Aravind Srinivas, of using stealth techniques to bypass website restrictions and continue crawling content despite directives against such activity.

In a blog post, Cloudflare alleged that Perplexity initially uses a declared user agent for its bots, and later switches to undeclared, generic browser signatures and rotates IP addresses when it finds network blocks or robots.txt disallow directives. The company added that this indicates the AI search company is evading detection and blocks put in place by website owners.

A robots.txt file is a text file that provides instructions to web robots, such as search engine crawlers, about which parts of a website they are allowed to access.
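As a rough illustration of how a well-behaved crawler consults these instructions, the sketch below uses Python's standard `urllib.robotparser` module. The bot name `ExampleBot`, the rules and the URLs are hypothetical and are not taken from Cloudflare's or Perplexity's posts.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: bars "ExampleBot" from the whole site and
# bars every other bot from /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/article"))
    print(is_allowed(ROBOTS_TXT, "OtherBot", "https://example.com/article"))
    print(is_allowed(ROBOTS_TXT, "OtherBot", "https://example.com/private/x"))
```

The file is advisory: the check runs on the crawler's side, so it only has an effect when the crawler chooses to honour the result.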

Perplexity responded by saying that Cloudflare’s leadership is "either dangerously misinformed on the basics of AI, or simply more flair than cloud". In a post, the company said its AI agents fetch web pages on demand to answer a user's question, which it distinguishes from traditional web crawling.

"When you ask Perplexity a question that requires current information—say, "What are the latest reviews for that new restaurant?"—the AI doesn't already have that information sitting in a database somewhere. Instead, it goes to the relevant websites, reads the content, and brings back a summary tailored to your specific question. This is fundamentally different from traditional web crawling," the post read.


Cloudflare’s findings came after complaints from customers who noticed Perplexity accessing content that was explicitly off-limits to bots.

"These customers told us that Perplexity was still able to access their content even when they saw its bots successfully blocked. We confirmed that Perplexity’s crawlers were in fact being blocked on the specific pages in question, and then performed several targeted tests to confirm what exact behavior we could observe," the post read.

Cloudflare ran tests on domains configured with strict robots.txt files and firewall rules barring both of Perplexity’s known crawlers. It said Perplexity was still able to retrieve detailed information from these protected domains, which it cited as evidence of stealth crawling techniques and undisclosed IP addresses.

Cloudflare noted in its post that Perplexity’s crawlers attempted to impersonate popular web browsers such as Google Chrome and accessed sites from multiple autonomous systems not associated with Perplexity’s official infrastructure.
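The two signals described here, the user agent string and the source network, can be sketched as a simple check. This is a minimal illustration under stated assumptions, not Cloudflare's detection method: the token `ExampleBot` and the address range `192.0.2.0/24` (a block reserved for documentation) are hypothetical stand-ins for a bot operator's declared name and published IP list.

```python
from ipaddress import ip_address, ip_network

# Hypothetical values for illustration only.
DECLARED_TOKEN = "ExampleBot"
PUBLISHED_RANGES = [ip_network("192.0.2.0/24")]


def classify_request(user_agent: str, source_ip: str) -> str:
    """Label a request using only its user agent and source address."""
    declares_bot = DECLARED_TOKEN.lower() in user_agent.lower()
    addr = ip_address(source_ip)
    from_published = any(addr in net for net in PUBLISHED_RANGES)

    if declares_bot and from_published:
        return "declared"    # announces itself and comes from a listed range
    if declares_bot:
        return "unverified"  # uses the bot's name from an unlisted address
    return "undeclared"      # ordinary browser signature; bot or human unknown


if __name__ == "__main__":
    chrome = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36")
    print(classify_request("Mozilla/5.0 (compatible; ExampleBot/1.0)", "192.0.2.10"))
    print(classify_request(chrome, "198.51.100.7"))
```

A request that presents a generic browser signature from an unlisted network falls into the last bucket, where these two signals alone cannot distinguish it from a human visitor. That is why rules keyed only to a declared user agent and published IP ranges do not stop a crawler that changes both.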

Cloudflare said it did not observe this behaviour in bots from other operators such as OpenAI, which stop crawling when disallowed and are transparent about their activities. As a result, Cloudflare has de-listed Perplexity as a verified bot and updated its rules to block such stealth activity.

To this, Perplexity said Cloudflare is mischaracterising the use of AI assistants as "malicious bots". "When Perplexity fetches a webpage, it's because you asked a specific question requiring current information. The content isn't stored for training—it's used immediately to answer your question," it said.

The company criticised Cloudflare for arguing that any automated tool serving users should be suspect, a position it said would criminalise email clients and web browsers, "or any other service a would-be gatekeeper decided they don’t like".
