Cloudflare accuses Perplexity of using stealth crawling techniques to evade network blocks

In a blog post, Cloudflare alleged that Perplexity initially uses a declared user agent for its bots, then switches to undeclared, generic browser signatures and rotates IP addresses when it encounters network blocks or robots.txt disallow directives. Cloudflare added that this behaviour indicates the AI search company is evading detection and circumventing blocks put in place by website owners.
A robots.txt file is a text file that provides instructions to web robots, such as search engine crawlers, about which parts of a website they are allowed to access.
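As a rough illustration of how such a file is read, the sketch below uses Python's standard `urllib.robotparser` module against a made-up robots.txt. The file contents, the `example.com` URLs and the `is_allowed` helper are assumptions for illustration, and the `PerplexityBot` token is used only as an example of a declared crawler name; none of this is taken from Cloudflare's tests.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one declared crawler everywhere,
# and keeps every other client out of /private/ only.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"
    # The declared crawler name matches the first rule group.
    print(is_allowed("PerplexityBot", page))
    # A generic browser-style user agent only matches the wildcard group.
    print(is_allowed("Mozilla/5.0", page))
```

The rules are matched on the name a client declares, so a client that presents a generic browser user agent is not caught by a rule written for a named bot. Compliance with robots.txt is also voluntary: the file states the site owner's wishes but does not enforce them.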
Perplexity responded by saying that Cloudflare’s leadership is "either dangerously misinformed on the basics of AI, or simply more flair than cloud". The company said in a post that its AI agents access websites differently from traditional crawlers.
"When you ask Perplexity a question that requires current information—say, 'What are the latest reviews for that new restaurant?'—the AI doesn't already have that information sitting in a database somewhere. Instead, it goes to the relevant websites, reads the content, and brings back a summary tailored to your specific question. This is fundamentally different from traditional web crawling," the post read.
"These customers told us that Perplexity was still able to access their content even when they saw its bots successfully blocked. We confirmed that Perplexity’s crawlers were in fact being blocked on the specific pages in question, and then performed several targeted tests to confirm what exact behavior we could observe," Cloudflare's post read, referring to customers who had blocked Perplexity’s crawlers.
Cloudflare said it tested domain setups with strict robots.txt files and firewall rules barring both of Perplexity’s known crawlers. Perplexity was still able to retrieve detailed information from these protected domains, which Cloudflare said pointed to the use of stealth crawling techniques and undisclosed IP addresses.
Cloudflare noted in its post that Perplexity’s crawlers attempted to impersonate popular web browsers such as Google Chrome and accessed sites from multiple autonomous systems not associated with Perplexity’s official infrastructure.
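The two signals described here, the declared user agent and the source network, can be sketched as a simple check. This is a minimal illustration and not Cloudflare's detection method: the token list, the `PUBLISHED_RANGES` placeholder (a reserved documentation range) and the `classify` helper are all assumptions.

```python
import ipaddress

# Assumed crawler name tokens, for illustration only.
DECLARED_TOKENS = ("PerplexityBot", "Perplexity-User")

# Placeholder for an operator's published address ranges.
# 192.0.2.0/24 is a reserved documentation range, not a real one.
PUBLISHED_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def classify(user_agent: str, source_ip: str) -> str:
    """Label a request using only its user agent and source address."""
    ua = user_agent.lower()
    declares = any(token.lower() in ua for token in DECLARED_TOKENS)
    address = ipaddress.ip_address(source_ip)
    in_range = any(address in network for network in PUBLISHED_RANGES)

    if declares and in_range:
        return "declared crawler"
    if declares:
        return "declared user agent from unpublished address"
    if in_range:
        return "published address without declared user agent"
    return "undeclared"


if __name__ == "__main__":
    print(classify("Mozilla/5.0 (compatible; PerplexityBot/1.0)", "192.0.2.10"))
    print(classify("Mozilla/5.0 (Windows NT 10.0) Chrome/124.0", "198.51.100.7"))
```

A request that presents a browser-style user agent from an unrelated network falls into the "undeclared" bucket alongside ordinary visitors, which is why name-based and address-based blocks alone cannot stop a client that changes both.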
In response, Perplexity said Cloudflare is mischaracterising AI assistants as "malicious bots". "When Perplexity fetches a webpage, it's because you asked a specific question requiring current information. The content isn't stored for training—it's used immediately to answer your question," it said.