[Openid-aiim] what makes us think that the AI can be trusted to tell us the truth?
Tom Jones
thomasclinganjones at gmail.com
Tue Aug 5 20:10:52 UTC 2025
Cloudflare Exposes Perplexity for Impersonating Google to Scrape Data
*Cloudflare vs. Perplexity: A Data Ethics Clash*
This latest revelation raises a red flag in the ongoing debate about how AI
companies source their data. According to a LinkedIn post shared by a
trusted former U.S. government technologist, Cloudflare discovered that the
AI startup Perplexity was disguising its bot traffic as ordinary Google
Chrome browser traffic to circumvent site restrictions, scraping data in
ways that many see as deceptive.
Key Allegations
- *Browser Spoofing*: Perplexity’s systems allegedly mimicked Chrome’s
  browser identity to bypass blocks on web crawlers.
- *Caught in a “Data Trap”*: Cloudflare set up infrastructure specifically
  to detect unauthorized scraping, which flagged Perplexity’s activity.
- *Comparisons to Bad Actors*: Cloudflare's CEO likened the behavior to
  tactics used by North Korean hackers, strong language that signals deep
  concern.
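To make the browser-spoofing point concrete, here is a minimal sketch of why
a crawler block keyed on the self-reported User-Agent is easy to sidestep. It
uses Python's standard urllib.robotparser. The crawler name "ExampleAIBot",
the robots.txt contents, and the URL are hypothetical, chosen only for
illustration; this is not Cloudflare's or Perplexity's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one declared AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

# A typical Chrome-style User-Agent string (illustrative only).
CHROME_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if robots.txt permits this self-reported agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    # The crawler is blocked only while it identifies itself honestly.
    print("declared bot allowed:", is_allowed("ExampleAIBot/1.0", url))
    # The same client claiming to be Chrome falls under the wildcard rule.
    print("spoofed Chrome allowed:", is_allowed(CHROME_UA, url))
```

The rule only binds a client that announces itself truthfully, so any
enforcement has to rely on signals other than the name the client reports.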
Broader Implications
- *Tensions Rise*: The incident spotlights friction between AI startups
  hungry for training data and content providers aiming to protect their
  intellectual property.
- *Ethical Reckoning*: As AI tools increasingly rely on scraped web
  content, many creators are demanding both transparency and compensation.
- *Technical Countermeasures*: Cloudflare is now offering enhanced tools
  for sites to block unwanted AI crawlers, an attempt to rebalance digital
  power dynamics.
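One generic way a site can notice scraping that ignores its rules is a trap
path: a URL that robots.txt disallows and that no page links to, so any
request for it is suspect regardless of the User-Agent presented. The sketch
below assumes that approach; the path, the Request record, and the sample
addresses are hypothetical, and it is not a description of how Cloudflare's
tools work internally.

```python
from dataclasses import dataclass

# Hypothetical trap path: disallowed in robots.txt and never linked anywhere.
TRAP_PATHS = frozenset({"/trap/do-not-crawl.html"})


@dataclass(frozen=True)
class Request:
    """A simplified access-log record (illustrative fields only)."""
    client_ip: str
    user_agent: str
    path: str


def flag_trap_hits(requests, trap_paths=TRAP_PATHS):
    """Map each client IP that requested a trap path to the agents it claimed."""
    flagged = {}
    for req in requests:
        if req.path in trap_paths:
            flagged.setdefault(req.client_ip, set()).add(req.user_agent)
    return flagged


if __name__ == "__main__":
    log = [
        Request("198.51.100.4", "Mozilla/5.0 Chrome/124.0", "/index.html"),
        Request("203.0.113.7", "Mozilla/5.0 Chrome/124.0", "/trap/do-not-crawl.html"),
    ]
    # Only the client that fetched the trap path is reported.
    print(flag_trap_hits(log))
```

Because the check keys on behavior (what was fetched) instead of the
self-reported identity, a Chrome-looking User-Agent does not hide the hit.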
This isn't just a story about one company stepping over the line—it’s a
signal flare in the battle for ethical AI development. Where do we draw the
boundary between open data and exploitation?
Peace ..tom jones