Reddit on Wednesday filed a lawsuit against Perplexity AI and three of its alleged data dealers for trafficking in unlawfully scraped information.

The complaint, filed in the Southern District of New York, claims that Oxylabs UAB, AWM Proxy, and SerpApi unlawfully bypassed Reddit’s and Google’s defenses to harvest Reddit content and related search results. It also says that Perplexity chose to purchase the purloined data rather than license it from Reddit.

Ben Lee, chief legal officer at Reddit, told The Register in an emailed statement that AI companies are desperate for quality content generated by real people and that this need is fueling an industrial-scale data laundering economy.

“Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material,” said Lee. “Reddit is a prime target because it’s one of the largest and most dynamic collections of human conversation ever created.”

Lee claimed that Oxylabs UAB, a data scraping business based in Lithuania, AWM Proxy, a former Russian botnet, and SerpApi, which advertises real-time access to scraped Google search results, represent textbook examples of this sort of illegal behavior.

“Unable to scrape Reddit directly, they mask their identities, hide their locations, and disguise their web scrapers to steal Reddit content from Google Search,” said Lee. “Perplexity is a willing customer of at least one of these scrapers, choosing to buy stolen data rather than enter into a lawful agreement with Reddit itself.”

Reddit’s complaint likens these three providers to “would-be bank robbers, who, knowing they cannot get into the bank vault, break into the armored truck carrying the cash instead.” Echoing Cloudflare CEO Matthew Prince’s characterization of Perplexity, the Reddit legal filing describes Perplexity as “more akin to a ‘North Korean hacker’” who will do whatever is necessary to obtain the data to fuel its AI answer engine, other than pay for a license.

Google is not participating in the lawsuit but has tried to prevent automated scraping of its search results.

The social media company contends that the defendants have violated the US Digital Millennium Copyright Act by bypassing its technological defenses against automated access to its servers. It also accuses SerpApi and Oxylabs specifically of violating the DMCA’s prohibition on trafficking in circumvention technology, products, or services. Other claims include unfair competition, unjust enrichment, and civil conspiracy.

Reddit is seeking damages and an injunction to halt the unwanted scraping of its content.

In June, Reddit filed a similar complaint against Anthropic after it failed to convince the AI business to enter into a content licensing deal, as OpenAI has done.

Oxylabs, which advertises itself as “the largest ethical proxy network and advanced scraping solutions em
