What the Reddit vs Perplexity lawsuit could mean for your content

Johannesburg, 13 Nov 2025

Reddit vs Perplexity. (Image: Domains)

When Reddit filed a lawsuit against Perplexity AI, it wasn’t just another tech rivalry dispute. It sparked a much bigger debate: who really owns the data that powers artificial intelligence, and where does “fair use” end?

At its heart, this case is about control and value. For website owners, creators and hosting providers, the outcome could redefine how online content is accessed, re-used and protected in an increasingly AI-driven world.

What sparked the dispute

For decades, the internet has been a space where content could be freely shared, indexed and discovered. But that balance is shifting fast. Generative AI tools don’t simply index content anymore; they absorb and repurpose it, often without giving credit or compensation to the original creators.

According to Reddit, Perplexity AI crossed that line. The lawsuit claims Perplexity’s crawlers bypassed access restrictions by using Google’s search index as a back door to scrape data that wasn’t meant to be publicly available.

To prove it, Reddit reportedly created a hidden “test post” visible only to Google’s crawlers. When that post appeared in Perplexity’s summaries, Reddit said the proof was undeniable.

Perplexity, however, disagrees. It insists it doesn’t scrape or store Reddit data but merely summarises publicly available content – similar to how search engines show snippets. The company maintains it is being unfairly targeted for using information that is already publicly accessible.

The bigger picture

This isn’t the first time Perplexity has faced criticism. Reports have accused it of republishing exclusive articles from major news outlets, and even Cloudflare claimed the company’s crawlers ignored “no-crawl” rules by disguising their identities to access blocked sites.

These allegations highlight a growing tension between content owners who want to protect their intellectual property and AI companies eager to gather data to train their models. The issue also marks a shift in how the web functions.

Traditional search bots like Googlebot are transparent – they identify themselves, respect robots.txt instructions and direct users back to your site. AI crawlers, on the other hand, harvest massive volumes of data to “teach” their models how to generate text and answers directly – often without sending traffic to the original source.

While that’s convenient for users, it’s bad news for websites that depend on visits and ad revenue. Wikipedia, for instance, has already seen a decline in page views as AI tools provide answers directly in chat-style results instead of linking to the site.

Why it matters

The Reddit vs Perplexity lawsuit is testing the limits of copyright, data ownership and AI ethics. The court now faces difficult questions:

If content appears in Google search results, is it automatically fair game for AI use?
Does summarising content differ legally from training on it?
Once your website is indexed, how much control do you still have over your data?

If Reddit wins, the ruling could strengthen website owners’ rights and introduce stricter boundaries around how AI models collect and use online data. It might also lead to paid data licensing deals and tougher enforcement of anti-scraping rules.

If Perplexity wins, it could set a precedent that any publicly visible data can be freely used by AI systems, which could drastically change the rules of the web.

Protecting your website

For creators, small businesses and bloggers, this case is a wake-up call. Every article, review and product description you publish is valuable content. If it isn’t protected, it could be quietly absorbed into someone else’s AI model.

Start by reviewing your robots.txt file and access settings. Know which bots are allowed and which should be blocked. Look for hosting solutions with built-in security features such as firewalls and intrusion detection systems.
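
As a rough illustration of that review, the sketch below uses Python's standard urllib.robotparser module to report which crawlers a robots.txt file would admit to a given page. The rules, the list of user-agent names and the example.com URL are assumptions made for the example, not recommendations; check each crawler's own documentation for the name it actually uses. Keep in mind that robots.txt is only a request, so it works against bots that choose to honour it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only: allow one search crawler, block two AI crawlers,
# and keep every other bot out of a hypothetical /private/ area.
SAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed_bots(robots_txt, user_agents, url):
    """Return {user_agent: True/False} for whether each bot may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in user_agents}


# "SomeOtherBot" is a made-up name that falls through to the "*" rules.
report = allowed_bots(
    SAMPLE_ROBOTS_TXT,
    ["Googlebot", "GPTBot", "PerplexityBot", "SomeOtherBot"],
    "https://example.com/blog/post",
)

for agent, ok in report.items():
    print(f"{agent}: {'allowed' if ok else 'blocked'}")
```

To check a live site instead of the sample text, the same parser can load the file with set_url() and read(), pointed at your own domain's robots.txt.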

At Domains.co.za, advanced hosting plans include web application firewalls (WAFs) and traffic monitoring tools that detect and block suspicious activity, such as bulk data scraping or bandwidth spikes, while ensuring a smooth experience for genuine visitors.

What happens next

Reddit is seeking damages and a court order to stop Perplexity and its partners from using any scraped material. The case could take time, but its outcome will likely define how content ownership, access rights and AI data use are handled going forward.

Whatever the ruling, one thing is clear: the rules of the internet are changing, and protecting your content has never been more important.

Stay in control of your data. Host smarter and safer with Domains.co.za.
