
Google's copyright dance, or SOPA's revenge?

By Jon Tullett, Editor: News analysis
Johannesburg, 17 Aug 2012

Google's decision to use copyright claims to demote search results has been criticised as a dangerous step down a slippery slope towards Web censorship. From some angles, it looks like Google's already well on its way down that slope. At best, it is Google's effort to minimise the necessary evil required to appease the copyright giants.

Last year, the US government tried and failed to pass SOPA (the Stop Online Piracy Act), which would have been a disaster of the first order for free speech online. Google, like many other organisations, strongly protested SOPA, going so far as to self-censor its homepage (other popular sites, like Wikipedia and Reddit, did the same) in an "Internet blackout".

SOPA was criticised for handing censorship tools to copyright owners, greatly extending the existing mechanism, the Digital Millennium Copyright Act (DMCA).

Although widely criticised, the DMCA is the well-established US legislation governing most online copyright complaints. It describes the process by which a piece of offending content should be flagged and how the accused party can counter-claim, and it offers safe harbour provisions so that Web hosts aren't liable for third-party material - as long as they uphold takedown requests that prove correct.

That last part is the challenge. If a Web host doesn't uphold a valid request, it can land itself in a world of legal hurt, so most providers err on the side of caution, yanking material, then (sometimes) allowing the poster to appeal for reinstatement. As a compromise, that's not ideal, but it can work at small scales. At large scales, though, it simply falls apart.

Google receives a huge number of copyright takedown notices. In the blog post announcing the changes, Google's Amit Singhal said: "We're now receiving and processing more copyright removal notices every day than we did in all of 2009 - more than 4.3 million URLs in the last 30 days alone."

When these complaints are upheld, Google removes the offending material from its indexes. The volume of complaints (often a sign of a legal process being abused by strong incumbents - the patent system is a topic for another day) is so large that mistakes are inevitable, and pages and sites are incorrectly punished.

Knowing this, an abuser can rely on the flaws in the system to ensure the outcome he wants. There have been numerous complaints of DMCA takedown notices being used to delist articles unfairly - in one recent example, Universal Music used a DMCA notice to nix a negative review of an album. That could be an error, of course, but enough mistakes start to look a lot like a pattern.

Google, to its credit, puts a notice in search results where censorship has occurred, and links to the takedown notices hosted at www.chillingeffects.org, as well as aggregate data, including details of the most active complainants.

On my honour

The DMCA notes that copyright claims must be made in good faith under penalty of perjury. Have you ever heard of an organisation being held to account for filing a fake DMCA claim? Me neither, but perjury is a crime no less than copyright infringement. Many of those millions of copyright notices are filed automatically, by firms running agent software looking for possible infringements and firing off demands without confirmation that possible matches are correct. When it comes to identifying possible infringements in photographs or videos, less-than-stringent filters are applied to make matches: the margin for error has no penalty for the rights claimant, only a downside for the DMCA target.

There is no reason for a copyright organisation not to abuse the system to the maximum extent possible: the worst-case scenario is that it doesn't get what it wants, in which case it re-files another time without penalty (somewhat similar to the process of pushing unpopular laws like SOPA through Congress, in fact).

But Google's copyright travails run deeper than DMCA notices. One particular Google subsidiary is arguably more prominent in the copyright wrangle than any other: YouTube. And YouTube is not setting a good example.

Content ID vs DMCA

YouTube goes beyond the basic requirements of the DMCA and provides a tool called Content ID. The system allows rights owners to proactively upload material into a database, against which public videos are checked. Content ID then attempts to automatically detect infringing material, and also provides a facility for copyright holders to claim ownership of a video and either remove it entirely, or claim the revenue from ads shown alongside the video.

In theory, Content ID is a good idea, but in practice it has a poor reputation, and is rapidly becoming a byword for abuse if not outright fraud. While the DMCA outlines a relatively simple process of claim and counter-claim, Content ID appears to act unilaterally, frequently making mistakes, and operating regardless of protests from legitimate rights owners seeing their work removed or revenue stolen.

Although users are supposed to be able to assert a defence against a Content ID flag, the complainant can simply reassert ownership until the video is taken down. Because Content ID is not a DMCA mechanism, no DMCA recourse is provided. Repeated complaints can result in the complete suspension of a YouTube account, with an even more opaque appeals process.

Recently, for example, a news organisation filed a Content ID demand to remove NASA's video of the Curiosity Mars rover landing. NASA places its videos in the public domain, so not only did Scripps (the news agency in question) not own the material, they could not.

And this wasn't the first time - not for NASA, and not for Scripps. NASA's Bob Jacobs told the Motherboard blog (which broke the NASA story): "We spend too much time going through the administrative process to clear videos slapped with needless copyright claims. YouTube seems to be missing a 'common sense' button to its processes, especially when it involves public domain material paid for by the American taxpayer." NASA is a frequent target for this: AP was previously accused of similarly asserting ownership over NASA footage.

Earlier this year, music licensing firm Rumblefish asserted ownership over a nature video, based on the background sounds of birds singing. That incident started with an incorrect flag from Content ID - how it matched birdsong to a music track has never been established - which the video poster disputed. A Rumblefish rep then reasserted ownership, and YouTube diverted the revenue from ads on the video to Rumblefish. The incident went viral and, caught in a PR meltdown, the Rumblefish CEO apologised and withdrew the claim. It was not the first time Rumblefish made headlines for Content ID mistakes: it had previously asserted ownership over public-domain backing music in videos.

Those instances may have been mistaken identity, but at the extreme end, the process is actively used for fraud. A now-defunct Russian company, Netcom Partners, made a practice of methodically claiming copyright over YouTube videos and diverting revenue from the legitimate owners. The Content ID process made that widespread 'privateering' all too easy.

Some of that widespread video hijacking appears to be algorithmic. Gamer.nl, a games review site, uses snippets of game trailers or game-play in its videos. These have resulted in unrelated videos being reassigned to Gamer.nl because of similar video segments. Gamer.nl has disclaimed any involvement, noting the process is automated. Content ID is nothing if not aggressive in matching clips, and critics complain that a claim should be acknowledged and asserted by an alleged rights owner before ownership is summarily handed over.

As with the DMCA, there appears to be no penalty for claiming rights to something you don't really own. Google has consistently declined to comment on whether YouTube has ever taken action against an organisation caught filing bogus copyright claims.

Although Content ID is conceptually clever, its flawed implementation and easy abuse have made it a poster child for bad copyright behaviour. And, with its high profile, ambitious entertainment strategy, and relationships with influential rights holders, YouTube should represent Google's best efforts at resolving copyright issues fairly and in line with the law.

YouTube, in short, knows all about copyright claims. If Google will be using copyright claims to adjust search rankings, it will either have to do a lot better than YouTube, or risk polluting its search results.

How valid is valid?

The specifics of the announcement bear further scrutiny. Google says it will weigh search results based on "valid copyright claims". Unfortunately, we know that "valid" claims are frequently questionable. And Google has said it will exclude some popular sites, including YouTube, from penalty. That is a bitter pill for other services to swallow: the argument is that Google, knowing that its copyright processes are in a hopeless mess, is using that mess as an excuse to penalise competitors.

In YouTube's early days, the site gained traction due to illegal material - television episodes, music videos, and the like - over which it exerted no oversight. The DMCA's safe harbour provisions, already in force by then, encouraged that hands-off approach. With YouTube's rise to prominence, and its acquisition by Google, media owners have put legal pressure on the site to remove copyrighted material. Google has had to act firmly to keep the big players on side, but it risks doing so at the expense of the user community which drove it to succeed in the first place.

There's also a business risk here, for Google and for its customers. This is making YouTube a riskier place to host videos for all but the dominant media conglomerates which are trying to control the content. There are other video hosting services that are fully DMCA-compliant but lack the heavy-handed reputation of Content ID.

If YouTube is Google's best effort to manage copyright complaints, that best effort doesn't appear to be good enough. If Google thinks it has enough of a handle on copyright claims to use them to adjust its search rankings, the end result is unlikely to be positive for anyone other than the largest rights holders - the ones who already gain from systemic flaws in Google's copyright handling processes. Nor is this likely to be the last move in the game: no concession will ever suffice for the most aggressive copyright organisations. The Electronic Frontier Foundation has also noted the possibility of these provisions being used by governments to suppress free speech, without appeal or recourse.

This is not a risky first step giving ground to the demands of 'big copyright'; it's just the next step in a long, crooked walk. Under protest or not, Google may have just given us SOPA.
