Blame the robots for copyright notice dysfunction

It turns out that robots aren't very good at identifying pirated porn, TV shows, music or anything else, and that could be a serious issue for Internet companies.

Nearly 30 percent of requests to remove copyrighted material are of questionable validity, according to a new set of studies by researchers at Berkeley Law and Columbia University. In millions of cases, the targeted content didn't even match the copyrighted work it supposedly infringed; the works in question were most often music or adult entertainment.

Part of the problem is the automated systems that some of the biggest players use to catch misused material. Some of those "bots" aren't very discerning.

There's a big difference between holding the rights to a song by Usher and the movie "The House of Usher," or the HBO series "Girls" and Fox's "New Girl," or a "Lost" episode and "Extreme Makeover: Home Edition."
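The studies don't describe the internals of these enforcement bots, but a toy sketch shows how title-based matching can go wrong. The example below is purely hypothetical (the catalog and function name are invented for illustration, not any vendor's actual system): it flags every catalog entry whose title merely contains the name of the protected work.

```python
# Hypothetical sketch of naive title matching, not any rightsholder's real system.
# A case-insensitive substring test flags anything containing the work's name.

def naive_title_match(protected_work: str, catalog: list[str]) -> list[str]:
    """Return every catalog title that contains the protected work's name."""
    needle = protected_work.lower()
    return [title for title in catalog if needle in title.lower()]

catalog = [
    "Usher - Confessions (full album upload)",   # plausibly infringing
    "The House of Usher (1960 Poe adaptation)",  # unrelated film
    "Lecture notes on Usher syndrome",           # unrelated entirely
]

# All three titles get flagged, though only the first involves the artist.
print(naive_title_match("Usher", catalog))
```

Real enforcement systems presumably layer fingerprinting and metadata checks on top, but matching on surface similarity rather than on the work itself is the kind of failure that produces the mismatched notices the researchers found.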

Yet, according to the researchers' detailed look at a database of more than 100 million requests, incorrect takedowns like those are frequently made and sometimes enforced. That's a big problem for the Digital Millennium Copyright Act procedures that are often considered the legal backbone of the Internet.

Passed by Congress in 1998, the DMCA gave copyright owners and then-nascent online service providers a way to resolve disputes without filing expensive lawsuits every time.

But since then, the Internet has expanded, and large companies have turned to automated systems to issue millions of takedown notices. Google (which was founded in 1998) received fewer than 2,000 takedown requests for search results between 2001 and 2008. Today, Google Web Search reports 17 to 21 million requests every week. Others, like Twitter, have also seen massive growth.

Struggling startups

The cost of processing all those requests may be bearable for larger companies, but for smaller startups operating in hotly contested markets, attention from a rights enforcement group could be a death sentence. Many of those smaller service providers are still processing notices by hand.

"Startups may not get to scale up because the resources they need to put into enforcement are too much for them," said Jennifer Urban, one of the paper's three authors. "They're always under the fear of the Sword of Damocles of someone flooding them with notices they can't afford."

While a good share of the notices filed with Google Web Search target torrent and file-sharing sites, that's not necessarily the case for all online service providers. Another study by the same authors found that Google Image Search notices tend to be filed by individuals and to target smaller outlets.

Those individual filers can also contribute to the questionable notices in the system. In fact, the researchers found that more than half of the Google Image Search notices they examined were submitted by a single, motivated European woman who seemed confused about what exactly DMCA notices are for.

"That's understandable," Urban said. "Copyright law is confusing and esoteric, and if you're a layperson you might mix it up with other issues like defamation or privacy issues."

Costs and solutions

The total cost of the dysfunction within the notice system is unknown; it's hard to calculate how many millions are spent on notices and how many smaller companies have been smothered. While some large companies voluntarily share details of their DMCA activity (like the database used by the researchers), most of the details remain in the dark.

"There is a lack of transparency in the system, which is based on private notices that go to private parties and are dealt with outside of the public dispute system," Urban said. "It's a black box, and that by itself will always raise questions about what the public effect of the system is."

Online service providers are balancing the real risk of billions of dollars in copyright liability against the possibility that a user may complain about having content removed (a counter-notice system exists, but it's rarely used). Filers, meanwhile, have little reason not to send notices even for material that could be fair use.

The solution, according to the paper's recommendations, is not to pile filtering requirements onto service providers, but to balance out the incentives that encourage excessive filing. Improving algorithms and bot parameters could also help clean up the system.

"It's a powerful remedy for a very cheap price — all you have to do is file a notice," said Urban. "You don't want to increase the cost so much that it's no longer useful."