Four of the nation’s major book publishers have sued the Internet Archive, the online library best known for maintaining the Internet Wayback Machine. The Internet Archive makes digitized copies of books, both public domain and copyrighted, available to the public on a site called the Open Library.
“Despite the Open Library moniker, IA’s actions go far beyond legitimate library services, violate copyright law, and constitute deliberate industrial-scale digital piracy,” write publishers Hachette, HarperCollins, Wiley, and Penguin Random House in their complaint. The lawsuit was filed in federal court in New York on Monday.
For nearly a decade, the Open Library has offered users the ability to “borrow” scans of copyrighted books via the Internet. Until recently, the service was based on a concept called “controlled digital lending,” which mimicked the constraints of a conventional library. The library would only “lend out” as many digital copies of a book as it had physical copies in its warehouse. If all copies of a book were “checked out” by other patrons, you had to join a waiting list.
In March, as the coronavirus pandemic escalated, the Internet Archive announced that it was forgoing this waitlist system. Under a program called the National Emergency Library, IA began allowing an unlimited number of people to view the same book at the same time, even if IA only owned one physical copy.
Prior to this change, publishers largely turned a blind eye as IA and a few other libraries experimented with the concept of digital lending. Some publishers’ groups condemned the practice, but no one sued over it. Publishers may have feared setting an adverse precedent if the courts ruled that controlled digital lending (CDL) was legal.
But IA’s emergency lending program was harder for publishers to ignore. So this week, as a number of states lifted quarantine restrictions, publishers sued the Internet Archive.
In an email to Ars Technica, IA founder Brewster Kahle called the lawsuit “disappointing.”
“As a library, the Internet Archive acquires books and lends them out, as libraries always have,” he wrote. “Publishers are suing libraries for loaning out books, in this case protected digitized versions, and while schools and libraries are closed, it’s in no one’s interest.”
“The publishers have a pretty strong case”
The publishers’ legal argument is simple: The Internet Archive makes and distributes copies of books without permission from the copyright holders. This is generally illegal unless a defendant can show that the copying is permitted by one of the various exceptions in copyright law.
Legal experts tell Ars that the Internet Archive’s best response is to argue that its program is fair use. Fair use is a flexible legal doctrine that has been used to justify a wide range of copying over the decades, from taping TV shows for personal use to quoting a few sentences from a book in a review. More relevant to our purposes, the courts have ruled it fair use to digitize books for limited purposes such as creating a book search engine.
When evaluating a fair use claim, courts consider several factors, including the impact of the use on the market for the original work. A book search engine, for example, does not replace reading books, but rather helps readers find new books they might want to buy. This is one of the reasons why the courts have found that scanning books for a search engine is legal under fair use.
But it’s harder to find convincing arguments that the Internet Archive’s open-ended lending program is fair use.
James Grimmelmann, a copyright scholar at Cornell University, told Ars he is withholding judgment until he sees the response from the Internet Archive. However, he said, “it looks like the publishers have a pretty strong case.”
“I think there are arguments for fair use, but they’re not very strong arguments,” he said in a phone interview on Monday.
A pandemic exception?
The Internet Archive would have had a stronger case if it had continued to limit the number of copies that could be loaned out. In this scenario, IA could argue that the program’s impact on the market was little different from that of a conventional library.
Clearly, a customer who borrows a book from a library is less likely to buy a copy, which undermines the market for the book. On the other hand, libraries themselves buy a lot of books – and the more popular a book, the more copies libraries have to buy. The overall impact of libraries on the demand for books is therefore unclear.
But once IA stopped buying a copy of a book for every copy loaned out, that argument became much weaker. An institution like IA can buy a single copy of a book and then “lend” it to tens, hundreds, or thousands of people at the same time. There is little doubt that this has a negative impact on the market for new books.
Instead, the Internet Archive will likely have to make a more novel argument: that the unique circumstances of a pandemic justify allowing the kinds of copying that would clearly be illegal at other times. Grimmelmann was unable to identify any other instances where courts have made this kind of leap.
I also spoke to John Bergmayer, a copyright expert with the copyright reform group Public Knowledge. He said there was a “pretty strong fair use case” for both the Internet Archive’s previous controlled digital lending program and its new no-waitlist approach. Bergmayer pointed to the fact that millions of books are currently locked away in closed libraries due to the pandemic. This, he said, creates a unique situation that could justify digital lending activities that would otherwise be illegal.
But like Grimmelmann, Bergmayer could not cite specific court decisions that support IA’s aggressive interpretation of copyright law.
The stakes are high
While Grimmelmann was fairly optimistic about the publishers’ legal prospects, he disagreed with one aspect of the industry’s argument. The Internet Archive is officially a nonprofit, but the publishers’ lawsuit describes the group as a commercial operation profiting from copyright infringement. The lawsuit points out that IA has earned millions of dollars through contracts to digitize books on behalf of partners such as other libraries.
But Grimmelmann told Ars that this fundamentally misunderstands the motivations of Brewster Kahle, the Internet Archive’s founder and still its driving force.
“Brewster Kahle is what Russians might call a saintly fool – someone who acts without much regard for himself or the things of the world in the service of a higher calling,” Grimmelmann said. The Internet Archive “is not a commercial enterprise,” he argued. Grimmelmann thinks Kahle, a 1990s dot-com entrepreneur who sank millions into the Internet Archive, is fundamentally an idealist.
But Kahle’s idealism – or foolishness – could cost him dearly. Copyright law allows damages of up to $150,000 per work for willful infringement. And Grimmelmann tells Ars that if the publishers win, they will have a strong argument that the infringement was willful.
The Internet Archive has digitized more than a million books that are still in copyright, so a loss could easily result in billions of dollars in damages, far beyond the nonprofit’s ability to pay. If the publishers win the case, they could therefore force the Internet Archive to shut down. It would be an incalculable loss given the group’s work in archiving other types of content, including the early days of the web.
However, publishers may not be interested in forcing the Internet Archive to shut down. Their goal is to get the Internet Archive to stop digitizing their books. If they win the lawsuit, they could force the group to shut down its book scanning operation and promise not to restart it, then allow it to continue its other, less controversial offerings.