Saturday, July 23, 2022

A Lawsuit Against the Internet Archive's Open Library Project

[The inspiration for this post came from listening to Boston Public Radio's Jim Braude and Margery Eagan interview tech writer and blogger Andy Ihnatko on July 22, 2022, "focusing on the publishing industry’s lawsuit against the Internet Archive." The interview can be heard here -- scroll down to the red and white "play" triangle labeled "LISTEN 18:59 Andy Ihnatko on BPR | July 22, 2022."

In a way, it is also a follow-up to this 2016 post on this blog, extolling the virtues of doing virtual research using the resources of the Internet Archive.]

In San Francisco there stands a bright, white, defunct Christian Science church. There are big white columns out front, with pink steps leading up to iron double doors. The church houses the gigantic internet project founded 20 years ago, the Internet Archive. 

The Internet Archive, San Francisco. Under CC License from Flickr user Beatrice Murch.


“The idea was to try to build the library of Alexandria, version two,” explains Brewster Kahle, founder of the Internet Archive. “We bought this building because it matched our logo,” Kahle says.

The library of Alexandria, version one, was in Egypt. It was one of the biggest and most important libraries of the ancient world. It’s said to have housed every book or scroll ever written, until it was destroyed in a fire in the first century BC. 

At the time of this post, the Internet Archive is facing a different sort of existential crisis. According to tech journalist Viola Stefanello,

In the early days of the pandemic, as physical libraries, schools, universities, and bookstores closed—and people were restricted from leaving their homes with very few exceptions—long waiting lists developed to access popular eBooks at public libraries.  

 To alleviate that problem, the Internet Archive launched a short-term project. Dubbed the National Emergency Library, it allowed anyone who signed up—for free—to their website to borrow digital copies of 1.4 million books in their possession without a waiting list. Most of these materials were 20th-century books that the Internet Archive had previously digitized to make up for the lack of commercially available eBook versions. 

The National Emergency Library was part of the Open Libraries initiative—a web-accessible public library containing the full texts of over 1.6 million public domain books as well as over 647,000 books not in the public domain.

In the Internet Archive’s announcement, published on March 24, 2020...Brewster Kahle said that allowing anyone to borrow these 1.4 million books without a waiting list in a time of crisis “was our dream for the original Internet coming to life: the Library at everyone’s fingertips”. 

Two years later, though, the Internet Archive’s dream is playing out as a legal nightmare.

Following public criticism from several writers, and accusations of “acting as a piracy site” by the Authors Guild, a group of major publishing houses sued the Internet Archive in summer 2020...

In this suit, the Internet Archive—represented by the Electronic Frontier Foundation (EFF)—argued that its Open Libraries initiative is basically equivalent to traditional library lending thanks to what is known as Controlled Digital Lending. According to this argument, the Internet Archive has been making digital copies of books that it physically owns, but only lending out the digital file to one user at a time, essentially replicating the experience of physical libraries only loaning a book to one person. 

At issue was the Internet Archive's decision to allow "students, academics, and everyone else to borrow up to five digitized books or eBooks for a two-week period" and "allowed people to access the same digital copy of a text at the same time." Publishers have long been upset that the Internet Archive digitizes the physical books in its collection and lends them out. According to an article by Aja Romano at Vox, the Internet Archive's "right to do so has been endorsed by many librarians and legal experts. But many critics of this approach, especially those within the publishing industry, have long argued that the IA’s Open Library is piracy because it distributes books as image files rather than appropriately licensing the works and compensating authors." 

The copyright infringement lawsuit was first filed on June 1, 2020, in the Southern District of New York, and is being coordinated by the Association of American Publishers. The AAP has compared the IA's scanning and lending efforts to those of the world's largest pirate sites. The plaintiff publishers are seeking damages for infringement as well as to shut down the IA’s scanning and lending program and to have any infringing scans destroyed.

According to the EFF’s Legal Director Corynne McSherry, the stakes far surpass the Internet Archive. “The publishers are not seeking protection from harm to their existing rights. They are seeking a new right foreign to American copyright law: the right to control how libraries may lend the books they own,” she stated.

The Internet Archive maintains that its work does not actually harm writers or publishers. But because book publishers often license e-books commercially (including to libraries), the Internet Archive could be seen as harming that part of the publishers’ market, according to Joanne Gray, Lecturer in Digital Cultures at the University of Sydney, and Cheryl Foong, Senior Lecturer in Law at Curtin University. They also argue that

The flexibility of fair use is one thing the Internet Archive has on its side... There is room for the court to assess the public benefit of the Internet Archive’s lending practices which, as the National Emergency Library exemplifies, are undeniably strong. Assessing whether the public interest arguments are strong enough to overcome the weight of the market harm may be key to deciding who wins this case.

According to Publishers Weekly reporter Andrew Albanese, the plaintiffs (Hachette, Penguin Random House, HarperCollins, and Wiley) are asking the Internet Archive to pay financial damages for 127 copyrighted titles present in the Open Libraries. According to one estimate, if the publishers win the maximum damages they could receive, the Internet Archive would owe $19 million in damages, which is about one year of the Archive’s operating revenue. The plaintiffs also seek a halt to the copying of books for loan in the Open Library Project.

In response, attorneys for the Internet Archive told the court that they are seeking monthly sales data for all books in print by the four plaintiffs dating back to 2011, data the publishers are loath to hand over.

The Internet Archive's recent motion seeking summary judgment, presented in early July, reads:

“In a copyright lawsuit against a practice that has continued for years, one would expect the copyright holder to be able to point to some metric showing that the defendant’s conduct has harmed them. Plaintiffs have failed to quantify any market harm from CDL. And there’s a good reason: because the lending, licensing, and sales data demonstrate that no such harm has occurred or is likely to occur.”

Both parties’ requests for the court to proceed with summary judgment have been granted.

“Beyond the monetary damages, the publishers are asking for the destruction of 1.4 million books, many of which do not exist in digital form anywhere else. That would be a real tragedy for people who depend on us for access to information,” Internet Archive founder Brewster Kahle told Vox in 2020. 

According to The Daily Dot's Viola Stefanello, 

"Considering that the print copies of these books are usually incredibly hard to access, let alone borrow, having a free, digitized copy at a click’s distance can speed up the research process significantly—and make knowledge more attainable for people who don’t work in academia."

We shall see how the judge rules in this case. A decision is expected by the end of 2022 or the beginning of 2023.

______________________________________________

Sources:

Albanese, Andrew, "Internet Archive Seeking 10 Years of Publisher Sales Data for Its Fair Use Defense." Publishers Weekly, August 9, 2021. https://www.publishersweekly.com/pw/by-topic/industry-news/libraries/article/87104-internet-archive-seeking-10-years-of-publisher-sales-data-for-its-fair-use-defense.html

Gray, Joanne and Cheryl Foong, "Publishers vs the Internet Archive: why the world’s biggest online library is in court over digital book lending." theconversation.com, July 20, 2022. https://theconversation.com/publishers-vs-the-internet-archive-why-the-worlds-biggest-online-library-is-in-court-over-digital-book-lending-187166

"In An Old Church, The Internet Archive Stores Our Digital History." KALW Public Media / 91.7 FM Bay Area; story originally aired in January of 2015. https://www.kalw.org/show/crosscurrents/2019-09-11/in-an-old-church-the-internet-archive-stores-our-digital-history

Romano, Aja "A lawsuit is threatening the Internet Archive — but it’s not as dire as you may have heard" Vox.com, June 23, 2020. https://www.vox.com/2020/6/23/21293875/internet-archive-website-lawsuit-open-library-wayback-machine-controversy-copyright

Stefanello, Viola, "Inside the lawsuit that could upend the Internet Archive as you know it." dailydot.com, July 13, 2022. https://www.dailydot.com/debug/internet-archive-lawsuit/

