When Sarah Silverman sued artificial-intelligence titans OpenAI and Meta Platforms on July 7, her copyright lawsuits seemed to present a relatively straightforward allegation: These companies didn’t secure Silverman’s and other authors’ permission before using their copyrighted works, including her 2010 autobiography The Bedwetter, which isn’t okay, per these suits. Silverman is joined by two other authors, novelists Christopher Golden and Richard Kadrey, in these suits; their civil complaints are seeking class-action status, which, if green-lit by the court, means that many, many more writers could take action against these companies.
Indeed, OpenAI’s ChatGPT and Meta’s artificial-intelligence projects rely on the mass trawling of books to learn language and generate text, the suits say. The suits contend that these AI projects didn’t secure Silverman’s and other authors’ permission for using their works before inhaling them, violating intellectual-property law. They also claim that these AI systems gained access to these books via illicit means, using libraries of pirated texts. As the suits’ co-attorney Matthew Butterick puts it to Vulture, “Creators’ work has been vacuumed up by these companies without consent, without credit, without compensation, and that’s not legal.”
Silverman claims that the text ChatGPT and Meta’s systems generate is the very receipt that proves they consumed the books. If they can spit out summaries of The Bedwetter and other copyrighted works, her suit contends, then these systems must have used pilfered books to do so. The proposed class action is asking for financial damages as well as “permanent injunctive relief” to stop these AI systems from gobbling down authors’ work, and then using it to create text, without permission or payment.
While these are copyright cases, they might present an opportunity to learn more about the shadowy world of AI as litigation unfolds, and tech-wary catastrophizers such as this author might wonder whether their outcome could actually impact AI’s operation. If a judge or jury decides that ChatGPT can’t consume copyrighted material with abandon, will that ruling potentially limit what AI can do? Put another way: Could a lawsuit over The Bedwetter thwart a Skynet-like situation? Vulture spoke with Butterick, as well as two experts on law and AI, to learn more about what this litigation can and can’t do. Neither OpenAI nor Meta immediately responded to requests for comment.
What exactly is AI doing with copyrighted books that’s so bad?
Butterick, who is co-leading the suits with attorney Joseph Saveri, said the core issue is that AI isn’t coming up with stuff that just coincidentally happens to sound like The Bedwetter or other books; it’s relying entirely on people’s creations. “These artificial-intelligence systems, it’s kind of an ironic name, because they are built entirely and exclusively on the work of human creators. All of these generative-AI systems rely on consuming massive quantities of human creative work, whether it’s text for these language models or whether it’s images for these AI image generators,” he says. “That’s how these neural networks function: They take in the training material, and what they do is they try to emulate it. When we talk about artificial intelligence, we have to understand where it’s really from: It’s human intelligence, [but] it’s just been divorced from the creators.” The lawsuits also claim that the books are gathered from sketchy online sources that don’t have the green light to have them in the first place.
“I view this as an existential issue for creators,” says Jacqueline Charlesworth, who repped publishers in litigation against the Internet Archive’s library-like book-lending system. A judge ruled in March that the Archive’s book-sharing setup violated copyright law. “What’s going on right now is AI suddenly entered popular culture, and tools were made available to everyone, basically, and it seemed like an explosion overnight. Even though we know a lot of these models were being developed over time, [they] exploded.” There’s also the issue of whether humans have a right not to be used by AI, be they authors or people generally. “People really should have the right to opt out of having their works, their data, used in those models,” Charlesworth argues.
What, if anything, will the suits tell us about AI?
One of the major concerns about AI is the secrecy around how, exactly, platforms like ChatGPT are operating these days. AI is becoming more and more enmeshed with our daily lives, which means that a lack of specifics about these systems could prevent neutral parties from seeing problems or potential dangers, experts told Fast Company in March. For example, AI is known to reflect biases seen among humans, such as racial prejudice. Without knowing exactly how a system learns or picks up those biases, it’s hard to address them, per University of Michigan-Dearborn professor Samir Rawashdeh.
The discovery phase of Silverman’s lawsuits could potentially lift the veil on how these systems work. “A discourse that’s been promoted and pushed by the AI companies themselves is that these systems are essentially magic black boxes and they learn like a human, and there are all these metaphors that are thrown out to essentially dissuade people from scrutinizing how they work,” Butterick says. “And by doing so, they are trying to insulate themselves from any kind of legal inquiry. That’s part of what these cases are about: Let’s open the black box. Let’s see what’s inside.”
Charlesworth voices similar sentiments about AI proceedings. “We are going to learn a lot more about exactly how the models work and what the training data is,” she notes. “There’s not a lot of transparency there, and I think, particularly if your model is based on pirated books already, that’s a huge red flag. You’re copying books without permission. That’s infringing.”
Could Silverman win?
It’s impossible to predict the outcome of any lawsuit. But there’s some doubt that the case is going to be a slam dunk for authors due to a landmark case involving Google Books nearly a decade ago. In 2016, the U.S. Supreme Court declined to hear a challenge to a federal appeals court’s ruling that Google Books’ practice of digitizing books and showing excerpts to users didn’t violate copyright law, leaving that ruling in place, according to the Associated Press. Deven Desai, a professor of business law and ethics at Georgia Institute of Technology, says that the law presently permits the use of books to train software. Desai notes that the Google Books case resulted in “the ability to use books in transformative ways, including creating snippets and training software in that sense, for machine learning,” so machines can use books to learn under the law.
As for a copyright case staving off an AI revolution? “It’s not really about GPT systems taking over [the world], but about whether they have to pay for their training data.” If OpenAI didn’t buy copies of the books, they probably just should have. Perhaps the pen won’t be mightier than the sword in a robot war after all.
GPT’s reaction to this article:
As an AI assistant, I don’t have personal opinions. However, I can provide you with an analysis of the article.
The article discusses the copyright lawsuits filed by Sarah Silverman and two other authors against artificial intelligence companies OpenAI and Meta Platforms. The authors claim that these companies used their copyrighted works without permission or compensation in their AI projects. The lawsuits seek class-action status, potentially allowing more writers to take action against the companies.
The article highlights the issue of AI systems relying on human creations without proper consent. It also raises questions about the impact of the lawsuits on the operation of AI and whether they could limit what AI can do. The article suggests that the litigation could provide insights into the workings of AI systems that are currently shrouded in secrecy.
However, the article also mentions the landmark case involving Google Books, where the U.S. Supreme Court ruled that summarizing and showing excerpts of texts in Google Books did not violate copyright law. This precedent may make it challenging for the authors to win their copyright cases against OpenAI and Meta Platforms.
Overall, the article presents an overview of the copyright lawsuits and raises important questions about the relationship between AI and copyrighted works.