Since 2020, there have been nearly 40 copyright lawsuits filed against AI companies in the US. In this intensifying battle over AI-generated content, creators, AI companies, and policymakers are each pushing competing narratives. These arguments, however, tend to get so impassioned that they obscure three crucial questions that should be addressed separately — yet they rarely are.
First, how does existing copyright law apply to AI? Most existing statutes do not explicitly mention AI. Some legal experts, however, argue that courts can adapt traditional frameworks through judicial interpretation. Others contend that copyright’s human-centered assumptions make such adaptation impossible.
Second, where current law proves inadequate, how should the original purpose of copyright law guide new solutions? Copyright was conceived by the Founders to “promote the Progress of Science and useful Arts,” by providing creators with limited monopolies over their work. In the AI era, multiple stakeholders have legitimate claims: creators whose works train AI systems, developers building AI technology, and the general public, who benefit from progress (both AI- and human-driven).
Third, how should broader societal concerns, like maintaining American technological competitiveness or mitigating labor-market disruption, influence our approach to copyright in the AI era, despite being tangential to copyright's original purpose?
Essentially, the legal battles highlight two areas of disagreement: whether companies can use copyrighted works as data to train their models, and who has rights to the content these models generate.
On the second question (regarding content generated by AI), the legal issues seem to fit into existing frameworks. When AI reproduces copyrighted material, traditional infringement standards apply. The basic principle is that if someone without permission uses AI to generate images, video, text, or other content that is substantially similar to existing copyrighted work, that may constitute infringement.
The issues around the first question (regarding the use of copyrighted works to train models) are murkier — and far more consequential. Currently, AI companies obtain training data through a variety of methods: scraping the web, using public datasets, and entering licensing agreements. This approach reflects an assumption that such training does not violate copyright.
AI models ingest vast quantities of text, video, imagery, and other content types to extract patterns. The resulting model doesn’t store the original works or memorize them in a literal sense; instead, it learns by generalizing across the training data to build a probabilistic representation of language and meaning. Using this learned structure, the model can generate new content that bears no readily observable resemblance to any individual input — yet in the marketplace, the new content can effectively substitute for the copyrighted material on which it was trained.
This poses a unique challenge to the traditional copyright framework: AI-generated content can compete with original works without directly reproducing them. Yet, under copyright law, market competition isn’t an infringement. If another person reads your novel, learns a couple of tricks from it, and writes a better one that outsells yours, that’s not illegal; that’s life.
Plaintiffs in several of the AI lawsuits, including The New York Times Company v. Microsoft et al. (which includes OpenAI), argue that there is a key technical difference between human and machine learning: To train AI models, data must be processed in a way that involves making digital copies, even if those copies are short-lived and never seen by humans. This copying, they argue, is enough to trigger copyright infringement.
From the AI companies’ perspective, the main defense is the “fair use” doctrine, which allows third parties to use copyrighted materials without permission in certain circumstances. For instance, a book reviewer who reprints sections of a properly attributed copyrighted work in their critique is protected by fair use. Similarly, some educational and research applications of copyrighted works may qualify for fair use, particularly when they transform or comment upon the original work.
AI companies argue that their models don't consume works for their creative expression. Instead, they convert them into mathematical representations of patterns, changing the works’ purpose and character. They point to precedents like Authors Guild v. Google, where federal appellate judges determined that Google Books’ practice of copying entire books to create a searchable database was transformative and, therefore, fair use. In other words, the court determined that the primary purpose of Google Books was to create new tools for research and creativity, not to reproduce and profit from the original works.
As nuanced as these arguments can be, there’s a question of whether legal frameworks designed for human creativity can (or should) govern systems that operate at superhuman scale and speed. Yes, humans read and create based on what they learn. But if a human could read a thousand books in a day and write a hundred bestsellers by nightfall, we might also want to rethink how copyright applies to them.
Technology has introduced disruption before: Photocopy machines, tape recorders, and the internet have each posed challenges to past understandings of copyright law.
What makes AI unique is the scope and speed at which it is changing the economy. While earlier technological shifts played out over decades, giving industries time to adapt, AI’s capabilities have exploded in just a few years. And it is affecting multiple industries at once. Given these points, legal scholars are debating whether AI is exceptional enough to warrant revising copyright laws.
Some experts worry that hastily passing new legislation could cause more harm than good. In a panel discussion on AI and intellectual property at the London School of Economics, Associate Professor of Law Luke McDonagh said, “Don't rush into anything. The law is a very blunt tool, so if we get this wrong, it could have lots of different downstream effects that we don't appreciate.” Instead, case-by-case judicial decisions can clarify the law over time without prematurely constraining innovation.
This caution appears justified, given that many legal scholars believe that copyright law is sufficient to handle AI’s challenge to intellectual property, and that, while past technologies have prompted courts to reinterpret fair use doctrines, none have rendered the pillars of the law obsolete. In an interview with the Harvard Gazette, Rebecca Tushnet (Frank Stanton Professor of First Amendment Law at Harvard Law School) argued: “The law, especially fair use, was designed to be flexible and to handle new situations. And it’s done that quite well.” Earlier this year, the US Copyright Office issued a report effectively stating the same: “Questions of copyrightability and AI can be resolved pursuant to existing law, without the need for legislative change.”
The proliferation of AI copyright lawsuits reveals deeper tensions. Legal scholars argue that copyright law risks being stretched beyond its intended purpose to address broader concerns about labor displacement, market power, and social disruption.
James Grimmelmann, professor of digital and information law at Cornell University, has argued: "We shouldn't be using copyright law as labor policy to figure out the role of humans in a world of automation. We shouldn't be using copyright law to protect privacy or to protect against dangerous content. Copyright was not built for that.” Similarly, Pamela Samuelson (Richard M. Sherman Distinguished Professor of Law and Information at the University of California, Berkeley) notes: “I just don't like copyright law being used for non-copyright purposes. What I see in some of these lawsuits is a desire to destroy these models.”
So, what is the purpose of copyright? From the beginning, US law has framed copyright as a tool to encourage the creation and dissemination of knowledge and culture. The Constitution states that Congress has the power to “promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.” The logic is: If you give creators temporary exclusive rights from which they may profit, they will be more incentivized to create, and society will get more books, music, and inventions in return.
In practice, however, the way copyright works has always been more complicated. According to the Electronic Frontier Foundation, the application of copyright law today mostly serves large media conglomerates’ interests on a scale that vastly exceeds any protections or profits for individual artists. The EFF argues that copyright is extended longer and enforced more aggressively than ever, yet creative workers often see little of the profit their work generates. For some critics, this represents a fundamental departure from copyright's original mission — a dynamic that some worry may worsen if overly restrictive interpretations stifle technological innovation.
Given this gap between copyright’s constitutional purpose and its current reality, some legal scholars believe that copyright’s fundamental premises are ill-suited for the AI era. Kevin Frazier, AI Innovation & Law Fellow at The University of Texas School of Law, recently argued that current U.S. copyright laws may violate the Constitution’s Intellectual Property Clause by impeding, rather than promoting, the spread of knowledge. He warns this trend could worsen as restrictive copyright interpretations “strangle [AI’s] potential in its cradle” by limiting access to the training data needed to build systems that could democratize and accelerate knowledge dissemination. He calls for reforms that would realign copyright with its original purpose, such as expanded fair use and statutory licensing, to ensure this constitutional mandate is fulfilled.
Others go further. Mark Lemley, director of Stanford University’s Program in Law, Science, and Technology, suggested in the Science & Technology Law Review that, in a world of generative AI, we may no longer need copyright at all. “We need copyright only if we think we won’t get enough creation without it,” he wrote. “That may no longer be a worry in the world of generative AI.”
Whether applied through reform or abolition, such fundamental changes to copyright law could have geopolitical effects, according to some defendants in the AI copyright cases. In a statement, OpenAI said that restrictive laws would hamper AI innovation and effectively hand China control over the technology’s future. Google has similarly expressed the view that data mining should be protected by fair use, lest the U.S. lose its innovative edge.
Few of the AI copyright cases have yet reached a ruling; of those that have, the most significant thus far has been Thomson Reuters v. ROSS Intelligence. In February 2025, a judge found that the defendant (an AI-powered legal research tool) had infringed the plaintiff’s copyrighted material and rejected its fair-use defense. The decision — which is currently on appeal — signals that courts may be skeptical of fair-use defenses in cases where the AI product directly competes with the original material on which it trained. However, because the case centered on an AI search tool rather than a generative model, it may provide little precedential value for cases weighing how fair use applies to generative AI.
In the music world, two cases filed by the Recording Industry Association of America (UMG et al. v. Suno and UMG et al. v. Uncharted Labs) take aim at AI tools that can mimic melodies, styles, and even voices of famous performers. In those cases, the court must decide whether an AI-generated song “in the style of” Marvin Gaye or Taylor Swift is a creative homage or a market replacement. A decision in favor of the labels could cement the idea that not only works but also vocal likenesses and stylistic signatures are protected assets in the age of generative AI. A ruling the other way may signal to startups that style imitation is legally fair game, opening the door to soundalike content built on unlicensed training data.
Training data is at the center of both The New York Times v. Microsoft et al. and Authors Guild et al. v. OpenAI et al., where judges are weighing whether training a model on millions of copyrighted articles and books without permission counts as transformative fair use or as mass infringement. If courts find in favor of the plaintiffs, the precedent could mandate licensing regimes across the publishing industry, forcing AI firms to pay for access to textual training data. But if the defendants prevail, it would enshrine a powerful legal shield for the use of expressive works in AI training, effectively allowing models to learn from vast cultural corpora without consent — so long as they don’t reproduce exact passages.
Meanwhile, in Andersen v. Stability AI and Getty Images v. Stability AI, the question is whether scraping millions of copyrighted images to build AI art generators amounts to unlawful copying. The courts are being asked not just whether the training process itself constitutes infringement but also whether the models encode and regurgitate the visual DNA of their training sets. A ruling that recognizes AI-generated outputs as derivative works could trigger a wave of model purges, opt-out rights for artists, and real liability for companies that failed to filter copyrighted data. A win for the defendants might affirm that AI models are abstract learning systems, not memory machines, and determine that the internet’s visual archive is available for AI training.
Across these cases, open legal questions loom large: Is AI training fundamentally transformative, or is it industrial-scale copying? Can AI-generated material trained on copyrighted works be considered “derivative works” under the Copyright Act? And, crucially, will courts draw a hard line between commercial AI systems that compete with creative labor and earlier services, like Google Books, whose fair-use victories served public access?
The answers to these questions will define the boundaries of AI development for years to come. A win for rights holders could usher in a new licensing ecosystem with more friction for AI developers but also more equity for artists. A win for AI firms might speed innovation, but it would come at the cost of weakening creators’ control over how their works are used. Either way, these lawsuits will decide not just who owns the training data behind AI but what it means, in the 21st century, to own creative expression at all.