
The debate over artificial intelligence and copyright has reached a critical juncture as Microsoft and OpenAI continue to invoke the fair use doctrine in response to mounting lawsuits from news publishers. At the heart of the controversy is whether training large language models on copyrighted news articles, and then generating answers that often reproduce or closely paraphrase that content, violates intellectual property law. The tech giants argue that their use falls under fair use, a legal defense intended to allow limited use of copyrighted material for purposes such as criticism, comment, news reporting, teaching, scholarship, or research. However, critics contend that the scale and commercial nature of AI training, combined with the direct substitution for original news articles, undermine the very purpose of copyright and accelerate the destruction of local journalism.
The fair use argument in AI
Fair use is a complex and case-specific doctrine that balances four factors: the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect of the use on the potential market for the work. OpenAI and Microsoft have consistently argued that using publicly available text, including news articles, to train AI models constitutes a transformative use, because the AI does not simply reproduce content but learns patterns of language and generates new text. They also claim that the training process does not directly harm the market for the original articles, as the AI's outputs are not substitutes for news reporting but rather tools for answering questions, summarizing information, or generating creative content.
This defense has been deployed in several high-profile lawsuits, including cases brought by The New York Times, The Wall Street Journal's parent company, and other major publishers. In its motion to dismiss the Times lawsuit, OpenAI argued that the newspaper's claims were unfounded on two grounds: the AI's outputs rarely replicated full articles verbatim, and the training data was used in a manner consistent with long-standing copyright principles. Microsoft, as a major investor in OpenAI, has supported these arguments, pointing to its own Copilot product as a tool that enhances productivity rather than infringes copyright.
Local journalism under siege
While legal battles involving major publishers attract significant attention, the greatest impact of AI on journalism may be felt at the local level. Local newspapers, already struggling with declining advertising revenues, shrinking readership, and widespread consolidation, now face an existential threat from AI-powered search and content generation. Tools like ChatGPT and Microsoft Copilot enable users to get answers to questions without clicking through to original news sources, depriving local news sites of critical web traffic and advertising income. This phenomenon, described by some as the "death knell for local journalism," has been exacerbated by search engines and social media platforms that increasingly rely on AI to synthesize information rather than direct users to individual articles.
A 2024 report from the Medill School of Journalism at Northwestern University found that local newspapers in the United States are closing at a rate of more than two per week, leaving vast swaths of the country without any reliable source of local news. Many of these publications had already been hollowed out by layoffs and paywalls, and the rise of generative AI threatens to accelerate their decline. The problem is not limited to the United States; similar trends are observed in Canada, the United Kingdom, and Australia, where government efforts to force tech companies to pay for news have only partially mitigated the damage.
Critics argue that the fair use defense fails to account for the unique public interest served by local journalism. Unlike entertainment or general news, local reporting covers school board meetings, city council decisions, crime statistics, and community events that are not easily replaced by synthesized summaries. When an AI model can answer a query like "What was the outcome of the recent zoning board vote in Anytown?" by pulling scattered pieces of information from multiple articles, it effectively eliminates the need for a reader to visit the local paper's website. Over time, this reduces the incentive for publishers to invest in original reporting, creating a downward spiral that harms democratic accountability.
The economic consequences of AI-driven search
At the core of the dispute is the question of whether AI-generated answers constitute fair use or whether they function as market substitutes. News publishers rely on a business model in which readers pay either directly through subscriptions or indirectly through advertising based on page views and engagement. AI tools that provide instant answers reduce both forms of revenue. A study from the News Media Alliance estimated that if current trends continue, the news industry could lose billions of dollars annually in advertising and subscription income due to AI summarization and search features.
Microsoft's integration of Copilot into Bing search and Office products has made these concerns particularly acute. When users ask Copilot about current events, the tool often provides detailed responses that synthesize information from multiple sources, sometimes citing those sources but sometimes failing to provide clear attribution. This has led to accusations that Copilot is effectively repackaging journalism without compensation. Even when attribution is provided, the user has little incentive to click through, as the key information is already displayed in the chat interface.
OpenAI's ChatGPT, which now has a search feature of its own, has faced similar criticism. In response, the company has announced licensing deals with a handful of major publishers, including Axel Springer, the Associated Press, and Le Monde. However, these deals are widely seen as insufficient to address the widespread use of content from thousands of smaller publications that lack the negotiating power to demand fair terms. Local news outlets, already operating on thin margins, cannot afford to wait for the legal system to catch up.
Legal precedents and ongoing disputes
Despite the industry's alarm, early court rulings have not been entirely unfavorable to the AI companies. In May 2025, a federal judge in New York dismissed parts of a class-action lawsuit brought by authors and publishers against OpenAI, citing the transformative nature of AI training. The judge noted that while the AI may have been trained on copyrighted works, the resulting outputs were not direct copies and served a different purpose. This ruling, though preliminary, gave a boost to the fair use defense. However, the New York Times case, which involves more specific allegations of verbatim reproduction and competitive harm, remains pending and could set a more significant precedent.
Meanwhile, Microsoft has argued that its Copilot product operates within the boundaries of fair use because it only generates content based on publicly available information and user prompts. In a recent court filing, the company stated that "Copilot does not reproduce news articles in their entirety; it helps users find and interpret information faster." Critics counter that this argument ignores the fact that the AI's training data consisted of whole articles, and that the mere ability to reconstruct substantial portions of those articles on demand undermines the copyright holder's exclusive rights.
The debate has also spilled over into regulatory arenas. The European Union's AI Act, which took effect in 2024, requires companies training AI models to disclose their use of copyrighted material, but does not establish a clear framework for compensation. In the United States, several bills have been introduced in Congress to address AI and copyright, but none have advanced to a vote. The Biden administration has called for voluntary agreements between AI companies and content creators, but these have not materialized at scale.
The broader implications for information ecosystems
Beyond the legal and economic dimensions, the fair use debate highlights fundamental questions about the future of information consumption. If AI models can provide accurate, timely answers to questions without sending users to original sources, the entire ecosystem of content creation and distribution could be upended. This would not only affect news outlets but also blogs, academic journals, reference works, and any other form of written content that relies on traffic or licensing revenue. The fair use doctrine, originally conceived for a world of analog copying and limited distribution, may not be well suited to the realities of large-scale AI training and generation.
Some legal scholars have proposed a middle ground: a compulsory licensing system or a small royalty fee for each use of copyrighted material in AI training. Others suggest requiring AI companies to provide clear attribution and links to original sources in all generated outputs. But these solutions face opposition from both sides: publishers may demand higher compensation than a compulsory license typically provides, while AI companies argue that such requirements would stifle innovation and make products too expensive.
In the meantime, local journalists continue to sound the alarm. A coalition of regional newspaper editors recently released an open letter calling on Microsoft and OpenAI to "stop using our work without permission" and to engage in good-faith negotiations for fair compensation. The letter warned that "every day that passes without a solution brings us closer to the extinction of local journalism." Despite these pleas, both companies have maintained their fair use position, indicating that they are willing to litigate the issue for years if necessary.
The stakes are enormous. If the courts ultimately reject the fair use defense and find that AI training infringes copyright, it could force fundamental changes to how large language models are built, potentially requiring opt-in consent for every piece of copyrighted material. On the other hand, if fair use is affirmed, it could accelerate the decline of original content creation, particularly in sectors like local journalism that already operate on razor-thin margins. Either outcome, experts warn, will have profound consequences for the public's access to reliable, independently produced news.
As the legal and regulatory battles unfold, one thing is clear: the current trajectory is unsustainable. The fair use card, as played by Microsoft and OpenAI, may be legally defensible in some contexts, but it does not address the existential threat posed to journalism by the very tools these companies are building. The burden now falls on courts, lawmakers, and the tech industry itself to find a path forward that preserves both innovation and the public interest.
Source: Windows Central News
