A Proposed GenAI Training License

One of the most contentious and unresolved debates in artificial intelligence (AI) today centers on a deceptively simple question: What data should developers be allowed to use when training generative AI models—and under what terms? With lawsuits proliferating, rights holders alarmed, and developers pushing forward, the United Kingdom may soon introduce a novel solution that could redefine the debate.

The Copyright Licensing Agency (CLA), a not-for-profit collective rights management organization in the UK, has announced that it is developing a Generative AI Training License, slated for publication in the third quarter of 2025. This initiative could prove a landmark step toward resolving the legal uncertainty around using copyrighted content in AI training datasets.

Why This Matters

In recent years, generative AI systems, such as large language models (LLMs) and image generators, have become commonplace in both consumer and enterprise technology. However, their rise has come with a troubling fact: many of these models were trained on vast bodies of data scraped from the internet, including copyrighted books, journal articles, and other protected works.

In the absence of a clear legal framework, this practice has drawn mounting concern from authors, publishers, artists, and other creators, who argue that their works are being used without permission or compensation. At the same time, developers argue that limiting access to training data stifles innovation and is impractical given the scale of data required to build modern AI systems.

The CLA’s proposed GenAI Training License is one attempt to reconcile these two positions.

What Is the GenAI Training License?

Described by the CLA as “groundbreaking” and a “milestone initiative,” the GenAI Training License seeks to establish a scalable, collective licensing system for AI training—one that mirrors long-established licensing regimes used in fields like music, education, and art reproduction.

The goal is twofold:

  1. To guarantee compensation for authors and publishers whose works are used in training data.
  2. To provide legal certainty and efficient access for developers seeking to license data for training purposes.

A Possible Solution to AI Tension with Creators

The promise of this license lies in its potential to de-risk AI development while ensuring fair compensation for creators.

From the perspective of AI developers, a collective license offers predictability. Instead of navigating an impossibly complex web of individual copyright permissions, they would have access to a standardized agreement covering the use of large amounts of data.

From the perspective of creators, particularly smaller or independent ones, it offers a way to participate in and benefit from the AI economy—where previously their works might have been used without permission, compensation, or even acknowledgment.

This kind of license could also help reduce the growing number of lawsuits filed by rights holders who are seeking to challenge the legality of AI training practices. As the litigation landscape around GenAI continues to intensify globally, a licensing solution offers a pragmatic, scalable alternative to years of court battles.

Practical Implications

1. Shift in AI Business Risk Profiles

One of the most immediate effects of the license may be on AI-related commercial transactions. Investors, acquirers, and enterprise customers often conduct IP due diligence on GenAI systems—only to discover a legal grey area regarding training data.

The availability of a legally sound license could reduce this ambiguity and lower the risk premium for AI companies operating in or serving the UK market.

2. Global Ramifications

Though the license is UK-based, it may have international influence, especially if it succeeds. Other jurisdictions, including the EU and the US, are grappling with similar tensions between copyright and AI. A working UK model could serve as a blueprint, or at least a proof of concept, for other collective licensing systems.

3. A More Level Playing Field for Small Players

Large tech companies may have the legal and financial resources to create bespoke licensing deals or settle disputes. Smaller developers and independent creators do not. A collective license makes participation feasible for these actors on both sides, democratizing access and compensation.

Conclusion: A Step Toward Copyright-Aware AI

As the AI revolution continues to accelerate, the question of how to balance innovation and intellectual property is not going away. The CLA’s Generative AI Training License represents one of the most thoughtful and promising attempts yet to address this tension in a fair and workable manner.

While it won’t resolve every challenge in AI and copyright, it offers something sorely missing in current debates: a viable middle path. If executed effectively, this initiative could become a landmark in both UK copyright law and the global governance of AI systems.