Licensing, Levies, and the Limits of Copyright

A First Look at the Voss Report
July 14, 2025

The irony of MEP Axel Voss’s draft own-initiative report on copyright and generative AI is that it underestimates the potential of the very framework that Voss himself helped shape over the past decade. In the explanatory memorandum accompanying the draft report, published last Friday, Voss writes that when it comes to the interplay between copyright law and innovation:

“the principle must apply that technological developments must respect existing laws and, on the other hand, existing laws must not hinder technological developments.”

This is precisely what happened in the case of generative AI. In the face of its sudden emergence as a major technological paradigm in late 2022, the EU copyright framework—which Voss played a key role in shaping—proved to be surprisingly well-calibrated to address the challenges posed by this new technology. More specifically, the newly introduced exceptions for text and data mining (TDM) provided a technologically neutral and flexible approach to how copyrighted works and other subject matter are used in the training of AI systems.

By defining TDM as any “automated analytical technique aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations,” the EU copyright framework elegantly captured the way copyrighted works are used in AI training. This definition brought the technology within the scope of existing law, providing legal clarity: such uses are generally permissible under an exception—provided that the works have been accessed lawfully—and, outside the context of scientific research, only as long as rightsholders have not opted out.

Somewhat unsatisfactorily, Voss’s draft report spends considerable time arguing that the exception in Article 4 of the CDSM Directive does not cover the use of copyrighted works for training AI models—only to conclude that what is needed is an exception with essentially the same structure: a lawful access requirement and the ability for rightsholders to opt out.

In Recommendation 8, he suggests that this could be achieved “either through the introduction of a dedicated exception to the exclusive rights to reproduction and extraction, distinct from that provided for TDM under Article 4 of the CDSM Directive, or by expanding the scope of that provision to explicitly encompass the training of GenAI, which is currently not covered.”

So in practice, what he is proposing is effectively a reiteration of the approach already adopted by the CDSM Directive. The fact that he ultimately circles back to the very structure he initially questions mirrors the recent trajectory of UK policy. There, the government also set out to define a copyright framework for AI training that would improve upon the EU approach—which the UK had itself championed as an EU member—only to arrive at the conclusion that the EU framework already strikes a reasonable balance between the competing interests of rightsholders and AI model developers.

This conceptual back-and-forth—questioning and ultimately reaffirming the approach introduced by Article 4 of the CDSM Directive—almost overshadows the fact that Voss’s report also addresses two important areas where the existing EU copyright framework is clearly lacking: remuneration and transparency.

Converging Ideas About Fair Remuneration

In his report, Voss rightly acknowledges that the “new and specific form of use” of copyrighted works for training AI systems—which is in fact enabled, or more accurately shielded from legal challenges, by Article 4 of the CDSM Directive—has begun to “undermine the economic sustainability of the creative sector in the European Union.” The report further notes that there are numerous impediments to the emergence of a functioning licensing market for this kind of use, such as asymmetries in market power and the risk of exclusion from licensing deals. It therefore concludes that a remuneration mechanism is needed to address the use of copyrighted works in the training of generative AI models.

While the report’s actual recommendation is light on detail—which perhaps explains why Voss’s proposed interim measure of a 5–7% flat fee on the global revenue of generative AI companies has attracted the most media attention—his core analysis and focus are nonetheless correct. He rightly identifies the risk that the unremunerated use of copyrighted content for training foundation models creates a systemic imbalance in the information ecosystem, threatening the economic foundations of cultural and creative production in Europe.

Interestingly, this is also the point at which his analysis—and to a lesser extent, his proposed solution—converges with the arguments laid out in our recent white paper on the impact of generative AI on the sustainability of the information ecosystem. Both recognize that the training of foundation models on publicly accessible content results in a transfer of value (and market power) that the existing copyright framework was not designed to handle. Both also argue that market-based licensing mechanisms are unlikely to scale or provide meaningful compensation to the wide array of contributors whose work sustains the public information commons.

The key shortcoming of Voss’s approach is that, despite correctly identifying the need for fair remuneration, his report offers little guidance on how such a system should function. The aspiration to “uphold a framework in which fair remuneration mechanisms enable the generation of the resources needed for European artistic and creative production to thrive in the context of AI-driven global transformation” is directionally correct—but remains too vague. Crucial questions such as who should pay, on what basis, and who should be compensated are left unanswered. While the lack of detail may be understandable given the nature of an INI report, it nonetheless highlights the need for further policy work to translate this political diagnosis into an effective and inclusive remuneration mechanism.

While it is clear that Voss envisions remuneration flowing to copyright holders, the first two questions—who should pay and on what basis—prove much more difficult. It appears that he is advocating for a licensing model tied to acts of reproduction that occur during both training and inference.

However, the report offers little insight into how the amount of remuneration or licensing fees should be determined. His interim solution—a flat fee of 5–7% of global revenue—may appear more concrete at first glance, but it also lacks key details: whose revenue is being taxed, and revenue derived from what specific activity? These are not peripheral concerns—they are central to any functioning remuneration mechanism.

As we argue in our white paper, focusing solely on acts of reproduction is unlikely to produce meaningful results for anyone except the largest rightsholders. It risks entrenching existing industry hierarchies while doing little to support the broader ecosystem of individual creators, smaller publications, public interest actors, and commons-based projects that also contribute to the public information commons.

That is why, in our white paper, we propose a redistributive mechanism based on a levy applied to revenue generated by “AI systems or other types of service based on models trained on publicly available information.” While this model also requires further elaboration—particularly in determining the appropriate levy rate and how the proceeds should be distributed—its strength lies in attaching the obligation to the actual market deployment of generative AI systems and services. This provides a far more meaningful attachment point than attempting to license individual acts of reproduction, especially in a context where the economic value often becomes clear only after deployment. Here, Voss’s draft report provides an opening to further develop a levy-based approach rather than a licensing-based one.

Too Much Transparency?

The other issue where Voss is right to press for improvement is the question of training data transparency. While we are still awaiting the AI Office’s transparency template, all indications suggest that it will be woefully inadequate for achieving meaningful transparency. As we have previously argued, this is problematic not only from the perspective of creators and other rightsholders—who want to understand whether and how their works are being used by AI model developers—but also from a broader fundamental rights perspective. Transparency around training data is essential for public accountability, democratic oversight, and the ability to contest harmful uses of AI systems that increasingly mediate our access to information and culture.

In his report, Voss takes a strong position on transparency, calling for “full, actionable transparency and source documentation by providers and deployers of general-purpose AI models and systems […] for any purpose, including for inferencing, retrieval-augmented generation, or fine-tuning”. He further proposes that, in cases where models fail to meet these transparency requirements, there should be an “irrebuttable presumption” that they have been trained on copyright-protected works—making their developers liable for their use.

While the ambition behind this proposal is laudable, some elements risk going too far in practice. This is particularly true of his call to expand transparency obligations to include inference-time uses, such as retrieval-augmented generation. Requiring model developers and deployers to report every interaction their systems have with online content back to rightsholders could result in overwhelming amounts of unactionable information, creating noise rather than clarity.

Fortunately, there may be a more practical path to addressing the legitimate interests of rightsholders without requiring expansive transactional transparency. From a copyright perspective, the lack of full training data transparency becomes significantly less problematic if remuneration is addressed through a levy model, as discussed in the previous section. Such a model would operate on assumptions similar to those underlying Voss’s proposed “irrebuttable presumption,” thereby eliminating the need for itemized reporting that would otherwise be required for individual licensing schemes.

An Opportunity for Structural Reform

Even though Voss begins from a somewhat contradictory position—questioning the applicability of a copyright exception that he helped shape—his draft report ultimately identifies a real and urgent challenge. In terms of material impact on the sustainability of information production, generative AI does indeed introduce novel pressures that are not satisfactorily addressed by the current copyright framework alone. In recognizing this, the report marks an important step forward.

In that sense, Voss’s report is an opportunity. It offers an opening to come to grips with how generative AI is reshaping the conditions under which knowledge, culture, and information are produced and accessed. While the focus of the report remains narrowly within the logic of copyright, its diagnosis of the broader economic and structural effects of AI systems opens the door to a more holistic conversation.

We may disagree on the scope and orientation of the solution. Voss’s approach stays within the bounds of traditional copyright thinking, centered on individual rightsholders and licensing mechanisms. By contrast, our white paper calls for a broader framing—one that includes the full spectrum of contributors to the public information commons, including those whose work is not governed by copyright or who contribute without commercial intent.

But in terms of overall direction, the Voss report offers attachment points worth building on. Its focus on remuneration, its recognition of systemic imbalances, and its emphasis on transparency are promising starting points for further development. If we take these openings seriously, it will be possible to develop a framework for a more sustainable European information ecosystem—one that ensures fairness for creators, information producers, and other rightsholders, while also supporting the long-term viability of the institutions, infrastructures, and digital commons on which sustainable access to knowledge depends.

Paul Keller