In late April, the IETF AI Preferences Working Group adopted our opt-out vocabulary proposal as the starting point for developing a vocabulary to express AI preferences on the open internet. The proposal was published in our policy brief on a vocabulary for opting out of AI training and other forms of Text and Data Mining.
The IETF AI Preferences Working Group was formed earlier this year to standardise a vocabulary, that is, a set of building blocks for expressing preferences about how content is collected and processed for the development, deployment and use of artificial intelligence models. In addition to developing a vocabulary, the working group has been tasked with improving the existing Robots Exclusion Protocol (often referred to as robots.txt) to make it more suitable as a mechanism for expressing AI preferences, such as opting out of generative AI training.
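For readers unfamiliar with robots.txt, the sketch below illustrates the kind of signal the existing Robots Exclusion Protocol can already carry: a site can disallow a named crawler, but it cannot say what the fetched content may be used for. It uses Python's standard `urllib.robotparser` module. The crawler name `ExampleAIBot`, the helper `may_fetch` and the example URL are invented for illustration, and the example does not show the working group's vocabulary, whose syntax has not yet been finalised.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. "ExampleAIBot" is an invented crawler name,
# not a real user agent; all other crawlers are allowed everywhere.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.org/articles/1"
    print(may_fetch("ExampleAIBot", url))   # the named crawler is blocked
    print(may_fetch("SomeSearchBot", url))  # falls through to the "*" rule
```

Rules of this kind address a crawler by name rather than a type of use, which is one reason a shared vocabulary of uses would make robots.txt a more suitable vehicle for AI preferences.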
Our vocabulary proposal—which we had developed based on input from a wide range of stakeholders as part of the ongoing implementation of the AI Act—was discussed extensively during a three-day working group meeting in Brussels in April.
Much of this discussion centred on whether the proposed vocabulary could function as an international standard if it were made less dependent on the EU regulatory context. The workshop participants identified a number of issues that need to be addressed to achieve this, including making the language of the proposal less Euro-centric and adding categories that cover the use of content for search and inference. They also identified the need for an applicability statement to help stakeholders, including regulators, understand the purpose and scope of the vocabulary.
In approving the proposed vocabulary as a starting point for the future work of the IETF working group, the working group chairs also appointed Paul Keller and Martin Thomson (Mozilla) as editors of the document. Together, they will be responsible for incorporating feedback from the working group and reaching consensus on a final draft. According to the working group charter, this work should be completed by August 2025. While much of it will take place online (on the working group's public mailing list), the working group has also scheduled a further, two-day Design Team meeting in London in July.
If all goes to plan, this will result in a more robust way of expressing AI training preferences, including opt-outs based on the EU copyright rules, via robots.txt and other opt-out systems. The adoption of our vocabulary by the IETF working group is therefore an important step towards a practical framework for rights holder opt-outs that is consistent with EU copyright law, fits within the standards-based architecture of the open internet, and can help preserve the open information commons.