Filling the policy vacuum on AI training with open datasets

The use of openly licensed photos of faces to train AI facial recognition systems has been raised in recent years as one of the controversial use cases for Creative Commons licensed content.

Since the case received media attention in 2019, it has often been raised as an example of an inherent conflict between openness and privacy, and of the extraction of value from the commons by corporations.

In the background, there are growing concerns about the ethics of artificial intelligence and machine learning technologies, especially in relation to biometric data.

We launched the AI_Commons research activity to find a solution to this issue. By studying this case, we hope to better define how governance of shared resources can balance open sharing with the protection of personal data and privacy. We also see this as a case that concerns the irrevocability of CC licenses and their unintended uses, and thus the challenge of making the CC licensing stack future-proof.

Finally, this is a case that explores the limit of the Open Access Commons approach to sharing. We are exploring whether for some types of data we need a stronger, more managed commons and data governance.

This initiative is part of our work on Data Commons and is also a key case illustrating the Paradox of Open.

We are running this research in collaboration with Adam Harvey, an artist and technologist from the project. We commissioned Adam to conduct an independent study of CC licensing in the context of datasets for AI facial recognition training. You can read the report “The Exploitation of Photography: How Creative Commons Licenses Enable Surveillance” on Adam Harvey’s webpage.



Adam Harvey on CC licensing of AI training datasets

May 31, 2022 by: Open Future
An essay from Adam Harvey on CC licensing of AI training datasets building on the research we have commissioned from him as part of the AI_Commons initiative.

Exploring design solutions for AI_Commons: When users’ biometric data becomes open content

February 18, 2022 by: Francesco Vogelezang
As part of the AI_Commons research activity, Open Future partnered with Aniek Kempeneers to investigate how design concepts for consent practices can provide an added layer of protection for users’ privacy in the online sphere.

AI_Commons: Open licensing in the age of AI

April 29, 2021 by: Francesco Vogelezang et al.
Openly licensed photographs of faces have been broadly used to train AI facial recognition systems. This is widely presented as an example of corporate extraction of value from the commons, yet no policy or governance solution to this challenge has been provided.


March 7, 2022

Mozilla Festival 2022: Exploring Design solutions for AI_Commons

March 4, 2022
On Monday 7 March at 19:00 CET, Francesco and Aniek Kempeneers co-hosted a session dedicated to AI_Commons at Mozilla Festival 2022.

September 22, 2021

AI_Commons at CC Global Summit 2021

December 1, 2021
Discussing with the broad Open Movement how to reconcile Open licensing and AI training