Filling the governance vacuum on the use of information commons for AI training



The use of openly licensed photos of faces for the purpose of training AI facial recognition systems has been raised in recent years as one of the controversial use cases for Creative Commons-licensed content.

Since the case received media attention in 2019, it has often been cited as an example of the inherent conflict between openness and privacy, and of the extraction of value from the commons by corporations.

With AI_Commons, we explored how AI training datasets and openly licensed works included in those datasets can be better governed and shared as a commons.

The case created an opportunity to ask essential questions about the challenges that open licensing faces today: privacy, exploitation of the commons at massive scales of use, and unexpected or unintended uses of openly licensed works.

As part of this activity, we also commissioned Adam Harvey to conduct a study on the use of Creative Commons licenses for AI training datasets and Selkie Study to conduct research on the use of openly licensed photographs and machine learning. Furthermore, Aniek Kempeneers conducted a study of design solutions for the case as her MSc graduation project in the DCODE Labs at the Delft University of Technology. In addition, we published an in-depth white paper on understanding the implications of face recognition training with CC-licensed photographs.

AI_Commons ended with the publication of our report “AI_Commons – Filling the governance vacuum on the use of information commons for AI training.” The report summarizes our findings and offers recommendations for commons-based governance of AI datasets.


Read the report



A research report presenting the results of a survey conducted by Selkie Study as part of our AI_Commons initiative. The survey gathered insights from users of photo-sharing platforms on the use of content that they shared openly for AI training. The main objective of the study was to identify possible points of controversy around the use of open content for the development of AI technologies. It also enabled us to outline directions for further research and for supporting users in understanding and reacting to incidents related to the development of AI technologies.
