The use of openly licensed photos of faces to train AI facial recognition systems has been raised in recent years as one of the controversial use cases for Creative Commons-licensed content.
Since the case received media attention in 2019, it has often been cited as an example of an inherent conflict between openness and privacy, and of the extraction of value from the commons by corporations.
In the background, there are growing concerns about the ethics of artificial intelligence and machine learning technologies, especially in relation to biometric data.
We launched the AI_Commons research activity to find a solution to this issue. By studying this case, we hope to better define how the governance of shared resources can balance open sharing with the protection of personal data and privacy. We also see this as a case that concerns the irrevocability of CC licenses and their unintended uses, and thus the challenge of making the CC licensing stack future-proof.
Finally, this is a case that tests the limits of the Open Access Commons approach to sharing. We are exploring whether some types of data need a stronger, more managed commons and data governance.
This initiative is part of our work on Data Commons and is also a key case illustrating the Paradox of Open.
We are running this research in collaboration with Adam Harvey, an artist and technologist from the Exposing.ai project. We commissioned Adam to conduct an independent study of CC licensing in the context of datasets for AI facial recognition training. You can read the report “The Exploitation of Photography: How Creative Commons Licenses Enable Surveillance” on Adam Harvey’s webpage.