It is by now a truism that data is a crucial resource in the digital era. Yet today access to data and the capacity to make use of data and to benefit from it are unevenly distributed. A new understanding of data is needed, one that takes into account society-wide data sharing and value creation. This will solve power asymmetries related to data ownership and the capacity to use it, and fill the public value gap with regard to data-driven growth and innovation.
Public institutions are also in a unique position to safeguard the rule of law, ensure democratic control and accountability, and drive the use of data to generate non-economic value.
The “data sharing for public good” narratives have been presented for over a decade, arguing that privately-owned big data should be used for the public interest. The idea of the commons has attracted the attention of policymakers interested in developing institutional responses that can advance public interest goals. The concept of the data commons offers a generative model of property that is well-aligned with the ambitions of the European data strategy. And by employing the idea of the data commons, the public debate can be shifted beyond an opposition between treating data as a commodity or protecting it as the object of fundamental rights.
The European Union is uniquely positioned to deliver a data governance framework that ensures Business-to-Government (B2G) data sharing in the public interest. The policy vision for such a framework has been presented in the European strategy for data, and specific recommendations for a robust B2G data sharing model have been made by the Commission’s high-level expert group.
There are three connected objectives that must be achieved through a B2G data sharing framework. Firstly, access to data and the capacity to make use of it needs to be ensured for a broader range of actors. Secondly, exclusive corporate control over data needs to be reduced. And thirdly, the information power of the state and its generative capacity should be strengthened.
Yet the current proposal for the Data Act fails to meet these goals, due to a narrow B2G data sharing mandate limited only to situations of public emergency and exceptional need.
This policy brief therefore presents a model for public interest B2G data sharing, aimed to complement the current proposal. This framework would also create a robust baseline for sectoral regulations, like the recently proposed Regulation on the European Health Data Space. The proposal includes the creation of the European Public Data Commons, a body that acts as a recipient and clearinghouse for the data made available.
It is by now a truism that data is a crucial resource in the digital era. Manifold metaphors aim to express its strategic role, by framing it as akin to raw natural resources or infrastructure. The need to harness the power of data is expressed almost ubiquitously, across all sectors. Yet, in reality, access to data and the capacity to make use of data and to benefit from it are unevenly distributed.
The quantity of data harvested by commercial companies is growing. Data-driven business models are at the heart of corporate strategies, leading to an unprecedented centralization of not just economic power, but also broader social and political power. And adding to this, data held by corporate data holders is de facto their property, even if it retains legal status and economic connotations as a public good and as a non-rivalrous resource. These data holders function as gatekeepers, controlling and in many cases restricting access to data, largely through technological measures.[1]
The current data economy is therefore based primarily on the appropriation and extraction of social resources, for profit, through data. And in turn privatization of data serves corporate profits rather than the common good, in particular because data access, when provided, is skewed towards private actors.[2] We face a spiral movement that sucks value away from the public sphere and into the private sector. This very much resembles a “one-sided commons” where data is free to be captured and exploited by private actors but no obligation exists to contribute back.[3]
Data is often seen as an economic asset that should be exclusively available to private companies to realize economic gains and consolidate competitive advantages. Even data sharing is interpreted as a means for attaining competitive value.[4] This attitude was the central element of submissions from industry stakeholders to the Commission’s public consultation on the Data Act, and in the industry letter sent to Commissioner Breton just before the publication of the proposal for this act. These responses portray data as a pure economic asset functioning in markets that should at all costs be protected from external interference – even if outcomes of such interference would be welfare-enhancing.
It is a fallacy to assume that commercial actors have the capacity to make full and optimal use of data. 85% of all generated data is never used, not even once.[5] The main reason is that those who could create value by using the data have no access. Data monopolization therefore imposes negative externalities on society, by limiting data-driven innovation and growth. And in turn, it is broader data usage that is aligned with the common good. Access to data should be seen as a means of not just achieving economic gains, but also of fulfilling a variety of individual users’ rights, such as freedom of expression, information, services and competition.[6]
We are therefore facing today not a shortage of data, but an asymmetry in access and capacity to use data. Such data asymmetry is at the heart of the current platformization trends and further strengthens the market dominance of very few tech corporations.[7] To justify their restrictive stance, companies typically invoke either the need to preserve their competitiveness in the market or to protect the privacy of users.[8]
Limited access to data is in particular a challenge for public bodies, as demonstrated by the recent Global Data Barometer, and the World Bank’s social contract for data. At the same time, the integrity and capacities of the public sector are questioned, while portraying private sector efforts as the “gold standard” of data-driven solutions.[9] Evgeny Morozov described this as the “big data, small government” trend: a shift from a prevalent fear of surveillance to a willingness to delegate public interventions to private companies. Technological solutionism and the glorification of private companies went hand in hand with distrust toward public action.[10]
What is needed is a new understanding of data that takes into account society-wide data sharing and value creation. Data is more than a simple commodity that can be simply extracted, commodified, controlled, and exploited to derive individual profits.[11] Data production and sharing also embody relations, which make it a common resource that can benefit the public interest.[12] In order to achieve this, the relevant institutional level needs to be identified, at which these collective interests can coalesce and be governed. Public institutions are also in a unique position to safeguard the rule of law, ensure democratic control and accountability, and to drive the use of data to generate non-economic value.
The Open Data movement has for over a decade promoted forms of data sharing in the public interest. These have become in some cases the standard approach to data governance, but have also been limited to government, academic and institutional milieus. And while the idea of data sharing for the public good has been gaining relevance among policymakers, the absence of a relevant data governance platform led to limited and fragmented results.[13] Proposals for the sharing of private data for the public good can be seen as a reversal of the concept of public sector information, as enshrined in the Open Data Directive.[14]
And in turn, the concept of corporate “data philanthropy” has led to only a few relevant initiatives. During the pandemic, several relevant cases of Business-to-Government (B2G) data sharing took the shape of private-public partnerships, such as Meta’s Data for Good initiative, or Google’s Community Mobility Reports. These laudable initiatives have reinvigorated public action worldwide in the field of health; yet, they constitute the exception to the rule, as they remain isolated efforts. The recently published “State of Open Data Policy: Repository of Recent Developments” confirms the worldwide lack of B2G data sharing initiatives. Private companies face a dilemma, where the perceived value of sharing data is limited, while the associated risks are high (at least potentially).[15] As a result, data is preferably stored in-house and utilized as a competitive asset instead of a society-oriented good.
The “data sharing for public good” narratives can be traced at least back to 2011, when the United Nations popularized the concept of “data commons”: using privately-owned big data for sustainable development and humanitarian action.[16] The concept of the data commons is crucial, as it defines both values and institutional setups necessary for valuing access and freedom to operate, over the power to appropriate.[17]
The idea of the commons has attracted the attention of policymakers interested in developing institutional responses that can advance public interest goals and that oppose a conceptualization of data as a pure economic commodity. Pioneered by the work of Elinor Ostrom on the commons as the governance of common-pool resources, this framework has subsequently been extended to information and data governance in the digital age.[18] Typically, these have taken the shape of open science and open-access initiatives.
Recent accounts have also advocated for common-pool governance frameworks to better protect users’ right to privacy and personal data protection.[19] Commons-based production systems have also been described as an alternative to neoliberal, extractive approaches characteristic of today’s capitalism.[20] More recently, contributions have advanced a commons-based framework to enhance users’ access rights to non-personal data as well as to collectively govern data via public data commons: institutional mechanisms that can facilitate the fulfillment of public interest goals.[21]
The concept of data commons is also relevant as a generative model of property that is well-aligned both with the structural characteristics of data as a resource and with the ambitions of European data governance policies. In 2018, Mariana Mazzucato argued that a public data repository should own the public’s data and use it to “shape the digital economy in a way that satisfies public needs.”[22]
Commons-based approaches stimulate “a virtuous circle between the spillovers from certain uses and the social demand for access and social goods.”[23] In this sense, data commons can also be understood as public infrastructure: one that can be consumed in a non-rivalrous way, for which demand is driven by productive activities, and which serves as input for a wide range of downstream goods and services, both public and private.[24]
By employing the concept of the data commons, the public debate can be shifted beyond an opposition between treating data as a commodity or protecting it as the object of fundamental rights. The commons become the third possible approach, offering a generative model that generates not just economic but also social value, and that serves to protect basic rights.
The limited success of voluntary action, coupled with the growing power imbalances, necessitates public intervention.[25] This will help to translate a moral imperative of data sharing into an actionable and sustainable framework, a foundation for further data-driven development of society.
The European Union is today in a unique position to make the “data for public good” vision a reality, secured by a strong mandate for B2G data sharing. This operation is necessary if Europe is to “become a leading role model for a society empowered by data to make better decisions – in business and the public sector,” as declared in the European Strategy for Data. The strategy also declares that “the winners of today will not necessarily be the winners of tomorrow.” Data sharing mandates for the public good can help ensure that public institutions – and by extension the whole of society – will be among the winners, and not just commercial unicorns and corporations.
The strategy states that “making more data available and improving the way in which data is used is essential for tackling societal, climate and environment-related challenges, contributing to healthier, more prosperous and more sustainable societies.” Achieving this goal requires not only public Open Data efforts to continue but also securing the reverse: the availability of private data for public interest uses.
The strategy includes a powerful vision of the transformation of data-driven markets: from the current state, in which the markets are dominated by a handful of Big Tech firms that hold the world’s data, to one in which common data spaces change the rules for accessing and using data, and thus redistribute value. The Data Act is proposed as a regulatory measure that ensures greater balance in the distribution of that value.
Thus, there are three connected objectives that need to be achieved in order to create a more just data-driven society. Firstly, access to data and the capacity to make use of it needs to be ensured for a broader range of actors. Secondly, exclusive corporate control over data needs to be reduced. And thirdly, the information power of the state and its generative capacity should be strengthened.[26] Business-to-government data sharing mandates can serve all of these goals.
Chapter V of the proposed Data Act introduces new rules for making privately-held data available to public sector bodies, but limits them to situations of exceptional need. These can occur in two scenarios, which are defined in article 15 and described in turn below.
Concerning the first scenario, public emergencies are defined in article 2 (10) as “exceptional situations negatively affecting the population of the Union, a Member State or part of it, with a risk of serious and lasting repercussions on living conditions or economic stability, or the substantial degradation of economic assets in the Union or the relevant Member State(s).” Article 16 also stipulates that privately-held data cannot be requested by public bodies for law enforcement purposes. Recital 57 further clarifies that such a definition covers public health emergencies, emergencies resulting from environmental degradation and major natural disasters as well as human-induced major disasters, such as major cybersecurity incidents. In these situations, public bodies can request access to privately-owned data for free.
In the second scenario, public bodies can request data to prevent a public emergency and assist in recovery from a public emergency. Additionally, they can request the data in situations where it is necessary to fulfill a specific task in the public interest, that has been explicitly provided by law. In these circumstances, the requesting public body needs to demonstrate that there are no other available means to obtain such data, including existing obligations or purchasing the data on the market. In addition, the requesting public sector body must show that the obtaining of data would substantially reduce the administrative burden for data holders or other enterprises. As these justifications are not strongly related to emergency response, the Commission proposal allows for private sector compensation at the level of incurred marginal costs, plus a “reasonable margin” (article 20).
Across both of these scenarios, a set of obligations for public and private bodies engaging in B2G data sharing is defined. On the one hand, public bodies, when requesting access, need to define the purpose, demonstrate exceptional need, make sure that the request is proportionate to the need, and not make the data available for reuse (article 17). In addition, they must destroy the data after having fulfilled the stated need, as well as take all appropriate measures to preserve the confidentiality of commercially sensitive information (article 19). On the other hand, private bodies, in line with the principle of data minimization, shall transmit as little data as possible and are obliged to make data available without undue delay. Furthermore, they have to pseudonymize the data insofar as the request can be fulfilled with pseudonymized data. An overall exemption limits the scope of the proposal: the data sharing rules do not apply to small and micro enterprises, meaning those with fewer than 50 employees and an annual turnover and/or balance sheet total of less than €10M (article 14).
In article 21, the proposal sets the conditions for the reuse of privately shared data by third parties. Accordingly, the public sector body can share the data with individuals or organizations carrying out scientific research or with national statistical institutions and Eurostat. This can be done under the condition that the initial data holder is notified. To be eligible, third parties “shall act on a not-for-profit basis or in the context of a public-interest mission recognized in Union or Member State law.” In addition, they shall not be subject to “decisive influence” by commercial undertakings, nor provide such undertakings with preferential access to their research results. When reusing the data, third parties shall not use the data for any other purpose, shall implement technical and organizational measures to protect personal data, and shall destroy the data after having fulfilled the stated need.
Finally, in cases of cross-border requests, a public body requesting data from a data holder established in another Member State must first notify the competent authority of that Member State. After receiving notification, the competent authority must advise the requesting public sector body of the need, if any, to cooperate with public sector bodies of the Member State in which the data holder is based, with the aim of reducing administrative burdens on the concerned data holder (article 22).
B2G data sharing policies have been adopted by a few EU Member States and serve as best practices and points of reference for a European regulation on the matter. First is the French law for a Digital Republic (Loi pour une République numérique) which allows public sector bodies to access data that is held privately but linked to a public entity and its activities. Second is the Finnish Forest Act, a sectoral regulation that establishes a public body tasked with gathering data about forestry. It can either access it based on a B2G data sharing mandate, purchase it from the market, or even collect it through crowdsourcing.
At the same time, there are no existing B2G data sharing rules at the European level, other than mere reporting obligations.[27] This confirms the conclusions of the HLEG report, which notes the lack of harmonized standards across Member States.
Recommendations for B2G data sharing have also been made by a range of states and international organizations across the world. The 2015 OECD report “Data Driven Innovation” argues for better access to data by the public sector, justified by the non-exclusive nature of data. And the 2021 OECD “Recommendation of the Council on Enhancing Access to and Sharing of Data” urges governments to secure data access and sharing arrangements in the public interest.
A similar approach is recommended in the 2020 Indian “Report by the Committee of Experts on Non-Personal Data Governance Framework.” B2G data sharing is recommended based on “sovereign purpose (such as national security or legal requirements), public interest purpose (policymaking or better delivery of services), or economic purpose (to provide for a level playing field or for a monetary consideration).” Also, the UK National Data Strategy aims to increase data availability within the public sector by improving access procedures between public and private entities.
The Commission’s proposal builds on extensive prior work and studies. The first reference to B2G data sharing rules was contained in the 2017 Commission Communication on Building a European data economy. At that time, the idea of a reverse Public Sector Information (PSI) framework was introduced, to address the issue of access to data in the public interest. The concept refers to the 2013 PSI Directive (the precursor to the Open Data Directive), which laid down rules for the reuse of public data. Analogously, reverse PSI was meant to facilitate access and reuse of privately-held data.
This approach was mentioned in the 2017 workshop on access to privately-held data for public bodies, where the Commission confirmed its overall ambition to foster access to privately-held data by public sector bodies. In the same year, the Midterm Review of the Digital Single Market strategy tasked the Commission to “additionally look at the access, under clearly defined conditions, of privately-held data for public administrations for the execution of their public interest tasks.”
A year later, the Commission set up a high-level expert group (HLEG) on B2G data sharing and tasked it with developing recommendations for B2G provisions. The final report of the group urged the Commission to take ambitious steps to unlock the societal benefits arising from data, and to address the increasing fragmentation of data markets at the Member States level. The HLEG identified an ongoing market failure, where the lack of available data was due to high prices charged by private entities, and to an overall lack of incentives to share data with public bodies.
To solve these problems, the HLEG proposed four recommendations. First, Member States were encouraged to develop national governance structures that could support B2G data sharing. Second, private and public bodies should create data steward functions: individuals or teams within organizations that facilitate data sharing. Third, to overcome the problem of lack of incentives, the HLEG suggested giving public sector bodies preferential access conditions to certain categories of privately-held data. In addition, the report recommended that testing environments (sandboxes) be developed for public-private partnerships. Finally, to address ongoing market fragmentation, the HLEG urged the Commission to provide a minimum level of harmonization for B2G data sharing at the EU level, via horizontal rules based on “EU-wide public interest purposes.” A flexible regulatory framework was recommended, in which the Member States would have the capacity to make data sharing mandatory for purposes that are particularly relevant to their national or local priorities.
The European Commission’s proposal does not fulfill the ambitions of the European strategy for data, which presents a vision of data – including private data – used for the public good. Instead of proposing a framework for B2G data sharing and reuse, it offers only an ad hoc measure, to be used in emergencies and cases of special need.
A more ambitious option was considered, which was closer to the strong recommendations of the HLEG. It included a general mechanism for requesting the reuse of business data by public bodies, for any “duly justified purpose”, without the need to demonstrate exceptional situations. Ultimately, it was not chosen. The decision-making process that led to this is described in the Impact Assessment Report. In the report, the evidence is reduced to a simple economic cost-benefit analysis, which fails to capture any public value that B2G data sharing will generate.
It is a paradox that measures that flip the logic of the Digital Single Market, by introducing provisions that transfer data and associated value from the private to the public sector, are ultimately assessed in almost purely economic terms.
The precarious state of the evidence base is most visible in the section where the social and environmental impact of the stronger policy option is meant to be analyzed. The authors admit that while substantial social benefits of B2G data sharing can be expected, the support studies failed to quantify them. Authors of an accompanying study argue that “Even if such data were available, indirect value and externalities would not be appropriately considered (such as qualitative improvements in a product or service, new functionalities, better environmental performance, etc.). These are elements that no existing study has been able to quantify reliably.”[28]
Ultimately, the documents accompanying the proposal do not show that there was an evidence-based reason to choose the weaker approach to B2G data sharing policies. As such, the decision should be seen as largely a political one – and surprising in the context of the Commission’s ambition both to create new data value, and to curb the power of the dominant market players.
The proposal for the Data Act that was presented by the European Commission in February 2022 can still be amended to establish a stronger public interest Business-to-Government data sharing framework in the EU – one that goes beyond emergency and exceptional need situations. Such a framework would supplement emerging sectoral approaches to B2G data sharing, such as the recent proposal for a Regulation on a European Health Data Space.
We propose to modify the Commission’s approach to B2G data sharing in the Data Act by adding an obligation to make data available based on clearly defined public interest criteria. Our proposal also includes establishing an EU-level stewardship body to ensure data availability in the public interest: a public data commons.[29] This institution would steward the data not as a commodity, but as a shared asset, or a common good.
This approach largely aligns with the recommendations of the HLEG on B2G data sharing, and with the stronger policy option that was considered — but not selected — by the European Commission during the preparatory work for the Data Act proposal.
For data access in emergencies and exceptional need situations, our proposal largely maintains the mechanism proposed by the Commission (see the flow on the left side of the diagram below). Here, the framework introduced in article 15(a) of the proposal is sound and proportionate in making sure that data is available free of charge to public sector bodies to address needs related to public emergencies.
To further streamline the procedures related to public emergencies, we propose to also include the purposes of “preventing public emergencies” and “assisting with the recovery from public emergencies” in the same mechanism (i.e. in these cases data must be made available free of charge and must be deleted by the receiving public sector body after the need has subsided).
In addition, we are proposing a mechanism for private data that is shared for public interest purposes other than public emergencies (see the flow on the right side of the diagram below). Here the main additions to the proposal include the introduction of a public interest test carried out by national-level competent authorities and the introduction of the European Public Data Commons, a European body that acts as a recipient and clearinghouse for the data made available by businesses.
In the model we are proposing, the national competent authorities would be tasked to evaluate requests made by a public sector body for access to B2G data based on public interest. This requires the introduction of a definition of public interest into the proposal that can serve as the basis for such evaluation.
We acknowledge that there is no single definition of public interest, as different publics will have different ways of understanding public interest or public good. This points to a general rule, that the public interest definition should itself be subject to democratic, participatory deliberation and governance. There is no complete list of public interest purposes – at the same time, some public interest uses are obvious, such as those for securing public health and education, combatting the climate crisis, or ensuring strong and just public institutions. We propose to use the term in line both with multiple “data for good” proposals and with the European strategy for data. Nevertheless, we are assuming that this general concept will be translated into a more specific framework during the implementation of the Data Act.
If requests to share data pass the public interest test carried out by the national competent authorities (meaning they meet the public interest definition, are proportionate and — where relevant — the data can be delivered in anonymized or pseudonymized form), the requested data will be provided to the European Public Data Commons. It would then make the data available to the public sector body that made the original request.
The main purpose of the European Public Data Commons is to serve as an aggregator that brings together data made available in response to public interest-based data requests and to act as a steward of the aggregated data. Further, requests for access to the data (or aggregated data sets containing the data) would be evaluated by the European Public Data Commons and could be made by other public sector bodies, research institutions, non-governmental organizations and small and medium enterprises fulfilling public interest goals.
This body should also have the capacity to provide legal and technical expertise and serve as a competence center that supports and promotes the growth of B2G data sharing, and the reuse of such data. Finally, this institution should have a strong participatory governance model, ensuring that different stakeholders, including civil society and academia, are involved in decision-making.
Data holders (businesses) sharing data in response to public interest data sharing requests should be compensated in line with the compensation rules established in Article 20 of the proposed Data Act.
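The two request tracks described above can be illustrated with a minimal sketch. It is not part of the Data Act proposal or of any existing system: all names (`Purpose`, `DataRequest`, `Decision`, `route_request`) are hypothetical, and the public interest test is reduced to three boolean checks taken from the description above. What happens to data in the public interest track after use is not specified in this brief, so the sketch assumes it stays with the Commons.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Purpose(Enum):
    # Hypothetical labels for the purposes discussed in this brief.
    RESPOND_TO_EMERGENCY = auto()
    PREVENT_EMERGENCY = auto()
    RECOVER_FROM_EMERGENCY = auto()
    OTHER_PUBLIC_INTEREST = auto()


# Emergency track: response, prevention and recovery share one mechanism.
EMERGENCY_PURPOSES = {
    Purpose.RESPOND_TO_EMERGENCY,
    Purpose.PREVENT_EMERGENCY,
    Purpose.RECOVER_FROM_EMERGENCY,
}


@dataclass(frozen=True)
class DataRequest:
    purpose: Purpose
    proportionate: bool = False
    exceptional_need_demonstrated: bool = False      # emergency track only
    meets_public_interest_definition: bool = False   # public interest track only
    privacy_safe_form_available: bool = True         # anonymized or pseudonymized, where relevant


@dataclass(frozen=True)
class Decision:
    granted: bool
    recipient: str
    compensation: str
    delete_after_use: bool


def route_request(req: DataRequest) -> Decision:
    """Route a request to the emergency track or the public interest track."""
    if req.purpose in EMERGENCY_PURPOSES:
        granted = req.exceptional_need_demonstrated and req.proportionate
        if not granted:
            return Decision(False, "none", "none", False)
        # Data goes directly to the requesting body, free of charge,
        # and is deleted once the need has subsided.
        return Decision(True, "requesting public sector body", "free of charge", True)

    # Public interest test, carried out by the national competent authority.
    passes_test = (
        req.meets_public_interest_definition
        and req.proportionate
        and req.privacy_safe_form_available
    )
    if not passes_test:
        return Decision(False, "none", "none", False)
    # Data goes to the Commons, which passes it on to the requesting body.
    # Assumption: the data is stewarded by the Commons rather than deleted.
    return Decision(True, "European Public Data Commons", "Article 20 rules", False)
```

In this sketch, a proportionate request to prevent an emergency is routed directly to the requesting body at no charge, while a request for another public interest purpose reaches the European Public Data Commons only if it passes all three checks of the test.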
It is important to note that the European Public Data Commons that we are proposing here is not an open access commons – for example, an open data repository – that allows the reuse of the aggregated data by anyone and for any purpose. This means that the data stewarded by the European Public Data Commons does not automatically qualify as open data in the sense of the Open Data Directive. Instead, this data is stewarded based on clear societal objectives. This setup is a deliberate design choice that seeks to avoid data extraction by commercial entities that do not act in the public interest. Having said that, some data (especially non-personal data) might qualify to be shared as Open Data. Such data, upon a decision made by the Public Data Commons, should be made available through the European Open Data aggregator. Altogether, different data sharing arrangements should be seen as complementary and lying on one spectrum of data governance.