Reimagining data for Open Source AI: A call to action

Artificial intelligence (AI) is changing the world at a remarkable pace, with Open Source AI playing a pivotal role in shaping its trajectory. Yet, as AI advances, a fundamental challenge emerges: How do we create a data ecosystem that is not only robust but also equitable and sustainable?

The Open Source Initiative (OSI) and Open Future have taken a significant step toward addressing this challenge by releasing a white paper: “Data Governance in Open Source AI: Enabling Responsible and Systematic Access.” This document is the culmination of a global co-design process, enriched by insights from a vibrant two-day workshop held in Paris in October 2024.

A turning point for Open Source AI

At its core, this white paper addresses a pressing question: How can we responsibly govern the data that fuels Open Source AI? The answer requires a profound transformation in how we think about data. It’s not just a resource to exploit but a shared commons: a collective foundation upon which innovation can flourish while respecting rights and fostering equity.

Open Source AI thrives on shared datasets. Yet, the current landscape is fraught with challenges:

  • Openness and transparency: Many AI models labeled “open” lack transparency regarding data provenance, licensing and usage restrictions, creating confusion about what truly constitutes Open Source AI.
  • Data scarcity and inequity: Despite the vast amount of information on the internet, many datasets are of low quality and fail to represent the diversity of our world.
  • Privacy concerns: Some data cannot be legally shared, because laws on personal data vary across jurisdictions and international human rights standards protect the right to privacy.
  • Stakeholder representation: The AI ecosystem often prioritizes developers and corporations over contributors, affected communities, and public interest organizations.
  • Environmental sustainability: AI’s resource-intensive nature raises concerns about its environmental impact.

A vision for change

The white paper offers a blueprint for a data ecosystem rooted in fairness, inclusivity and sustainability. It calls for two transformative shifts:

  1. From Open Data to Data Commons: Moving beyond the notion of unrestricted data to a model that balances openness with the rights and needs of all stakeholders.
  2. Broadening the stakeholder universe: Creating collaborative frameworks that unite communities, stewards and creators in equitable data-sharing practices.

To bring these shifts to life, the white paper delves into six critical focus areas:

  • Data preparation
  • Preference signaling and licensing
  • Data stewards and custodians
  • Environmental sustainability
  • Reciprocity and compensation
  • Policy interventions

Each focus area is a stepping stone toward building a future where data empowers rather than exploits, where it reflects the diversity of human experience rather than reinforcing systemic inequities.

A call to action

This white paper is an invitation to the global community to reimagine the role of data in Open Source AI. It challenges us to:

  • Collaborate across sectors, from open data and open science to cultural institutions.
  • Empower communities, particularly in underserved regions, to shape how their data is used.
  • Prioritize smaller, localized AI models that reflect specific contexts and needs, reducing reliance on monolithic systems.

The release of this white paper marks a pivotal moment in the evolution of Open Source AI. It represents the collective wisdom of data governance and Open Source experts worldwide, coalescing around a shared vision of fairness, inclusivity, and sustainability. We hope this resource will catalyze the conversation around training data in Open Source AI.

Read the full white paper and join us. Together, we can create a world where data is both a resource and a shared foundation for equitable innovation.

About

Dr. Alek Tarkowski is the Strategy Director at Open Future. He holds a PhD in sociology from the Polish Academy of Sciences. He has over 15 years of experience with public interest advocacy, movement building, and research into the intersection of society, culture, and digital technologies.

The OSI is the authority that defines Open Source, recognized globally by individuals, companies, and public institutions.

Open Future is a European think tank that develops new approaches to an open internet that maximize societal benefits of shared data, knowledge and culture.

Click Here to View Original Source (opensource.org)

Shared by: voicesofopensource
