A Journey toward defining Open Source AI: presentation at Open Source Summit Europe
A few weeks ago I attended Open Source Summit Europe 2024, an event organized by the Linux Foundation, that brought together brilliant developers, technologists and leaders from all over the world, reinforcing what Open Source is truly about—collaboration, innovation and community.
I had the honor of leading a session that tackled one of the most critical challenges in the Open Source movement today—defining what it means for AI to be “Open Source.” Along with OSI Board Director Justin Colannino, we presented the v.0.0.9 for the Open Source AI Definition. This session marked an important milestone for both the Open Source Initiative (OSI) and the broader community, a moment that encapsulated years of collaboration, learning and exploration.
The story behind the Open Source AI Definition
Our session, titled “The Open Source AI Definition Is (Almost) Ready” was more than just a talk—it was an interactive dialogue. As Justin kicked off the session, he captured the essence of the journey we’ve been on. OSI has been grappling with what it means to call AI systems, models and weights “Open Source.” This challenge comes at a time when companies and even regulations are using the term without a clear, agreed-upon definition.
From the outset, we knew we had to get it right. The Open Source values that have fueled so much software innovation—transparency, collaboration, freedom—needed to be the foundation for AI as well. But AI isn’t like traditional software, and that’s where our challenge began.
The origins: a podcast and a vision
When I first became Executive Director of OSI, I pitched the idea of exploring how Open Source principles apply to AI. We spent months strategizing, and the more we dove in, the more we realized how complex the task would be. We didn’t know much about AI at the time, but we were eager to learn. We turned to experts from various fields—a copyright lawyer, an ethicist, AI pioneers from Eleuther AI and Debian ML, and even an AI security expert from DARPA. Those conversations culminated in a podcast we created called Deep Dive AI, which I highly recommend to anyone interested in this topic.
Through those early discussions, it became clear that AI and machine learning are not software in the traditional sense. Concepts like “source code,” which had been well-defined in software thanks to people like Richard Stallman and the GNU GPL, didn’t apply 1:1 to AI. We didn’t even know what the “program” was in AI, nor could we easily determine the “preferred form for making modifications”—a cornerstone of Open Source licensing.
This realization sparked the need to adapt the Open Source principles we all know so well to the unique world of AI.
Co-designing the future of Open Source AI
Once we understood the scope of the challenge, we knew that creating this definition couldn’t be a solo endeavor. It had to be co-designed with the global community. At the start of 2023, we had limited resources—just two full-time staff members and a small budget. But that didn’t stop us from moving forward. We began fundraising to support a multi-stakeholder, global conversation about what Open Source AI should look like.
We brought on Mer Joyce, a co-design expert who introduced us to creative methods that ensure decisions are made with the community, not for it. With her help, we started breaking the problem into smaller pieces and gathering insights from volunteers, AI experts and other stakeholders. Over time, we began piecing together what would eventually become v.0.0.9 of the Open Source AI Definition.
By early 2024, we had outlined the core principles of Open Source AI, drawing inspiration from the free software movement. We relied heavily on foundational texts like the GNU Manifesto and the Four Freedoms of software. From there, we built a structure that mirrored the values of freedom, collaboration and openness, but tailored specifically to the complexities of AI.
Addressing the unique challenges of AI
Of course, defining the freedoms was only part of the battle. AI and machine learning systems posed new challenges that we hadn’t encountered in traditional software. One of the key questions we faced was: What is the preferred form for making modifications in AI? In traditional software, this might be source code. But in AI, it’s not so straightforward. We realized that the “weights” of machine learning models—those parameters fine-tuned by data—are crucial. However, data itself doesn’t fit neatly into the Open Source framework.
This was a major point of discussion during the session. Code and weights need to be covered by an OSI-approved license because they represent the modifiable core of AI systems. However, data doesn’t meet the same criteria. Instead, we concluded that while data is essential for understanding and studying the system, it’s not the “preferred form” for making modifications. Instead, the data information and code requirements allow Open Source AI systems to be forked by third-party AI builders downstream using the same information as the original developers. These forks could include removing non-public or non-open data from the training dataset, in order to retrain a new Open Source AI system on fully public or open data. This insight was shaped by input from the community and experts who joined our study groups and voted on various approaches.
The road ahead: a collaborative future
As we wrap up this phase, the next step is gathering even more feedback from the community. The definition isn’t final yet, and it will continue to evolve as we incorporate insights from events like this summit. I’m incredibly grateful for the thoughtful comments we’ve already received from people all over the world who have helped guide us along this journey.
At the core of this project is the belief that Open Source AI should reflect the same values that have made Open Source a force for good in software development. We’re not there yet, but together, we’re building something that will have a lasting impact—not just on AI, but on the future of technology as a whole.
I want to thank everyone who has contributed to this project so far. Your dedication and passion are what make Open Source so special. Let’s continue to shape the future of AI, together.
Leave a Reply