Open Source AI Definition – weekly update Mar 4
A weekly summary of interesting threads on the forum.
The results from the working groups are in
The groups that analyzed OpenCV and Bloom have completed their work and the results of the votes have been published.
We now have a full overview of the results of all four working groups (Llama2, Pythia, Bloom, OpenCV) and the recommendations they have produced.
Access our spreadsheet to see the complete overview of the compiled votes. This is a major milestone of the co-design process.
Discussion on access to training data continues
This conversation continues with a new question: What does openness look like when the original datasets cannot be shared for privacy-preserving reasons?
- “Privacy preserving” is, by definition, at odds with open source. We must also consider that open source AI might not be appropriate in every context.
- A previous question remains unanswered: what is the OSI’s role in acknowledging the lack of high-quality datasets available to train and fine-tune models, assuming the original training data is considered unnecessary for modifying the AI system? Should the OSI mention this in the final definition?
Is the OECD’s definition of “AI system” too broad?
Central question: How can an “AI system” be precisely defined to avoid loopholes and ensure comprehensive coverage under open-source criteria?
A broad definition of “AI system” might create loopholes in open-source licensing, potentially allowing publishers to avoid certain criteria.
Still, defining “AI system” is useful for clarifying what constitutes an open-source AI: it helps outline the necessary components, such as sharing training code and model parameters, while acknowledging that aspects like model architecture need further work.
In case you missed it, our fourth town hall meeting was held on February 23, 2024. Access the recording here and the slides here.
A new town hall meeting is scheduled for this week.