ClearlyDefined v2.0 adds support for LicenseRefs
One of the major focuses of the ClearlyDefined Technical Roadmap is the improvement in the quality of license data. As such, we are excited to announce the release of ClearlyDefined v2.0 which adds over 2,000 new well-known licenses it can identify. You can see the complete list of new non-SPDX licenses in ScanCode LicenseDB.
A little historical background, when Clearly Defined was first created, it was initially decided to limit the reported licenses to only those on the SPDX License List. As teams worked with the Clearly Defined data, it became clear that additional license discovery is important to give users a fuller picture of the projects they depend on. In previous releases of ClearlyDefined, licenses not on the SPDX License List were represented in the definition as NOASSERTION or OTHER. (See the breakdown of licenses in The most popular licenses for each language in 2023.)The v2.0 release of ClearlyDefined includes an update of ScanCode to v32 and the support of LicenseRefs to identify non-SPDX licenses. The license in the definition will now be a LicenseRef with prefix LicenseRef-scancode- if ScanCode identifies a non-SPDX license. This improves the license coverage in the ClearlyDefined definitions and consumers ability to accurately construct license compliance policies.
ClearlyDefined identifies licenses in definitions using SPDX expressions. The SPDX specification has a way to include non-SPDX licenses in license expressions.
A license expression could be a single license identifier found on the SPDX License List; a user defined license reference denoted by the LicenseRef-[idString]; a license identifier combined with an SPDX exception; or some combination of license identifiers, license references and exceptions constructed using a small set of defined operators (e.g., AND, OR, WITH and +)
— excerpt from SPDX Annexes: SPDX license expressions
Example change of a definition:
Coordinates | License BEFORE | License AFTER |
npm/npmjs/@alexa-games/sfb-story-debugger/2.1.0 | NOASSERTION | LicenseRef-.amazon.com.-AmznSL-1.0 |
Note: ClearlyDefined v2.0 also includes an update to ScanCode v32.
What does this mean for definitions?
This section includes a simplified description of what happens when you request a definition from ClearlyDefined. These examples only refer to the ScanCode tool. Other tools are run as well and are handled in similar ways.
When the definition already exists
Any request for a definition through the /definitions API makes a couple of checks before returning the definition:
If the definition exists, it checks whether the definition was created using the latest version of the ClearlyDefined service.
- If yes, it returns the definition as is.
- If not, it recomputes the definition using the existing raw results from the tools run during the previous harvest for the existing definition. In this case, the tool version will be earlier than ScanCode v32.
NOTE: ClearlyDefined does not support LicenseRefs from ScanCode prior to v32. For earlier versions of ScanCode, ClearlyDefined stores any LicenseRefs as NOASSERTION. In some cases, you may see OTHER when the definition was curated.
When the definition does not exist
If the definition does not exist:
- It will send a harvest request which will run the latest version of all the tools and produce raw results.
- From these raw results, it will compute a definition which might include a LicenseRef.
What does it mean if I still see NOASSERTION?
If you see NOASSERTION in the license expression, you can check the definition to determine the version of ScanCode in the “described”: “tools” section.
If ScanCode is a version earlier than v32, you can submit a harvest API request. This will run any tools for which ClearlyDefined now supports a later version. Once the tools complete, the definition will be recomputed based on the new results.
In some cases, even when the results are from ScanCode v32, you may still see NOASSERTION. Reharvesting when the ScanCode version is already v32 will not change the definition.
What does this mean for tools?
When adding ScanCode licenses to allow/deny lists, note the ScanCode LicenseDB lists licenses without the LicenseRef prefix. All LicenseRefs coming from ScanCode will start with LicenseRef-scancode-.
Tools using an Allow List
A recomputed definition may change the license to include a LicenseRef that you want to allow. All new LicenseRefs that are acceptable will need to be added to your allow list. We are taking the approach of adding them as they appear in flagged package-version licenses. An alternative is to review the ScanCode LicenseDB to proactively add LicenseRefs to your allow list.
Tools using a Deny List
Deny lists need to be exhaustive to prevent a new license from being allowed by default. It is recommended that you review the ScanCode LicenseDB to determine if there are LicenseRefs you want to add to the deny list.
Note: The SPDX License List also changes over time. A periodic review to maintain the Deny list is always a good idea.
Providing Feedback
As with any major version change, there can be unexpected behavior. You can reach out with questions, feedback, or requests. Find how to get in touch with us in the Get Involved doc.
If you have comments or questions on the actual LicenseRefs, you should reach out to ScanCode License DB maintainers.
Acknowledgements
A huge thank you to the contributing developers and their organizations for supporting the work of ClearlyDefined.
In alphabetical order, contributors were…
- ajhenry (GitHub)
- brifl (Microsoft)
- elrayle (GitHub)
- jeff-luszcz (GitHub)
- ljones140 (GitHub)
- lumaxis (GitHub)
- mpcen (Microsoft)
- nickvidal (Open Source Initiative)
- qtomlinson (SAP)
- RomanIakovlev (GitHub)
- yashkohli88 (SAP)
See something you’d like ClearlyDefined to do or could do better? If you have resources to help out, we have work to be done to further improve data quality, performance, and sustainability. We’d love to hear from you.
References
Tags: clearlydefined, News
Leave a Reply