Issue with CC BY-SA license identifiers #618

Closed
bradleeedmondson opened this issue Mar 13, 2018 · 15 comments
Labels
minor updates to file URL update, notes update, etc.

@bradleeedmondson
Contributor

Issue raised by Bradley Kuhn on the spdx-legal list:

I'm helping Behan Webster's E-ALE project properly describe and document the
license of their project in their repository. (E-ALE is a large tutorial
project -- with materials pulled together from various sources -- and they
are primarily licensed under various versions of CC BY-SA.)

Of course, being a good SPDX citizen, I was going to use SPDX identifiers
where appropriate to describe the licenses. But I'd never looked closely
(before today) at the identifiers available for CC licenses in SPDX.

I noticed two key problems when I did:

  • To describe the pre-4.0 versions of CC BY-SA, there are only identifiers
    for the "Unported" versions. So, the SPDX identifier cannot actually be
    used to replace situations where someone, say, has used the "CC BY-SA 3.0
    United States" license.

  • The Full Names of the licenses on https://spdx.org/licenses/ are
    incomplete. Most say something like:
    "Creative Commons Attribution Share Alike "

    But that's not the full name of the license according to CC.

I therefore suggest two changes to the SPDX License List:

  • Change existing Full Names to:
    "Creative Commons Attribution Share Alike 4.0 International"
    for the 4.0 version and,
    "Creative Commons Attribution Share Alike Unported"
    for the older ones.

    It seems that would be an uncontroversial change -- it just involves
    adding "International" and "Unported" into the Full Name field. Does
    anyone have an argument why that shouldn't be done?

  • It would surely be controversial to add every version of every
    jurisdiction-specific CC license in the SPDX license list. Instead of
    suggesting that, for the moment I suggest that "-Unported" should be
    added to the identifier for the pre-4.0 ones (i.e., "CC-BY-SA-3.0" becomes
    "CC-BY-SA-3.0-Unported") so that no one is confused by this situation.

@bradleeedmondson
Contributor Author

First response, from David Wheeler:

Bradley M. Kuhn:

I therefore suggest two changes to the SPDX License List:

  • Change existing Full Names to:
    "Creative Commons Attribution Share Alike 4.0 International"
    for the 4.0 version and,
    "Creative Commons Attribution Share Alike Unported"
    for the older ones.

    It seems that would be an uncontroversial change -- it just involves
    adding "International" and "Unported" into the Full Name field. Does
    anyone have an argument why that shouldn't be done?

I agree.

  • It would surely be controversial to add every version of every
    jurisdiction-specific CC license in the SPDX license list. Instead of
    suggesting that, for the moment I suggest that "-Unported" should be
    added to the identifier for the pre-4.0 ones (i.e., "CC-BY-SA-3.0" becomes
    "CC-BY-SA-3.0-Unported") so that no one is confused by this situation.

I disagree, for several reasons.

  • Version numbers are normally at the end.
  • In practice, I think in almost all cases what is intended is the unported/international version, since these materials normally go out around the world. SPDX license names are long enough; the "short" version should be the "normal" version.
  • This creates yet-another transition problem, and in this case I think an unnecessary one. Many people already use CC-BY-SA-3.0 to mean the unported one, so let's just clarify that.

I actually do NOT think it'd be very controversial to add all the jurisdiction-specific CC licenses that are actually used:

  • There's an easy stopping requirement: You have to show that something was actually distributed under that license. Almost all of the possible license + countries combinations have never been used.
  • There are SPDX license identifiers for licenses used by relatively few programs.
  • You could create a convention, e.g., -PORTED-<ISO 3166-1 alpha-2 country code>-<VERSION_NUMBER>. The SPDX license identifier list could even standardize that as a convention, instead of listing them all out. A few lines of text... and you're done. E.g., the US ported version would be "CC-BY-SA-PORTED-US-3.0". I add the "-PORTED-" because "SA" also means "Saudi Arabia"; without some special keyword it wouldn't be obvious what "CC-BY-SA" meant. I suggest the 2-character code, since that's what most people use. Alternatively, we could use the "2-character codes as assigned by the Internet Assigned Numbers Authority": the alpha-2 code for the UK is "GB", but "UK" is used in domain names and it might be clearer to use that.
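
As a rough illustration of the "-PORTED-" convention proposed above, the following Python sketch assembles such an identifier. The convention was only a proposal in this thread, and the function name `ported_identifier` and its validation rules are assumptions made for the example, not part of the SPDX License List.

```python
import re


def ported_identifier(elements: str, country: str, version: str) -> str:
    """Build CC-<ELEMENTS>-PORTED-<COUNTRY>-<VERSION> (proposed form only)."""
    # Assumption: the country is a 2-letter code, as suggested in the thread.
    if not re.fullmatch(r"[A-Za-z]{2}", country):
        raise ValueError(f"expected a 2-letter country code, got {country!r}")
    # Assumption: versions look like "3.0" or "2.5".
    if not re.fullmatch(r"\d+\.\d+", version):
        raise ValueError(f"expected a version like '3.0', got {version!r}")
    return f"CC-{elements.upper()}-PORTED-{country.upper()}-{version}"


print(ported_identifier("by-sa", "us", "3.0"))  # CC-BY-SA-PORTED-US-3.0
```

Because the keyword sits between the license elements and the country code, "SA" before "-PORTED-" can only be read as ShareAlike.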

@bradleeedmondson
Contributor Author

I'll want to think more about this, but I do have initial, limited comments:

  • I do think that version numbers are normally at the end, but I don't think that should control here. I believe that the nomenclature for CC licenses has been CC-BY plus any further elements, then the version, then the jurisdiction (at least, this is how I've always used them and seen them used on Wikimedia Commons). For example: CC-BY-SA 3.0 US, CC-BY-NC 4.0 Intl, CC-BY-ND 2.0 unported, etc. I think we should stick with this convention, rather than adopting a new one.
  • Normally I would agree with David's suggestion to rely on established country codes, but in this case I think the potential for ambiguity means the standardization isn't worth it. As he points out, SA is Saudi Arabia, and NC is New Caledonia. ND isn't assigned, but isn't reserved for user assignment either.
  • I do, however, like David's suggestion to require showing that a license is in use before adding it. This is already a requirement for inclusion on the SPDX License List, and by continuing to observe it here we can pick out the licenses that will be worth adding to the list.
  • The issue of translations is different but somewhat overlapping. However, a CC license need not be translated to be "ported" to comply with jurisdictional requirements, so even if we were to explicitly decide that translated licenses would not be separately represented on the SPDX License List (i.e., the choice that results in the fewest number of entries on the License List), we would still need to address this issue of unambiguously identifying ported, unported, international/4.0 CC licenses.
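
The ambiguity concern raised above can be made concrete with a small Python sketch. The table of country codes is a deliberately tiny, illustrative subset of ISO 3166-1 alpha-2 (SA and NC are the codes named in this thread; BY, US and DE are added here for the example), and `ambiguous_tokens` is a hypothetical helper.

```python
# CC license element abbreviations.
CC_ELEMENTS = {"BY", "SA", "NC", "ND"}

# Illustrative subset of ISO 3166-1 alpha-2 codes, not the full list.
# BY (Belarus) is not mentioned in the thread; it is included here because
# it collides with the CC "Attribution" element as well.
COUNTRY_CODES = {
    "BY": "Belarus",
    "SA": "Saudi Arabia",
    "NC": "New Caledonia",
    "US": "United States",
    "DE": "Germany",
}


def ambiguous_tokens(identifier: str) -> list[str]:
    """Return tokens that are both a CC element and a country code."""
    return [
        token
        for token in identifier.split("-")
        if token in CC_ELEMENTS and token in COUNTRY_CODES
    ]


print(ambiguous_tokens("CC-BY-NC-SA-3.0"))  # ['BY', 'NC', 'SA']
print(ambiguous_tokens("CC-BY-ND-3.0-DE"))  # ['BY']
```

Whether this matters in practice depends on the syntax: if the country code may only appear in one fixed position, a parser never has to guess.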

@david-a-wheeler

I think it's very important that the Unported/International version be an unmarked "default" case, and that the existing markers be used for that. It's relatively rare that anyone would use a "ported" case for software, and creating new license identifiers for the normal case creates a lot of unnecessary work.

I still think the version number should be at the end, since that's really a different kind of information. Having a simple rule like "version number is always at the end" is very helpful.

I don't think using country codes is ambiguous - you just need to ensure the syntax isn't ambiguous in context. A marker like "-PORTED-" does the job nicely. Putting the country code after the version number also does that, though I'm less excited about that approach.

@mlinksva
Contributor

mlinksva commented Mar 13, 2018

All of the "ports" that CC has made are listed in https://creativecommons.org/licenses/index.rdf (note one, igo, does not correspond to a country code).

I'd go with the scheme reflected in their URLs, e.g., https://creativecommons.org/licenses/by-sa/3.0/de/ -> CC-BY-SA-3.0-DE.
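
A minimal Python sketch of that URL-to-identifier mapping follows. The helper name `cc_url_to_identifier` is hypothetical, and the handling of `legalcode` and `deed` path segments is an assumption about how CC pages are laid out; the function does not check whether the result actually appears on the SPDX License List.

```python
from urllib.parse import urlparse


def cc_url_to_identifier(url: str) -> str:
    """Derive an identifier such as CC-BY-SA-3.0-DE from a CC license URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3 or parts[0] != "licenses":
        raise ValueError(f"not a recognised CC license URL: {url!r}")

    elements, version = parts[1], parts[2]
    pieces = ["CC", elements.upper(), version]

    # A fourth path segment, when present, names the port (e.g. "de").
    # Assumption: segments such as "legalcode" or "deed.en" are page names,
    # not jurisdictions, so they are skipped.
    if len(parts) > 3 and not parts[3].startswith(("legalcode", "deed")):
        pieces.append(parts[3].upper())

    return "-".join(pieces)


print(cc_url_to_identifier("https://creativecommons.org/licenses/by-sa/3.0/de/"))
# CC-BY-SA-3.0-DE
print(cc_url_to_identifier("https://creativecommons.org/licenses/by-sa/3.0/"))
# CC-BY-SA-3.0
```

With this scheme the unported URL maps to the existing unmarked identifier, so nothing already in use would need to be renamed.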

Some CC licenses have official translations, but as noted that is a different thing. Each "port" should be considered a different license. Whether it's worth adding them all to this repository is another question.

cc @robmyers for possible thoughts from CC.

@david-a-wheeler

@mlinksva - you could document the convention, instead of individual examples, and then cross-reference to https://creativecommons.org/licenses/index.rdf. There's nothing wrong with a specification referring to another specification. Up to this point SPDX has listed every entry separately, but there's no strict need for that. While I like having version numbers at the end, if that really isn't what is done elsewhere, I can see the value in following existing conventions from others.

@silverhook
Collaborator

I was about to suggest using an underscore to make it more transparent that the country code belongs to the version (e.g. CC-BY-SA-3.0_DE), but then noticed that we don't seem to allow that character in the specs.

In that case, I think @mlinksva’s suggestion is probably most fitting. I also agree that we should list only the specific language versions if they pop up (often enough). Otherwise, we could go through the whole spiel again with EUPL.
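
The character restriction mentioned above can be checked with a short Python sketch. It assumes the license-expression grammar in the SPDX specification, where an identifier string consists of letters, digits, "-" and "."; the names `IDSTRING` and `is_valid_idstring` are made up for this example.

```python
import re

# Assumed from the SPDX license-expression grammar:
# idstring = 1*(ALPHA / DIGIT / "-" / ".")
IDSTRING = re.compile(r"[A-Za-z0-9.\-]+")


def is_valid_idstring(candidate: str) -> bool:
    """Return True if every character is allowed in an SPDX idstring."""
    return IDSTRING.fullmatch(candidate) is not None


print(is_valid_idstring("CC-BY-SA-3.0-DE"))  # True
print(is_valid_idstring("CC-BY-SA-3.0_DE"))  # False
```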

@mlinksva
Contributor

I also agree that we should list only the specific language versions if they pop up (often enough). Otherwise, we could go through the whole spiel again with EUPL.

Just to emphasize, that's a different issue (#438). Some CC licenses do have multiple official languages (notably CC-*-4.0 i.e., "unported" licenses, CC0-1.0, and certain jurisdiction "ports" for ones you might expect such as Canada and Spain) though not as many as EUPL has!

@silverhook
Collaborator

Arguably the EUPL translations say the same or not. In the end what matters is that all. I agree that the CC and EUPL handle this a bit differently (as well as CC up to v3 vs CC v4), thanks for reminding me.

@jlovejoy
Member

Hi all - so, sounds like there are two questions here:

  1. Add "unported" or "international" to the full names of the licenses already on the SPDX License List. I agree this is non-controversial and could be done for 3.1 (if someone wants to help :)
  2. The bigger question of adding more licenses. This probably needs more thought/discussion and thus can be tagged for a later release.

Agree?

@DennisClark

Hi all - I agree completely with the recent comments from @jlovejoy

  • no problem with the full name change
  • we should definitely wait for a later release to do anything else

@jlovejoy
Member

jlovejoy commented Apr 5, 2018

note: change to full names was accepted in 3.1 release. leaving issue open for other discussion here.

@larsgw
Contributor

larsgw commented Jul 14, 2021

I do, however, like David's suggestion to require showing that a license is in use before adding it. This is already a requirement for inclusion on the SPDX License List, and by continuing to observe it here we can pick out the licenses that will be worth adding to the list.

Is the "in use" requirement specific to usage directly related to software? I am building a dataset of various books and articles with a specific topic and if they have a ported CC license I would still like to be able to refer to it with SPDX identifiers. However, although I am using the identifiers in a dataset (+ an accompanying website), the licenses themselves are used by non-software items.

See #1285 (accepted, PR #1291) and #1295, but I am encountering more.

@seabass-labrax
Contributor

seabass-labrax commented Jul 20, 2021

Is the "in use" requirement specific to usage directly related to software?

No, usage other than in software is completely fine for fulfilling this requirement, @larsgw :) Indeed, there are emerging uses of SPDX documents in fields such as hardware, so the non-software licenses are likely to be an increasingly important part of the SPDX License List.

I am building a dataset of various books and articles with a specific topic and if they have a ported CC license I would still like to be able to refer to it with SPDX identifiers.

Sounds great! We always like to hear about how SPDX license identifiers are being used, so feel free to bring this up on the SPDX Legal Team mailing list or our IRC channel (#spdx on Libera.Chat).

@swinslow
Member

swinslow commented Dec 9, 2021

Looking back at this old issue:

  1. For the naming question, it looks like the "Full Name" field for the CC licenses has been better aligned with CC's names for them -- though there are still some minor differences in spacing and hyphenation. E.g., for CC-BY-SA-3.0 the SPDX "Full Name" says "Creative Commons Attribution Share Alike 3.0 Unported" but the CC name appears to be "Creative Commons Attribution-ShareAlike 3.0 Unported". This could be worth cleaning up.
  2. For the translations / ports, I think over the past few years since this original discussion, we've gotten into a pretty good cadence and practice of adding translations if and when they're shown to have real-world use, and to handle the version numbers as seen at https://spdx.org/licenses/. So I think we're basically all set on this.
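
For anyone picking up item 1, one way to find names that differ only in spacing and hyphenation is to compare them after stripping both. The Python sketch below is one possible approach, not an existing SPDX tool; `normalize_name` is a hypothetical helper, and the two strings are the ones quoted above.

```python
import re


def normalize_name(name: str) -> str:
    """Drop whitespace and hyphens and ignore case, for loose comparison."""
    return re.sub(r"[\s\-]+", "", name).casefold()


spdx_name = "Creative Commons Attribution Share Alike 3.0 Unported"
cc_name = "Creative Commons Attribution-ShareAlike 3.0 Unported"

print(spdx_name == cc_name)                                  # False
print(normalize_name(spdx_name) == normalize_name(cc_name))  # True
```

Pairs that are unequal as written but equal after normalization are the candidates for a spacing or hyphenation cleanup.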

So let's see if there is a volunteer to pick up item 1 above, if so then we can assign this to them, or otherwise I think we can close this one out.

@jlovejoy
Member

jlovejoy commented Dec 9, 2021

I'd say close it. Looks like we don't have dashes in the full names, but for one (a more recent addition). I think that was a result of this issue and making things consistent. Probably should have been closed back then!

@jlovejoy jlovejoy closed this as completed Dec 9, 2021
@jlovejoy jlovejoy added minor updates to file URL update, notes update, etc. and removed discuss on legal call labels Dec 9, 2021

9 participants