Detection rule needed for "3-Clause BSD License" #3069

Open
DennisClark opened this issue Aug 26, 2022 · 2 comments

Comments

@DennisClark

A scan of tomviz-2.0.0-rc1 (available from https://github.com/OpenChemistry/tomviz/archive/refs/tags/2.0.0-rc1.tar.gz) using scancode-toolkit-31.0.2 (results attached as tomviz-2.0.0-rc1-results.json.zip) returned the correct declared license of bsd-new from the project LICENSE file, but it also returned 425 instances of unknown-license-reference from files that contain the following string:

It is released under the 3-Clause BSD License, see "LICENSE". */

For example, the string can be found in file tomviz-2.0.0-rc1/tests/cxx/AcquisitionClientTest.cxx

Since "3-Clause BSD License" is an obvious reference to bsd-new, this problem can probably be fixed with an addition or modification to the license detection rules.
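To reproduce the count from a results file, something like the following could be used. This is only a sketch: it assumes each file entry in the scan JSON carries a `path` and a `license_expressions` list, which may not match every ScanCode output version, and the helper name `files_with_expression` is made up for illustration.

```python
def files_with_expression(scan, key="unknown-license-reference"):
    """Return the paths of files whose license expressions mention `key`.

    Assumption: `scan` is the loaded results JSON and each entry in
    scan["files"] has "path" and "license_expressions" fields.
    """
    hits = []
    for entry in scan.get("files", []):
        for expression in entry.get("license_expressions") or []:
            # Split the expression into license keys, ignoring parentheses.
            tokens = expression.replace("(", " ").replace(")", " ").split()
            if key in tokens:
                hits.append(entry["path"])
                break
    return hits
```

With a real results file, `scan` would come from `json.load()` on the unzipped attachment, and `len(files_with_expression(scan))` would give the number of affected files.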

Scan results attached.

@AyanSinhaMahapatra

AyanSinhaMahapatra commented Sep 5, 2022

This is already fixed in the LicenseDetection branch for the upcoming release: https://github.com/nexB/scancode-toolkit/tree/add-license-detection.

How this is solved:

Issue 1

  1. We see an unknown-license-reference in a license match that has a referenced_filename (here, the text refers to a file named LICENSE). This check also runs when the match is not unknown, i.e. we verify against the referenced file even if we detect the license correctly.
  2. We search for a file named LICENSE in the same directory or at the scan root. (Here it was present at the scan root.)
  3. If we find the file, we check whether it has valid license detections.
  4. If the file has valid license detections, we remove the unknown-license-reference and add the license expression found in the file to this detection.
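The steps above can be sketched roughly as follows. This is not the code from the add-license-detection branch: the function names, the `expressions_by_path` mapping (file path to the license expression detected in that file), and the test for a "valid" detection are all assumptions made for illustration. The sketch only covers the unknown-reference case, not the verification of already-correct matches.

```python
from pathlib import PurePosixPath


def is_valid_expression(expression):
    # Assumption: a detection is "valid" if it is non-empty and not unknown.
    return bool(expression) and "unknown" not in expression


def find_referenced_expression(file_path, filename, expressions_by_path, scan_root):
    """Look for `filename` next to `file_path`, then at the scan root (steps 2-3)."""
    same_dir = PurePosixPath(file_path).parent / filename
    at_root = PurePosixPath(scan_root) / filename
    for candidate in (same_dir, at_root):
        expression = expressions_by_path.get(str(candidate))
        if is_valid_expression(expression):
            return expression
    return None


def resolve_detection(file_path, matches, expressions_by_path, scan_root):
    """Replace resolvable unknown references with the referenced file's expression (step 4)."""
    resolved = []
    for match in matches:
        expression = match["license_expression"]
        if "unknown" in expression:
            for filename in match.get("referenced_filenames", []):
                found = find_referenced_expression(
                    file_path, filename, expressions_by_path, scan_root
                )
                if found:
                    expression = found
                    break
        if expression not in resolved:
            resolved.append(expression)
    return " AND ".join(resolved)
```

For the tomviz case, the match for `see "LICENSE"` carries `referenced_filenames: ["LICENSE"]`, the file is found at the scan root, and the combined expression collapses to bsd-new. If no LICENSE file is found, the unknown-license-reference stays in the expression.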

Here is a sample of how the LicenseDetection for the text It is released under the 3-Clause BSD License, see "LICENSE". */ currently looks:

Here "unknown-reference-to-local-file" means that an unknown license reference to a local file was resolved successfully.


      "detected_license_expression": "bsd-new",
      "detected_license_expression_spdx": "BSD-3-Clause",
      "license_detections": [
        {
          "license_expression": "bsd-new",
          "detection_rules": [
            "unknown-reference-to-local-file"
          ],
          "matches": [
            {
              "score": 100.0,
              "start_line": 2,
              "end_line": 2,
              "matched_length": 6,
              "match_coverage": 100.0,
              "matcher": "2-aho",
              "license_expression": "bsd-new",
              "rule_identifier": "bsd-new_682.RULE",
              "referenced_filenames": [],
              "is_license_text": false,
              "is_license_notice": true,
              "is_license_reference": false,
              "is_license_tag": false,
              "is_license_intro": false,
              "rule_length": 6,
              "rule_relevance": 100,
              "matched_text": "under the 3-Clause BSD License,",
              "licenses": [
                {
                  "key": "bsd-new",
                  "name": "BSD-3-Clause",
                  "short_name": "BSD-3-Clause",
                  "category": "Permissive",
                  "is_exception": false,
                  "is_unknown": false,
                  "owner": "Regents of the University of California",
                  "homepage_url": "http://www.opensource.org/licenses/BSD-3-Clause",
                  "text_url": "http://www.opensource.org/licenses/BSD-3-Clause",
                  "reference_url": "https://scancode-licensedb.aboutcode.org/bsd-new",
                  "scancode_text_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/bsd-new.LICENSE",
                  "scancode_data_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/bsd-new.yml",
                  "spdx_license_key": "BSD-3-Clause",
                  "spdx_url": "https://spdx.org/licenses/BSD-3-Clause"
                }
              ]
            },
            {
              "score": 100.0,
              "start_line": 2,
              "end_line": 2,
              "matched_length": 2,
              "match_coverage": 100.0,
              "matcher": "2-aho",
              "license_expression": "unknown-license-reference",
              "rule_identifier": "unknown-license-reference_see-license_1.RULE",
              "referenced_filenames": [
                "LICENSE"
              ],
              "is_license_text": false,
              "is_license_notice": false,
              "is_license_reference": true,
              "is_license_tag": false,
              "is_license_intro": false,
              "rule_length": 2,
              "rule_relevance": 100,
              "matched_text": "see \"LICENSE\". */",
              "licenses": [
                {
                  "key": "unknown-license-reference",
                  "name": "Unknown License file reference",
                  "short_name": "Unknown License reference",
                  "category": "Unstated License",
                  "is_exception": false,
                  "is_unknown": true,
                  "owner": "Unspecified",
                  "homepage_url": null,
                  "text_url": "",
                  "reference_url": "https://scancode-licensedb.aboutcode.org/unknown-license-reference",
                  "scancode_text_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/unknown-license-reference.LICENSE",
                  "scancode_data_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/unknown-license-reference.yml",
                  "spdx_license_key": "LicenseRef-scancode-unknown-license-reference",
                  "spdx_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/unknown-license-reference.LICENSE"
                }
              ]
            },
            {
              "score": 100.0,
              "start_line": 4,
              "end_line": 27,
              "matched_length": 216,
              "match_coverage": 100.0,
              "matcher": "2-aho",
              "license_expression": "bsd-new",
              "rule_identifier": "bsd-new_105.RULE",
              "referenced_filenames": [],
              "is_license_text": true,
              "is_license_notice": false,
              "is_license_reference": false,
              "is_license_tag": false,
              "is_license_intro": false,
              "rule_length": 216,
              "rule_relevance": 100,
              "matched_text": "Redistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions are met:\n\n1. Redistributions of source code must retain the above copyright notice, this\n    list of conditions and the following disclaimer.\n\n2. Redistributions in binary form must reproduce the above copyright notice,\n    this list of conditions and the following disclaimer in the documentation\n    and/or other materials provided with the distribution.\n\n3. Neither the name of the copyright holder nor the names of its contributors\n    may be used to endorse or promote products derived from this software\n    without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\" AND\nANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED\nWARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\nDISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE\nFOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\nDAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\nSERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\nCAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR\nTORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF\nTHIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.",
              "licenses": [
                {
                  "key": "bsd-new",
                  "name": "BSD-3-Clause",
                  "short_name": "BSD-3-Clause",
                  "category": "Permissive",
                  "is_exception": false,
                  "is_unknown": false,
                  "owner": "Regents of the University of California",
                  "homepage_url": "http://www.opensource.org/licenses/BSD-3-Clause",
                  "text_url": "http://www.opensource.org/licenses/BSD-3-Clause",
                  "reference_url": "https://scancode-licensedb.aboutcode.org/bsd-new",
                  "scancode_text_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/bsd-new.LICENSE",
                  "scancode_data_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/bsd-new.yml",
                  "spdx_license_key": "BSD-3-Clause",
                  "spdx_url": "https://spdx.org/licenses/BSD-3-Clause"
                }
              ]
            }
          ]
        }
      ],
      "license_clues": [],
      "percentage_of_license_text": 0.97,

Issue 2

There is also a second issue in this example, which is another type of license detection problem we are solving:

The declared_license_expression for the detected package is bsd-new AND free-unknown in your scan, but it should be just bsd-new.

Here the detection rule is "unknown-intro-followed-by-match": an unknown license intro is followed by a proper detection, so the unknown can be removed. This is achieved by tagging specific rules with is_license_intro set to True.
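A minimal sketch of this filtering, assuming matches are plain dicts with the `license_expression` and `is_license_intro` fields shown in the scan output. The function names and the exact "followed by a proper detection" test are hypothetical and are not taken from the branch.

```python
def drop_unknown_intros(matches):
    """Drop intro matches that are directly followed by a known license match."""
    kept = []
    for index, match in enumerate(matches):
        following = matches[index + 1] if index + 1 < len(matches) else None
        followed_by_known = (
            following is not None
            and not following.get("is_license_intro", False)
            and "unknown" not in following["license_expression"]
        )
        if match.get("is_license_intro", False) and followed_by_known:
            continue  # the intro adds nothing once a proper match follows it
        kept.append(match)
    return kept


def combined_expression(matches):
    """Join the remaining, de-duplicated expressions with AND."""
    expressions = []
    for match in drop_unknown_intros(matches):
        if match["license_expression"] not in expressions:
            expressions.append(match["license_expression"])
    return " AND ".join(expressions)
```

Applied to the classifier `License :: OSI Approved :: BSD 3-Clause`, the free-unknown intro match is dropped and only bsd-new remains. An intro with nothing after it is kept, so the unknown is still reported.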

License scan results for the package when scanned from the LicenseDetection branch:


"declared_license_expression": "bsd-new",
      "declared_license_expression_spdx": "BSD-3-Clause",
      "license_detections": [
        {
          "license_expression": "bsd-new",
          "detection_rules": [
            "not-combined"
          ],
          "matches": [
            {
              "score": 100.0,
              "start_line": 1,
              "end_line": 1,
              "matched_length": 3,
              "match_coverage": 100.0,
              "matcher": "1-hash",
              "license_expression": "bsd-new",
              "rule_identifier": "bsd-new_10.RULE",
              "referenced_filenames": [],
              "is_license_text": false,
              "is_license_notice": false,
              "is_license_reference": true,
              "is_license_tag": false,
              "is_license_intro": false,
              "rule_length": 3,
              "rule_relevance": 100,
              "matched_text": "BSD 3-Clause",
              "licenses": [
                {
                  "key": "bsd-new",
                  "name": "BSD-3-Clause",
                  "short_name": "BSD-3-Clause",
                  "category": "Permissive",
                  "is_exception": false,
                  "is_unknown": false,
                  "owner": "Regents of the University of California",
                  "homepage_url": "http://www.opensource.org/licenses/BSD-3-Clause",
                  "text_url": "http://www.opensource.org/licenses/BSD-3-Clause",
                  "reference_url": "https://scancode-licensedb.aboutcode.org/bsd-new",
                  "scancode_text_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/bsd-new.LICENSE",
                  "scancode_data_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/bsd-new.yml",
                  "spdx_license_key": "BSD-3-Clause",
                  "spdx_url": "https://spdx.org/licenses/BSD-3-Clause"
                }
              ]
            }
          ]
        },
        {
          "license_expression": "bsd-new",
          "detection_rules": [
            "unknown-intro-followed-by-match"
          ],
          "matches": [
            {
              "score": 100.0,
              "start_line": 1,
              "end_line": 1,
              "matched_length": 3,
              "match_coverage": 100.0,
              "matcher": "2-aho",
              "license_expression": "free-unknown",
              "rule_identifier": "pypi_osi_approved.RULE",
              "referenced_filenames": [],
              "is_license_text": false,
              "is_license_notice": false,
              "is_license_reference": false,
              "is_license_tag": false,
              "is_license_intro": true,
              "rule_length": 3,
              "rule_relevance": 100,
              "matched_text": "License :: OSI Approved ::",
              "licenses": [
                {
                  "key": "free-unknown",
                  "name": "Free unknown license detected but not recognized",
                  "short_name": "Free unknown",
                  "category": "Unstated License",
                  "is_exception": false,
                  "is_unknown": true,
                  "owner": "Unspecified",
                  "homepage_url": null,
                  "text_url": "",
                  "reference_url": "https://scancode-licensedb.aboutcode.org/free-unknown",
                  "scancode_text_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/free-unknown.LICENSE",
                  "scancode_data_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/free-unknown.yml",
                  "spdx_license_key": "LicenseRef-scancode-free-unknown",
                  "spdx_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/free-unknown.LICENSE"
                }
              ]
            },
            {
              "score": 100.0,
              "start_line": 1,
              "end_line": 1,
              "matched_length": 3,
              "match_coverage": 100.0,
              "matcher": "2-aho",
              "license_expression": "bsd-new",
              "rule_identifier": "bsd-new_10.RULE",
              "referenced_filenames": [],
              "is_license_text": false,
              "is_license_notice": false,
              "is_license_reference": true,
              "is_license_tag": false,
              "is_license_intro": false,
              "rule_length": 3,
              "rule_relevance": 100,
              "matched_text": "BSD 3-Clause']",
              "licenses": [
                {
                  "key": "bsd-new",
                  "name": "BSD-3-Clause",
                  "short_name": "BSD-3-Clause",
                  "category": "Permissive",
                  "is_exception": false,
                  "is_unknown": false,
                  "owner": "Regents of the University of California",
                  "homepage_url": "http://www.opensource.org/licenses/BSD-3-Clause",
                  "text_url": "http://www.opensource.org/licenses/BSD-3-Clause",
                  "reference_url": "https://scancode-licensedb.aboutcode.org/bsd-new",
                  "scancode_text_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/bsd-new.LICENSE",
                  "scancode_data_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/bsd-new.yml",
                  "spdx_license_key": "BSD-3-Clause",
                  "spdx_url": "https://spdx.org/licenses/BSD-3-Clause"
                }
              ]
            }
          ]
        }
      ],
      "other_license_expression": null,
      "other_license_expression_spdx": null,
      "other_license_detections": [],
      "extracted_license_statement": "{'license': 'BSD 3-Clause', 'classifiers': ['License :: OSI Approved :: BSD 3-Clause']}",
      

@AyanSinhaMahapatra

Attaching the full scan results here:

tomviz.json.txt
