initial implementation of scan, parse, and satisfies #1

Merged: elrayle merged 2 commits into main from elr/scan-and-parse on Sep 5, 2022

Collaborator

@elrayle elrayle commented Aug 29, 2022

Major Components

  • license - defines active, deprecated, and exception licenses; defines license ranges
    NOTE: Currently hardcoded. We may want to pull these lists from an external canonical source in the future to avoid the long-term maintenance of keeping them up to date.
  • scan - breaks the expression string into tokens
  • parse - validates tokens and creates nodes for the expected parts
  • node - defines node structures and holds basic comparison/validation code for two license nodes
  • compare - checks whether two license IDs are equal or within compatible ranges
  • satisfies - checks whether two license expressions are compatible
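
As a rough illustration of the scan step listed above, the sketch below splits an expression into operator and license tokens. It is a simplified stand-in, not the code in this PR: the `token` type, the `tokenize` function, and the role constants are hypothetical names, and DocumentRef/LicenseRef and exception tokens are not distinguished from license tokens.

```go
package main

import (
	"fmt"
	"strings"
)

// role marks what kind of token was scanned (hypothetical names).
type role int

const (
	operatorToken role = iota
	licenseToken
)

type token struct {
	role  role
	value string
}

// tokenize splits an expression on whitespace and parentheses.
// AND, OR, WITH, "(" and ")" become operator tokens; every other
// word is assumed to be a license identifier.
func tokenize(expr string) []token {
	// Pad parentheses with spaces so strings.Fields separates them.
	expr = strings.NewReplacer("(", " ( ", ")", " ) ").Replace(expr)
	var tokens []token
	for _, word := range strings.Fields(expr) {
		switch word {
		case "(", ")", "AND", "OR", "WITH":
			tokens = append(tokens, token{role: operatorToken, value: word})
		default:
			tokens = append(tokens, token{role: licenseToken, value: word})
		}
	}
	return tokens
}

func main() {
	for _, t := range tokenize("(MIT AND Apache-1.0+) OR GPL-2.0") {
		fmt.Printf("role=%d value=%q\n", t.role, t.value)
	}
}
```

The parse step would then consume this token slice and build nodes from it.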

Remaining Work

Satisfies needs to be extended to support simple conjunctive expressions (e.g. "MIT AND Apache-2.0", "MIT OR Apache-2.0", etc.) and complex deep tree expressions (e.g. " (MIT AND Apache-1.0+) OR DocumentRef-spdx-tool-1.2:LicenseRef-MIT-Style-2 OR (GPL-2.0 WITH Bison-exception-2.2)").

Includes the start of the comparison methods needed for Satisfies:
* start of satisfies translation from JavaScript to Go
* add comparator code to determine >, <, = for two licenses
* add ranges for licenses
* normalize -or-later, -only, + in license identifier
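
To make the normalization item concrete, here is a minimal sketch of one way to reduce the `-or-later`, `-only`, and `+` suffixes to a base identifier plus an "or later" flag. The function name `normalize` and its return shape are assumptions for illustration; the PR's actual representation may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize strips a version suffix from a license identifier and
// reports whether the identifier allows later versions.
// Hypothetical helper, not the function used in this PR.
func normalize(id string) (base string, orLater bool) {
	switch {
	case strings.HasSuffix(id, "-or-later"):
		return strings.TrimSuffix(id, "-or-later"), true
	case strings.HasSuffix(id, "-only"):
		return strings.TrimSuffix(id, "-only"), false
	case strings.HasSuffix(id, "+"):
		return strings.TrimSuffix(id, "+"), true
	default:
		return id, false
	}
}

func main() {
	for _, id := range []string{"GPL-2.0-or-later", "GPL-2.0-only", "GPL-2.0+", "MIT"} {
		base, orLater := normalize(id)
		fmt.Printf("%s -> base=%s orLater=%t\n", id, base, orLater)
	}
}
```

With both spellings reduced to the same pair, `GPL-2.0+` and `GPL-2.0-or-later` can be compared by the same code path.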

Known direct license comparisons that fail in Go but pass in JavaScript:

```
satisfies("Apache-3.0", "Apache-2.0+") // Apache-3.0 doesn't exist
satisfies("Apache-1.0+", "Apache-2.0+") // not in same range
satisfies("GPL-2.0", "GPL-2.0+") // not in same range
satisfies("GPL-2.0", "GPL-2.0-or-later") // not in same range
```
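
One way to reason about the "same range" cases above is to treat a range as an ordered list of versions and compare positions in it. The sketch below is only an assumption about how such a check could look: the `gplRange` contents are illustrative, and `position` and `atLeast` are hypothetical helpers rather than functions from this PR.

```go
package main

import "fmt"

// gplRange is an illustrative ordering, oldest version first.
// The real lists in the license component are assumed to be larger.
var gplRange = []string{"GPL-1.0", "GPL-2.0", "GPL-3.0"}

// position returns the index of id in gplRange, or -1 if it is absent.
func position(id string) int {
	for i, v := range gplRange {
		if v == id {
			return i
		}
	}
	return -1
}

// atLeast reports whether candidate is in the range and is the same
// version as, or a later version than, minimum.
func atLeast(candidate, minimum string) bool {
	c, m := position(candidate), position(minimum)
	return c >= 0 && m >= 0 && c >= m
}

func main() {
	fmt.Println(atLeast("GPL-2.0", "GPL-2.0"))    // same version
	fmt.Println(atLeast("GPL-3.0", "GPL-2.0"))    // later version
	fmt.Println(atLeast("GPL-1.0", "GPL-2.0"))    // earlier version
	fmt.Println(atLeast("Apache-3.0", "GPL-2.0")) // not in this range
}
```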
Contributor

@dangoor dangoor left a comment

Looks good, and definitely looks like some tricky porting from JavaScript!

There's no need to make a change like this, but I thought I'd mention it in case it makes Satisfies clearer/easier to implement. This is fresh in my mind having just worked on validation code that has to work across types… What I'm about to say only makes sense if Satisfies only takes a particular license or GPL-2.0-or-later kind of thing and not full expressions.

If you wanted Node to be polymorphic, you could think in terms of the operations that need to be performed. Satisfies is the big one, but maybe there are others (greater than, less than). You then make an interface (e.g. Node) with a method like Satisfies(string) bool. Then you can have a License struct that implements that interface and contains info about a single license (or a license range, whatever makes sense for that). Then you can have an And struct that only passes Satisfies if the two Nodes it contains pass Satisfies.

Not sure if that makes any sense or seems helpful at all.
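
A minimal sketch of the interface idea described in this comment, assuming Satisfies takes a single license ID. `Node`, `License`, and `And` follow the names used above; the `Or` type and the exact-match comparison in `License` are additions made here for illustration only.

```go
package main

import "fmt"

// Node is anything that can be checked against a single license ID.
type Node interface {
	Satisfies(license string) bool
}

// License holds one license ID and matches it exactly.
// A range-aware version would compare versions instead.
type License struct{ ID string }

func (l License) Satisfies(license string) bool { return l.ID == license }

// And passes only if both of its child nodes pass.
type And struct{ Left, Right Node }

func (a And) Satisfies(license string) bool {
	return a.Left.Satisfies(license) && a.Right.Satisfies(license)
}

// Or passes if either child node passes (not mentioned in the
// comment above; added as an assumption).
type Or struct{ Left, Right Node }

func (o Or) Satisfies(license string) bool {
	return o.Left.Satisfies(license) || o.Right.Satisfies(license)
}

func main() {
	either := Or{License{"MIT"}, License{"Apache-2.0"}}
	both := And{License{"MIT"}, License{"Apache-2.0"}}
	fmt.Println(either.Satisfies("MIT"))
	fmt.Println(both.Satisfies("MIT"))
}
```

Because every node type implements the same interface, a caller can evaluate a whole tree without switching on node kinds.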

Comment thread spdxexp/parse.go

func (t *tokenStream) parseAtom() *Node {
parenNode := t.parseParenthesizedExpression()
if t.err != nil {
Contributor

I'm reading this and the previous function as: `(MIT` (no closing paren) will just silently disappear from the expression rather than bubbling up an error. Is that right? It seems like that could cause trouble.

Collaborator Author

I'll add a follow-on PR to add a test.
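
For reference, the behavior asked about in this thread (surfacing an error for an unclosed parenthesis instead of dropping it) could look roughly like the sketch below. The `parser` type is a hypothetical, stripped-down stand-in for the PR's `tokenStream`; it handles only a bare license ID or a parenthesized one, and it returns an error rather than storing it on the struct.

```go
package main

import (
	"errors"
	"fmt"
)

// parser is a hypothetical stand-in for tokenStream.
type parser struct {
	tokens []string
	index  int
}

// parseAtom parses a bare license ID or a parenthesized atom.
// A missing ")" is reported as an error.
func (p *parser) parseAtom() (string, error) {
	if p.index >= len(p.tokens) {
		return "", errors.New("unexpected end of expression")
	}
	tok := p.tokens[p.index]
	p.index++
	switch tok {
	case ")":
		return "", errors.New("unexpected closing parenthesis")
	case "(":
		inner, err := p.parseAtom()
		if err != nil {
			return "", err
		}
		if p.index >= len(p.tokens) || p.tokens[p.index] != ")" {
			return "", errors.New("expected closing parenthesis")
		}
		p.index++ // consume ")"
		return inner, nil
	default:
		return tok, nil
	}
}

func main() {
	for _, toks := range [][]string{{"(", "MIT", ")"}, {"(", "MIT"}} {
		p := &parser{tokens: toks}
		id, err := p.parseAtom()
		fmt.Printf("%v -> id=%q err=%v\n", toks, id, err)
	}
}
```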

Comment thread spdxexp/parse.go
// Advances the index if the operator is found.
func (t *tokenStream) parseOperator(operator string) *string {
token := t.peek()
if token.role == OPERATOR_TOKEN && token.value == operator {
Contributor

Are these operators case sensitive? Is `MIT or GPL-2.0` allowed?

Collaborator Author

@elrayle elrayle Sep 5, 2022

The ABNF in the spec defines them as all caps, and the JavaScript parser expects all caps. In the JavaScript, the results of parsing convert them to all lower case. I'm not sure why they chose to do that. I did the same, expecting that as I moved farther into the code, it would become clear why this was done. But now that a significant chunk of the satisfies code is done, I don't see a need to convert them. For now, I will leave the conversion to lower case in, but removing this unnecessary conversion would be a very simple refactor.
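
As a small illustration of the behavior described here, the sketch below accepts only upper-case operators and returns the lower-cased form, mirroring the conversion mentioned above. `matchOperator` is a hypothetical helper, not the PR's `parseOperator`.

```go
package main

import (
	"fmt"
	"strings"
)

// matchOperator accepts only the upper-case spellings AND, OR, and
// WITH. On a match it returns the lower-cased operator; otherwise it
// returns an empty string and false.
func matchOperator(word string) (string, bool) {
	switch word {
	case "AND", "OR", "WITH":
		return strings.ToLower(word), true
	}
	return "", false
}

func main() {
	for _, w := range []string{"OR", "or", "WITH", "MIT"} {
		op, ok := matchOperator(w)
		fmt.Printf("%q -> op=%q ok=%t\n", w, op, ok)
	}
}
```

Under this sketch, `MIT or GPL-2.0` would not contain a recognized operator, because `or` is rejected.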

Comment thread spdxexp/scan.go
@@ -0,0 +1,293 @@
package spdxexp

/* Translation to GO from javascript code: https://github.com/clearlydefined/spdx-expression-parse.js/blob/master/scan.js */
Contributor

Nit: The O in Go is lower case 😃

Collaborator Author

@elrayle elrayle Sep 5, 2022

I'm going to give each Go file better documentation. I'll update this when I do that.

Comment thread spdxexp/scan_test.go
{role: LICENSE_TOKEN, value: "Apache-2.0"},
{role: OPERATOR_TOKEN, value: ")"},
}, nil},
{"kitchen sink", " (MIT AND Apache-1.0+) OR DocumentRef-spdx-tool-1.2:LicenseRef-MIT-Style-2 OR (GPL-2.0 WITH Bison-exception-2.2)",
Contributor

🙀

@elrayle elrayle merged commit b472351 into main Sep 5, 2022
@elrayle elrayle deleted the elr/scan-and-parse branch September 5, 2022 12:22
Collaborator Author

elrayle commented Sep 5, 2022

Thanks for the suggestions about another approach for the structuring of nodes. Once the translation is complete, I can take a deeper look at potential refactors that can simplify the code and make it more closely follow typical Go code.


3 participants