Relicense under another license #466

Open
KOLANICH opened this issue Aug 11, 2018 · 25 comments
@KOLANICH

I have nearly finished the part of kaitaiStructCompile.py that uses JPype, but found that there are licensing issues.
The problem is that the relevant part of the lib taps into ksc. I have thought about ways to avoid this, such as moving the module that links the lib into a separately installable plugin, discovered via setuptools entry_points and chosen at runtime, while the alternative plugin has the Unlicense license. But according to the FSF FAQ (it may have a biased view, but I'm not a lawyer and obviously don't want to hire one), this means that kaitaiStructCompile.py is also required to be licensed under the GPL.

If the main program uses fork and exec to invoke plug-ins, and they establish intimate communication by sharing complex data structures, or shipping complex data structures back and forth, that can make them one single combined program.

If the main program dynamically links plug-ins, and they make function calls to each other and share data structures, we believe they form a single combined program, which must be treated as an extension of both the main program and the plug-ins.

If they [a GPL plug-in for a nonfree program] form a single combined program this means that combination of the GPL-covered plug-in with the nonfree main program would violate the GPL.

Looking for some clarification, I found a stack(overflow|exchange) post (unfortunately I have closed it and couldn't find it again) that had a checklist and explained that from a legal PoV (even though the post was "not legal advice") it doesn't matter at all whether one's program links against the GPLed lib or includes it: if the program is nonfunctional without that lib, it is a derivative work. Surely this applies to kaitaiStructCompile.py, since there is no other Kaitai Struct. One may argue that this is nonsense, since an iOS app is non-functional without Apple's proprietary products too, but I'm not a lawyer.

So it seems there is no legal way to circumvent the GPL here other than to recreate KSC from scratch. I guess that creating a dummy implementation that has the same interface and does nothing is not a solution since (again, I'm not a lawyer) from the court's point of view it may look like the whole point of doing this is copyright violation.

Since kaitaiStructCompile.py is itself a plugin to setuptools, which is meant to be explicitly enabled by its users, this means that their setup.py also has to be GPLed, which is obviously unsuitable.

So I guess we need to change the license of the compiler. LGPL is not an option because, despite the linking, the tool is still non-functional without ksc, so the same arguments apply as for the GPL.

@GreyCat
Member

GreyCat commented Aug 12, 2018

Thanks for sharing this.

The very basic idea behind using GPL in the tooling is actually to keep tooling free/libre, i.e. available for everyone to use and contribute to. It's not LGPL exactly because the original intent was to keep tooling (which might make use of inner APIs of the compiler, i.e. IDEs, visualizers, build system plugins, etc), free, while keeping results of compilation under a license that would allow usage in closed-source products as well.

I agree that your use case is kind of borderline use here. I wonder what exactly do you use using JPype that you can't do using normal CLI API? If something's missing, may be we could export that using CLI API and/or compilation results?

Alternatively, if you can/want, you can relicense kaitaiStructCompile.py under GPL, either whole thing, or a separate subproject that uses JPype only...

@KOLANICH
Author

KOLANICH commented Aug 12, 2018

The very basic idea behind using GPL in the tooling is actually to keep tooling free/libre, i.e. available for everyone to use and contribute to.

Don't permissive licenses do it? The idea of GPL is not about freedom, it is about total war with everyone not using GPL or other licenses blessed in it.

I wonder what exactly do you use using JPype that you can't do using normal CLI API?

In fact for now I don't do or plan to do anything which cannot be done using CLI API, either immediately, or with some modifications to KSC (which may be covered by FSF vision of GPL violation).

If something's missing, may be we could export that using CLI API and/or compilation results?

1 fileless workflow. Now when using the cli the compiler dumps compiled files. Since the tool postprocesses them it is more preferable not to dump them on disk but to keep in memory.
2 Exceptions.
3 command line arguments parsing is security nightmare. I prefer to pass the params not via command line arguments, but via IPC in a structured way.
4 progress

Unfortunately, all of these match the FSF's "establish intimate communication by sharing complex data structures" criterion, so the FSF interpretation applies. I'm afraid that even --ksc-json-output falls under this.

Alternatively, if you can/want, you can relicense kaitaiStructCompile.py under GPL, either whole thing, or a separate subproject that uses JPype only...

Do you mean

__init__ (Unlicense) -> compiler.cli (Unlicense) =dumb cli without much integration=> ksc (GPL)
__init__ (Unlicense) -> compiler.jpype (GPL) -> ksc (GPL)

?

That is what I was trying to do first (I described this in the opening message), but I'm afraid this won't do the trick: just distributing the second variant is illegal according to the FSF's view (I guess "proprietary" in their wording can be replaced with "having any other license") and requires the whole module (including compiler.cli, because from that point the tool would definitely match the FSF criteria for a combined program) to be licensed under the GPL, which is plainly unsuitable because of its virality.
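
For illustration, the plugin split described here (a permissive core that picks a backend at runtime) could look roughly like the sketch below. The entry-point group name `kaitaistruct_compile.backends` and the backend names `jpype` and `cli` are assumptions made up for this example, not names taken from kaitaiStructCompile.py.

```python
from importlib.metadata import entry_points

# Hypothetical entry-point group; a real project would define its own name.
GROUP = "kaitaistruct_compile.backends"


def discover_backends(group=GROUP):
    """Return {name: entry_point} for every installed backend plugin."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are selected by group.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a plain dict keyed by group.
        selected = eps.get(group, ())
    return {ep.name: ep for ep in selected}


def choose_backend(available, preferred=("jpype", "cli")):
    """Pick the first preferred backend that is actually installed."""
    for name in preferred:
        if name in available:
            return name
    return None
```

Calling `discover_backends()[name].load()` would then import only the chosen plugin, so the core never imports the JPype-based module unless the user installed it. Whether that separation is enough from a licensing point of view is exactly the open question in this thread.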

There are GPL exceptions, but they must be approved by the FSF; otherwise it is prohibited to call the license GPL (the license text, except the preamble, is permitted to be used in other licenses).

@GreyCat
Member

GreyCat commented Aug 12, 2018

Don't permissive licenses do it? The idea of GPL is not about freedom

That's a deeply philosophical question, and I feel that we don't agree on this one with you. I'm not a huge fan of GPL myself, but in this particular case, it protects us from proprietary code forks which might easily get more popular due to large amount of resources a commercial company might put into marketing.

Currently, whole reverse engineering / low-level tooling world (with smaller companies like Hex-Rays, Vector 35, Sweetscape, and bigger fish like Intel, IBM, etc) is heavily dominated with proprietary software. I totally don't want to see any of them getting "Kaitai Struct" and all the work that went into it, integrating it into their closed-source software and start marketing this functionality as part of their own, especially if they will do some non-compatible modifications. GPL would allow them to integrate using ksc CLI calls, but they would be either forced to open source their fork (=> allows people to learn what was the original and come to us), or to not fork anything and stick to original (=> even better, people could just learn and come to us) => i.e. it's a win-win situation.

1 fileless workflow. Now when using the cli the compiler dumps compiled files. Since the tool postprocesses them it is more preferable not to dump them on disk but to keep in memory.

Actually, ksc for JS already does that. I can introduce the same using an option akin to --ksc-json-output (--ksc-json-targets?), which would dump generated files into output JSON in structured form.
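
A minimal sketch of driving the existing CLI from Python and reading the `--ksc-json-output` result from stdout, assuming the compiler is installed as `kaitai-struct-compiler` and accepts the usual `-t` and `--outdir` options. The helper names are hypothetical, and the shape of the returned JSON is deliberately not assumed here.

```python
import json
import subprocess

# Assumption: the launcher script is on PATH under this name.
KSC = "kaitai-struct-compiler"


def build_ksc_argv(ksy_path, target, outdir):
    """Build the argument vector for one compiler run."""
    # Arguments are passed as a list, so no shell parsing or quoting is involved.
    return [KSC, "--ksc-json-output", "-t", target, "--outdir", outdir, ksy_path]


def run_ksc(ksy_path, target="python", outdir="."):
    """Run the compiler and return its JSON report as Python objects."""
    proc = subprocess.run(
        build_ksc_argv(ksy_path, target, outdir),
        capture_output=True,
        text=True,
        check=False,  # errors are expected to be described in the JSON report
    )
    return json.loads(proc.stdout)
```

Generated sources still land in `outdir` with this approach; only the report travels through stdout, which is the gap the proposed option would close.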

2 Exceptions.

Um, what do you mean and what do you want to do with exceptions? --ksc-exceptions allows one to dump raw exceptions.

3 command line arguments parsing is security nightmare.

Right. We can try passing RuntimeConfig using JSON input file or something — i.e. --config /tmp/ksc.config.json?

4 progress

If you mean "compilation progress", then it's actually relatively hard to implement and I don't see much point, as compiler is relatively fast: compiling our whole test suite (148 files) takes ~4.5s. Compilers like javac or gcc work at similar pace and don't give much progress report beyond file being compiled (and even that is not always visible).

Do you mean

The very straightforward way is to just have 2 different projects: one is Unlicensed using CLI, and another is GPLed using JPype.

A somewhat less straightforward way would be to have 3 projects: two with inits (Unlicense+GPL), and third one with common code (Unlicensed, as you prefer it). Both GPLed and Unlicense can use your shared unlicensed code, if you want to avoid code duplication.

@KOLANICH
Author

KOLANICH commented Aug 23, 2018

Just realized that I have typed the answer but have forgotten to post it.

I can introduce the same using an option akin to --ksc-json-output (--ksc-json-targets?), which would dump generated files into output JSON in structured form.

I guess it's better to output a zero-compression zip archive to stdout. Serializing into JSON is damn inefficient because of escaping. BSON/msgpack may be a good choice, but zip archives are already supported by far more software.
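
As a rough illustration of the zero-compression zip idea, the sketch below packs a set of generated files into an in-memory archive with `ZIP_STORED` and reads them back. The function names and file names are made up for the example; nothing here is an existing ksc feature.

```python
import io
import zipfile


def pack(files):
    """Pack {name: bytes} into an uncompressed (stored) zip archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def unpack(blob):
    """Read every member of a zip archive held in memory back into a dict."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}
```

Because members are stored without compression or escaping, the consumer can read them without touching the disk, at the cost of the zip headers per file.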

Um, what do you mean and what do you want to do with exceptions? --ksc-exceptions allows one to dump raw exceptions.

We may serialize them into json.

We can try passing RuntimeConfig using JSON input file or something — i.e. --config /tmp/ksc.config.json?

Using stdin for that is better.
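
A small sketch of what passing the configuration over stdin could look like from the calling side. The child here is a stand-in Python one-liner rather than ksc, since ksc has no such input mode in this thread; the config keys are invented for the example.

```python
import json
import subprocess
import sys

# Stand-in child process: reads a JSON config from stdin and echoes its keys.
CHILD = "import json, sys; cfg = json.load(sys.stdin); print(json.dumps(sorted(cfg)))"


def send_config(config):
    """Send a config dict to the child over stdin and return the keys it saw."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=json.dumps(config),  # structured data, never parsed as argv
        text=True,
        capture_output=True,
        check=True,
    )
    return json.loads(proc.stdout)
```

Nothing from the config ends up on the command line, so it is not visible in the process list and is not subject to option parsing.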

compiling our whole test suite (148 files) takes ~4.5s.

Depends on hardware. For PIII it took about 2 seconds to compile 2 files (one of them is mifare_classic.ksy). I wonder how much of that is taken by JRE initialization and how much by actual compilation.

The very straightforward way is to just have 2 different projects: one is Unlicensed using CLI, and another is GPLed using JPype.

A somewhat less straightforward way would be to have 3 projects: two with inits (Unlicense+GPL), and third one with common code (Unlicensed, as you prefer it). Both GPLed and Unlicense can use your shared unlicensed code, if you want to avoid code duplication.

Not quite, the main concern here is the mentioned FSF interpretation of GPL. It fires even if we use CLI.

@KOLANICH
Author

KOLANICH commented Feb 4, 2019

@GreyCat, are you going to solve it somehow? I have almost finished the JVM backend (GraalVM support is added, though it won't work now: GraalPython is too unfinished now), but I still don't want to make people license their setup.py under GPL. I wonder if it is possible to add an exception, allowing linking it to build automation tools of any permissive license. Or for example limit GPL virality with the clauses like in https://github.com/kemitchell/shared-component-license.

@XVicarious

@KOLANICH https://tldrlegal.com/license/gnu-general-public-license-v3-(gpl-3)

You CAN call GPL licensed software in non-free software so long as you haven't modified it. The FSF specifically mention compilers and other tools in the FAQ as acceptable usage that doesn't violate the license.

@KOLANICH
Author

This is just an example:

If the two programs remain well separated, like the compiler and the kernel, or like an editor and a shell, then you can treat them as two separate programs—but you have to do it properly.

And that case is inapplicable in my case. In my case they are effectively a single program: I use KSC as a lib and this works much better than interacting with KSC via a CLI. Unfortunately the license of KSC prevents me from sharing the stuff (sharing the stuff under GPL is not an option because it would mean a requirement to license any programs that use the tool and are distributed under the GPL, which is unacceptable).

@KOLANICH
Author

KOLANICH commented Aug 10, 2020

@GreyCat, are you going to do anything about this? If not, please close this issue and put a wontfix label.

BTW, interacting directly with KS using JNI is still the preferred way. It may require some changes in the license. The options are:

  1. LGPL
  2. Well-known exceptions to GPL listed in the GPL text
  3. Custom exceptions to GPL (they have to be FSF-approved for the license to be called GPL, and compatible with GPL)
  4. A custom license not called GPL.

This may be accompanied by technical measures to separate the core from the interface. But it has to be done by the copyright holders or with their explicit public approval.

The second preferred way is a server communicating with external tools using a message queue in a shared memory mapping. This way is not very good, as it has some overhead. But the overhead should be smaller than when communicating via stdout.
Thentrird preferre

@abitrolly

Am I right that relicensing the compiler is not needed for the following?

  1. CLI
  2. sockets
  3. UDP / TCP
  4. HTTP
  5. gRPC etc.

Why is JNI preferred over the methods above?

@KOLANICH
Author

KOLANICH commented Sep 11, 2020

As for the options listed above:

1. Because JVM initialization is the major time-eater. Because passing args via CLI is insecure. Because fusing 2 processes into one (python + JVM) is the most efficient way to do that (except using GraalVM, which reimplements python upon JVM with all its performance benefits from JIT-compilation).

All the solutions with a local service have an additional drawback: one has to deal with the service starting, restarting and dying. All the solutions using the network have an additional issue: one has to deal with network issues.

2. Sockets, pipes and everything kernel-mediated and stream-based is inefficient. A better approach is a pair of shared memory pages plus a message queue using them.
3. Likely even slower, and more complex.
4. Even slower. If one wants it, one can implement it on top of another implementation.
5. I haven't used it, but IMHO depending on protobuf is not a very good idea. It always causes a bit of pain to install.

@KOLANICH KOLANICH reopened this Sep 11, 2020
@abitrolly

I see the obsession about performance in every argument. Which is strange in the case of Kaitai compiler looks like premature optimization anti-pattern to me. It only affects compilation. It is only run when .ksy files are changed. People don't need to run it in production. Only for development. What is your expected speed up for this change?

Shared memory + message queue is intra-process, unless you provide a service, which is again HTTP / TCP / UDP etc. to pass the credentials needed to access that.

@KOLANICH
Author

KOLANICH commented Sep 12, 2020

I see the obsession about performance in every argument.

It is not obsession, it is necessity.

Which is strange in the case of Kaitai compiler looks like premature optimization anti-pattern to me.

The JVM backend gives a noticeable benefit of a few seconds when used for bulk processing of .ksy files.

It only affects compilation. It is only run when .ksy files are changed.

Which makes the issue especially important because when we design a spec, we have to compile it a lot of times. Few seconds * 1000 ≈ few hours.

Shared memory + message queue is intra-process

A memory page (and its corresponding page frame) can be shared by multiple processes.

@abitrolly

I see not real numbers here. Can you address the unanswered question: what is your expected speedup for this change? And, more specifically, these questions.

  1. What is the current timing for the operation you want to speedup?
  2. Why are you doing this operation?
  3. What is your optimization target (which timing you need)?

@abitrolly

A memory page (and its corresponding page frame) can be shared by multiple processes.

You didn't answer how you pass the handle to this memory to another process. This will be the performance bottleneck.

@KOLANICH
Author

I see not real numbers here.

I have not measured. Just experience.

What is the current timing for the operation you want to speedup?

Loading everything needed into memory. Have not measured the time, know only that the difference is noticeable and in batch processing takes seconds.

Why are you doing this operation?

a. Initialization is probably unavoidable. A solution may be to compile KSC into native code ahead of time.
b. Another solution to reduce the impact may be to bulk process in a single call. To do that we need a way to supply different CLI args to different files. I have not tested whether this is possible. Also, it doesn't fit nicely with the API provided by kaitaiStructCompile.py, which may need a redesign into 2 stages: make a task and execute it.

What is your optimization target (which timing you need)?

As fast as possible. Should work fast even on Pentium 3 machines.

You didn't answer how you pass the handle to this memory to another process. This will be the performance bottleneck.

The handle can initially be passed using stdin.
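
To make the two claims above concrete (one memory page visible to two processes, with the handle handed over on stdin), here is a minimal sketch using a file-backed `mmap`. It is an assumption-laden toy: the "handle" is just a temporary file path, the one-byte length prefix is invented for the example, and there is no real message queue or synchronization, only a single request that the child rewrites in place.

```python
import mmap
import os
import subprocess
import sys
import tempfile

# Child process: reads the path of the backing file from stdin, maps the same
# page, and upper-cases the payload in place.
CHILD = r"""
import mmap, sys
path = sys.stdin.readline().strip()
with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as m:
    n = m[0]  # byte 0 holds the payload length
    m[1:1 + n] = m[1:1 + n].upper()
"""


def round_trip(payload):
    """Write payload into a shared page, let a child modify it, read it back."""
    if len(payload) > 255:
        raise ValueError("payload must fit the one-byte length prefix")
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, mmap.PAGESIZE)  # back exactly one page
        with mmap.mmap(fd, mmap.PAGESIZE) as m:
            m[0] = len(payload)
            m[1:1 + len(payload)] = payload
            # The "handle" (here just a file path) goes to the child via stdin.
            subprocess.run(
                [sys.executable, "-c", CHILD],
                input=path + "\n",
                text=True,
                check=True,
            )
            # The child's write is visible through the parent's own mapping.
            return m[1:1 + len(payload)]
    finally:
        os.close(fd)
        os.unlink(path)
```

The stdin hand-off happens once per child, so its cost is paid at startup rather than per message; the sketch says nothing about how a JVM-side consumer would be implemented.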

@abitrolly

I have not measured. Just experience.

The argument of a programmer, but not of a software engineer.

@abitrolly

The handle can initially be passed using stdin.

Running Kaitai as an HTTP server that accepts .ksy files will be more beneficial and even faster for much more developers than JNI + queue + CLI communication interface. Because the cost of JNI + queue + CLI client implementation is enormous compared to GET requests supported by every major programming languages.

@KOLANICH
Author

KOLANICH commented Sep 14, 2020

The argument of a programmer, but not of a software engineer.

Measuring it is an additional effort. Should be done definitely. For my purposes it is enough that I perceive them as faster.

Running Kaitai as an HTTP server that accepts .ksy files will be more beneficial and even faster for much more developers than JNI + queue + CLI communication interface.

Are you personally going to implement it and merge it into the upstream? If yes, I'm happy to add a backend to kaitaiStructCompile supporting this mechanism. But please note - if you personally implemented it, it'd be a GPL violation (in FSF and Mercurial understanding of the matter). You likely need every contributor's blessing to make it not a violation.

Because the cost of JNI + queue + CLI client implementation is enormous compared to GET requests supported by every major programming languages.

ZeroMQ has something similar, but for intra-process communication between threads. I wonder if it works for different processes too. I have not tested it though, and have no time for that currently.

@abitrolly

Non-GPL client sending API request to HTTP server is not a GPL violation.

@dgelessus
Contributor

But please note - if you personally implemented it, it'd be a GPL violation (in FSF and Mercurial understanding of the matter).

What "understanding" are you referring to here exactly? Mercurial actually has a "command server" mode that provides an API over a socket/pipe, and one of the main reasons why it was added is that it doesn't count as "combining" in the context of the GPL, so that other programs can use Mercurial through the command server without any GPL problems.

The suggested solution is to add an HTTP server mode to ksc, which (presumably) would take high-level requests of the form "compile the following spec file(s)" and return the compiled output. This is relatively similar to Mercurial's command server mode, and almost certainly doesn't count as "intimate" communication as the FSF calls it, so it won't count as "combining" under the GPL.
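
A rough sketch of the request/response shape being discussed: a client sends one spec as the body of an HTTP request and gets the generated files back as JSON. Everything here is hypothetical, including the `/compile` path, the response layout and the `fake_compile` stand-in; ksc has no HTTP mode, and this only illustrates how coarse-grained such an interface could be.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def fake_compile(ksy_text):
    """Stand-in for the real compiler: returns one 'generated' file."""
    return {"generated.py": "# compiled from %d chars of .ksy\n" % len(ksy_text)}


class CompileHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Request body is the spec text; response body is {filename: source}.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        payload = json.dumps(fake_compile(body)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the example quiet


def compile_over_http(ksy_text):
    """Start a throwaway local server, send one spec, return the parsed reply."""
    server = HTTPServer(("127.0.0.1", 0), CompileHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=10)
        conn.request("POST", "/compile", body=ksy_text.encode("utf-8"))
        reply = json.loads(conn.getresponse().read())
        conn.close()
        return reply
    finally:
        server.shutdown()
        server.server_close()
```

The client side needs only an HTTP library and a JSON parser, which is the portability argument made earlier in the thread; whether such an exchange counts as "intimate" communication is a legal question the code cannot settle.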

@KOLANICH
Author

KOLANICH commented Sep 14, 2020

What "understanding" are you referring to here exactly?

https://www.gnu.org/licenses/gpl-faq.en.html#GPLPlugins

If the main program uses fork and exec to invoke plug-ins, and they establish intimate communication by sharing complex data structures, or shipping complex data structures back and forth, that can make them one single combined program.

https://www.mercurial-scm.org/wiki/CommandServer#Licensing

However, if you modify Mercurial to export new functionality via the command server, that adds obligations for you under GPL.

@KOLANICH
Author

KOLANICH commented Sep 14, 2020

My not-a-lawyer interpretation:

  1. Copyright holders are in legal power to license their work as they wish.
  2. GPL is designed to be viral in order to spread GPL. FSF interpretation of GPL is also designed to give GPL as much virality as FSF considers necessary.
  3. Ones using GPL are divided into 2 kinds:
    a) ones who have to use GPL in order to fulfill GPL terms for dependencies
    b) copyright trolls that mean "I am the copyright holder and I decide what you are allowed to do with the code written by me"

FSF and Mercurial devs (read https://www.mercurial-scm.org/pipermail/mercurial/2011-March/037593.html mentioned in Mercurial license FAQ, it makes the main dev position very clear) clearly belong to the kind of copyright trolls. So do (under the necessary effort clean-room or black-box reverse engineering of git is meant) some Git developers.

@dgelessus
Contributor

If the main program uses fork and exec to invoke plug-ins, and they establish intimate communication by sharing complex data structures, or shipping complex data structures back and forth, that can make them one single combined program.

None of this applies to the hypothetical ksc HTTP interface, as I already said above.

However, if you modify Mercurial to export new functionality via the command server, that adds obligations for you under GPL.

I assume that the ksc HTTP interface won't allow callers to "export new functionality" (i. e. to load extra code into ksc or to implement complex callbacks), so this is also irrelevant here.

@KOLANICH
Author

None of this applies to the hypothetical ksc HTTP interface, as I already said above.

which (presumably) would take high-level requests of the form "compile the following spec file(s)" and return the compiled output.

How would it take multi-file requests (importing modules not in the stdlib) and return structured info about exceptions without what copyright trolls can consider as sharing complex data structures, or shipping complex data structures back and forth? Even a zip archive or a CBOR document used in a REST API can be considered a "complex structure" for the purpose of copyright trolling. The only way that has any chance of satisfying the trolls' definition, i.e. of avoiding structures that copyright trolls can call complex, is to limit ourselves to a very dumb and insufficient interface.

I assume that the ksc HTTP interface won't allow callers to "export new functionality" (i. e. to load extra code into ksc or to implement complex callbacks), so this is also irrelevant here.

If I understand right, the quote is not about plugins, but about forking Mercurial, implementing some API in the server subsystem in order to expose some functionality via that API, and then using this API in one's own code. So I understand the quote as equivalent to the statement "only some blessed authors of Mercurial decide whether they consider an API exposed via the server viral or not, and by default they consider (and authorize orgs like SFLC and SFC to consider) all API exposed that way as viral, but for all the API present in the upstream they have already decided that it is not".

@Thrameos

Thrameos commented Oct 8, 2020

My general feeling (I am not a lawyer) is that the GPL is intended to prevent tight integration that would allow someone to provide an interface presenting GPL code as proprietary. What is the difference between compiling against a static library, a dynamic library, or creating an IPC service that allows you to perform all the same functionality of making use of a GPL product? If they all provide the same functionality (with the only difference being the technology under the hood), then they would all equally violate the GPL license. Some would argue that the IPC gets around it because the IPC code that is exporting the GPL product is GPL and thus the derived work is not the full code that uses the IPC, but that really is a dodge of the intent of the GPL.

Consider from https://www.gnu.org/licenses/old-licenses/gpl-2.0-faq.html#MereAggregation

By contrast, pipes, sockets and command-line arguments are communication mechanisms normally used between two separate programs. So when they are used for communication, the modules normally are separate programs. But if the semantics of the communication are intimate enough, exchanging complex internal data structures, that too could be a basis to consider the two parts as combined into a larger program.

This means that regardless of whether you do direct linking, use pipes or sockets, or even skip the direct communication and use JSON communication with serialized objects passed through a CLI (exchanging complex internal data structures via files), if the communication makes the two products an integral piece of the same code then it is considered "linking" which is prohibited. They list anything that ends up in the same memory address as definitely linking, but that was not meant to limit the larger extent.

Thus JNI vs HTTP is really immaterial, as they are just different communication protocols (though JNI usually lives in the same memory space, so it hits the linked clause more clearly). I can in principle make JNI communicate through ZeroMQ and Google protocol buffers, and the user of the library can't tell the difference. Would that somehow make it okay while still using JNI? If @KOLANICH chose Py4J, which uses socket IPC, rather than JPype, which uses direct JNI, but both allow the exact same level of class, field, and method communication with a Java library, would that somehow not violate the spirit of the GPL? If it is a distinction without a difference (both provide full access to Python), then I would say they both violate.

If you wish to allow anything above the system exception then do so explicitly as per the instructions.

https://www.gnu.org/licenses/gpl-faq.en.html#GPLIncompatibleLibs

Simply add an exception to your version stating explicitly what you want it to be able to do. If you don't mind communicating with other open source codes (even with incompatible licenses) so long as code is not being distributed as part of proprietary license, you can add that exception if you are the author. If you just want to propagate the GPL viral license that is also your choice as an author.

Again, my unsolicited opinion: just making it difficult to use your software by forcing pipes or sockets or some other technological implementation (CLI/serialized objects) only raises the burden of using the library, and ultimately invites some other group to replicate your work to avoid the restriction, which fragments the community and reduces the value of your software to the world. I recently replicated the work of another library simply because the author's maintenance cycle was interfering with my own project and his license choice precluded inclusion in an Apache project. Ultimately I am not sure if that helps or hurts the community, but it was the only thing I could do to stop people from being directed to install an old buggy version, which was clearly a negative.

If this library is designed to allow intimate communications via serialized objects then it should just add an exception that allows the same level of communications directly with the restrictions that you wish to impose. That would make your intent clear rather than depending on the vagueness of the generic GPL license. That said relicensing is a bit extreme as adding a simple exception clause would do.


6 participants