Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix conflicting dependencies between upstream JavaModules #2735

Merged
merged 27 commits into from
Sep 14, 2023

Conversation

lihaoyi
Copy link
Member

@lihaoyi lihaoyi commented Sep 11, 2023

Fixes #2732

We introduce a new localCompileClasspath, which differs from compileClasspath in that it leaves out the transitiveCompileClasspath from upstream modules and resolvedIvyDeps from third party dependencies.

Also tried to do some cleanup of JavaModule, refactoring out a transitiveModuleCompileModuleDeps helper to DRY things up, re-arranging things so all the fooModuleDeps helpers are all together.

Tested via a new unit test mill.scalalib.HelloWorldTests.multiModuleClasspaths, which fails on master and passes on this PR. This test case asserts the overall "shape" of the classpaths, which should help rule out an entire class of mis-configuration w.r.t. how the classpaths are wired up

@@ -424,8 +427,7 @@ trait JavaModule
* necessary to run this module's code after compilation
*/
def runClasspath: T[Seq[PathRef]] = T {
localClasspath() ++
upstreamAssemblyClasspath()
resolvedRunIvyDeps().toSeq ++ transitiveLocalClasspath() ++ localClasspath()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We remove upstreamAssemblyClasspath so we can remove the dependency on unmanagedClasspath, since it now also comes from localClasspath, and we want to avoid duplicate classpath entries

*/
def localClasspath: T[Seq[PathRef]] = T {
compileResources() ++ resources() ++ Agg(compile().classes)
localCompileClasspath().toSeq ++ resources() ++ Agg(compile().classes)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change adds unmanagedClasspath to localClasspath. This is probably the right thing to do, since it means it'll probably be aggregated from upstream modules into upstreamAssemblyClasspath, where it wouldn't before

@lihaoyi lihaoyi force-pushed the fix-conflicting-dependencies branch from 5484266 to 4d03f0b Compare September 12, 2023 07:38
Comment on lines 1522 to 1523
// We aggregate upstream module `compileIvyDeps` during compilation
"com/lihaoyi/sourcecode_2.13/0.2.2/sourcecode_2.13-0.2.2.jar",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is correct: should compileIvyDeps be transitive at all? But it is the current behavior

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compileIvyDeps should not be transitive. They only are for local compilation. Downstream consumers need to provide them themselves if needed.

Copy link
Member Author

@lihaoyi lihaoyi Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that I was mistaken, and this was a normal ivyDeps but in a compileModuleDeps.

@lefou it seems you left a comment in here stating that this is necessary. Do you remember the reasoning?

https://github.com/com-lihaoyi/mill/pull/955/files#diff-85ad27993fcc36e23f993baae04e41cb1bfa900502fa0622197d964abca08ddcR103

I'm finding that without propagating ivyDeps from the compileModuleDeps, a whole bunch of Mill's build stops compiling because e.g.

  1. A ends up compiling against module B with ivy dependency C on it's classpath
  2. But if B has a method that returns a type from C, then A will cause the compiler to crash since it needs to have C on the classpath to look up and compile against

It seems to me that this should apply equally for ivyDeps and compileIvyDeps. Currently, a module's compileClasspath only picks up ivyDeps from it's compileModuleDeps, but does not pick up it's compileIvyDeps. This has likely not been a problem so far only because compileIvyDeps is so uncommonly used. This odd edge case in the behavior is evident in the following part of the test case

https://github.com/com-lihaoyi/mill/pull/2735/files#diff-c7649e8d164f66c3708dc32af879350440fab6bf7419d636de0dae55680b4949R1521-R1526

Given that ivyDeps and compileIvyDeps have the same problem, should they be inherited by compileClasspath in the same way?

Copy link
Member

@lefou lefou Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, I asked myself the same question, when I tried to fix this issue. Unfortunately I don't remember exactly. The timing for me to deep dive into this issue is a bit unlucky. I hope I can later review and think about it. (I also pushed my WIP from last night, just to see the CI results.)

I think we definitely want the handle compileIvyDeps and compileModuleDeps equally, as the latter is converted to the former once published, and Mill should behave the same, disregarding the compile-time-dependency comes from the local build or from a coursier dependency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, this isn't super urgent, so I'll leave it open until you have a chance to think about it

Copy link
Member Author

@lihaoyi lihaoyi Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think feel like the best thing to do here is to have all compile dependencies and run dependencies be transitive, but limited to their respective classpaths.

Otherwise, there'll be no end to problems where ModuleA compiling against an upstream ModuleB fails due to ModuleB compiling against ModuleC and exposing it in a type signature. Similarly, for run dependencies, it makes sense that an application that needs e.g. log4j at runtime will continue to need log4j at runtime if used as a library for another application.

It's possible to work around these issues in user land, but it would basically always come down to cherry picking arbitrary dependencies from your upstream modules until things work, or manually traversing the module graph and collecting everything. The former is fragile and the latter is something that it feels like we should automate.

This would make compile{Ivy,Module}Deps diverge from how Maven treats provided scopes (those are non-transitive), but it seems like a better way of doing things

Copy link
Member

@lefou lefou Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are various out-side givens to incorporate. E.g. we consume dependencies via coursier, so we can't change how transitivity is handled by it, hence, we are forced to adapt our inter-module dependencies handling likewise.

Also, Maven-like provided dependencies are useful for compile-time-only APIs, like macros (like acyclic?), which should not leak transitively. Or to ensure to build against some older baseline APIs. You don't want to force those old versions transitively, but you want to force your local compilation to use that old version to ensure, you don't accidentally use newer stuff.

Copy link
Member Author

@lihaoyi lihaoyi Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. We can't change the maven pom scopes, and those have no way to express "transitive compile time dependency". That means that no matter what we do with moduleDeps, there's no way to have a transitive compile time dependency published to maven central

I guess in that case, it sounds like we follow the current status quo, where the compileClasspath picks up the transitive *Deps of compile*Deps, but not the transitive compile*Deps.

@lihaoyi
Copy link
Member Author

lihaoyi commented Sep 14, 2023

I'm going to merge this as is for now: with compileIvyDeps and compileModuleDeps not transitive, but ivyDeps and moduleDeps being transitive, even if depended on from a compileModuleDeps, though in that case only from the compileClasspath.

That should mostly match how provided works in SBT: adding a provided dependency in SBT pulls in all its transitive non-provided dependencies as well. One difference is that this applies to the fullClasspath, because SBT doesn't have a distinction between compile and run classpaths. This should also mostly match the current behavior, at least before the regression described in #2732

Copy link
Member

@lefou lefou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me. It's a bit hard to review in the browser, due to the various other (unrelated) refactings.

I found one potential discrepancy between transitiveCompileClasspath and bspTransitiveCompileClasspath. To former now used localCompileClasspath whereas the latter still uses bspCompileClasspath which should be in sync with compileClasspath.
But if I look closer, bspCompileClasspath is also using localCompileClasspath. It could be right, but hard to tell in the browser.

Also, this PR changes the order of dependencies and puts all elements from ivy sources before others. This is a behavioral change. Was this intentional?

@lefou
Copy link
Member

lefou commented Sep 14, 2023

Just for completeness, I'd like to mention, there is also PR #2736, which also fixes issue #2732 and is more minimal. Just in case we later find this approach too invasive in behavior.

@lihaoyi
Copy link
Member Author

lihaoyi commented Sep 14, 2023

Also, this PR changes the order of dependencies and puts all elements from ivy sources before others. This is a behavioral change. Was this intentional?

Somewhat. The status quo was broken, so of course we'd need to change things, so I while changing things I thought we might as well put things in a consistent order. With this PR, the order is always ivy deps, then upstream modules, then the current module. IIRC the previous orderings weren't as consistent between the various *ClassPaths

@lefou
Copy link
Member

lefou commented Sep 14, 2023

Also, this PR changes the order of dependencies and puts all elements from ivy sources before others. This is a behavioral change. Was this intentional?

Somewhat. The status quo was broken, so of course we'd need to change things, so I while changing things I thought we might as well put things in a consistent order. With this PR, the order is always ivy deps, then upstream modules, then the current module. IIRC the previous orderings weren't as consistent between the various *ClassPaths

Hm, they weren't in order since Mill 0.11.0-M8. Before, we always had local dependencies first. Tbh, I don't care so much, but I have a vague memory of some issue in a customers project, when we changed the order in the generated IDEA projects. We then reverted/reworked the generator and made it consistent with Mill CLI. I have no pointers right now. (Could be something with multiple application.conf, resolved from the classpath, and one upstream dependency also provided one, so the application was never able to read its own local file. I'm only guessing, could also have been some other cause.)

@lihaoyi
Copy link
Member Author

lihaoyi commented Sep 14, 2023

Classpath ordering is definitely a risk of breakage, it happens when a user has multiple different files at the same path on the classpath in different jars or classfile folders.

But I don't think that's something we should guarantee too strongly. Even listing things on the filesystem can vary in order, so if a user deploys jars in a folder and does java -cp * classpath ordering can break things if deployed on different machines or even different folders on the same machine (we've seen bugs like this before at work). So in some ways code that depends on classpath ordering is already broken

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mill doesn't evict older versions of transitive dependencies from moduleDeps
2 participants