Multiple scala version builds within the same bazel run #962
Comments
This is a great issue! I won't have guaranteed time to read through and respond to this until the end of this week or early next week.
My first goal is always to help the community: if pushing for toolchains doesn't help actual users of rules_scala, it's the wrong approach and I won't do it. I want to help you write the most useful rules you can.

One approach we initially used internally for Java rules was to embed language/runtime level as constraints and create platforms that contain those. This has turned out to be the wrong choice: there are too many overlapping platforms and it's too confusing, so I'm working on how to migrate the system to a place where there are separate language and runtime flags. I think what will happen is that the ...

This is pretty long-winded, and I don't know if it's useful to rules_scala. My initial thought would be that a split transition would be the right way to have a single target that generates multiple versions, but I don't know how that would work through the process of depending on libraries.

Unfortunately I don't follow enough through this issue (or through PR #964) to understand exactly what you're proposing to implement. I do want to support the choices that you are making as rules authors and make it as easy as possible.
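For readers unfamiliar with the constraint-based approach described in the comment above, here is a minimal sketch of what "language level as a constraint baked into platforms" can look like in Bazel. The constraint and platform names are hypothetical; nothing here is an actual rules_scala or Bazel-shipped target.

```python
# Hypothetical BUILD sketch of embedding the Scala language level as a
# constraint and baking it into platforms. All names are illustrative.

constraint_setting(name = "scala_language_level")

constraint_value(
    name = "scala_2_11",
    constraint_setting = ":scala_language_level",
)

constraint_value(
    name = "scala_2_12",
    constraint_setting = ":scala_language_level",
)

# Each platform then carries the language level alongside the usual OS/CPU
# constraints -- which is what produces the "too many overlapping platforms"
# problem described above.
platform(
    name = "linux_x86_64_scala_2_12",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:x86_64",
        ":scala_2_12",
    ],
)
```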
Typing as we speak :-)
Re rules_jvm_external: are you planning to make it part of the contract or an implementation choice of rules_scala? Right now rules_scala uses bind (and I'd like us to move to toolchains) to allow indirection and independence for users to specify the needed dependencies. We (Wix) have our own mechanism which has significant benefits compared to bazel-deps and rules_jvm_external for our use case. We need to be able to keep using it.

The main reason I'm concerned is that, as opposed to "regular" features, this one is looking at the core of dependencies and rules_scala. I'll take a look later at your PR but if you can comment that could really help quell my concerns.
Thanks, @katre. I for one (and I'm sure there are others) appreciate the help. I know there are significant gaps still in my understanding of how things fit together best. If you have time to comment: there's more description of the user-level use case in the README.

AFAICT, for our use case we need to create what are in essence parallel subgraphs. At this point I think we need to use macros to wrap rules/targets, e.g.,
Not sure if this is what you're thinking, but in the in-progress POC, I look at the arguments to the macro and use them to synthesize the label of the toolchain to pass to the rule to create each target. I'm okay with this, though I don't like the fact that it is an end-around toolchain resolution: it's based on label string manipulation of toolchains, which avoids the type-based resolution in toolchains.

One alternative is to create toolchain types for each version and then create different rules for every toolchain type. This is still label string manipulation, but it's at the type level and we don't expect/support changing toolchain types. That would be less restrictive than not letting users define their own toolchains of the standard toolchain types. It requires different toolchain types for each version but could be resolved with toolchain resolution. Even without version-specific toolchains, we can support user-defined toolchains; we just can't provide multiscala support out of the box in that case. It's a limitation but it's a corner case. Users could still make things work, it would just require more work on their part, e.g., adding their own macros. I think this falls under "make easy things simple and hard things possible."

I'm pretty sure that both approaches (single toolchain types and version-specific toolchain types) will work, but I'm not sure which feels more in the spirit of the design. In some ways multiple toolchain types, e.g., per-version scalac-2.11 and scalac-2.12 types, feel wrong given what a scalac type should represent: in almost every case, there's going to be a single toolchain for each version. On the other hand, while it's not terribly common, there are different scalac compilers for the same version, so in that corner case there are potentially multiple implementations of a scalac-2.11 type. On the good side, since all this is behind a supplied macro, we could change the approach without users needing to change build files.

I've tried to work out how transitions could be used and I haven't gotten too far ... but my intuition about transitions is pretty weak. Since we need
IIUC, I looked at this to start and didn't make much progress, but again, we're at the edge of my understanding when we talk about transitions. This is partially because I don't think we want a target that depends out to multiple versions; we want parallel disconnected subgraphs. They only "meet" via target wildcards, e.g., a command-line ":all".

I think I win the long-winded contest. I appreciate anyone reading through all of this. I wouldn't be surprised if it's confusing, so let me know if I can answer questions. But also don't feel compelled to school me ... Cheers!
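To make the label-synthesis idea above concrete, here is a rough Starlark sketch of such a macro. The loaded rule, its toolchain attribute, and the toolchain label pattern are all made up for illustration; they are not rules_scala APIs, and this is exactly the label-string manipulation the comment is uneasy about.

```python
# multi_scala.bzl -- hypothetical sketch only.
load(":scala_library_private.bzl", "scala_library_rule")  # hypothetical rule

def multi_scala_library(name, scala_versions, **kwargs):
    """Expands one macro call into one target per requested Scala version."""
    for version in scala_versions:
        # "2.11" -> "2_11", "2.12" -> "2_12", ...
        suffix = version.replace(".", "_")
        scala_library_rule(
            name = "%s_%s" % (name, suffix),
            # Synthesize the per-version toolchain label from the version
            # string, sidestepping type-based toolchain resolution.
            scala_toolchain = "//scala/toolchains:scalac_%s" % suffix,
            **kwargs
        )
```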
I mainly added that text to spur this discussion :-) I figure I don't really understand the space of use cases. I need to learn (though it feels like a big space ...)
It's absolutely not a blocker. We can make things work. It takes more effort, which is fine. I just wanted to be sure it was a requirement. Now I know. :-) I use
This part of the WIP is not meant to be definitive.
I think we definitely need to get rid of ...

But that does just kick the can down the road: how does the toolchain depend on the jars? This is a subcategory of the overall dependence issue, i.e.,
These don't necessarily have to be the same. I can imagine that the latter is harder to constrain than the former. Do you all need to control how the toolchain-rooted dependencies are loaded? I'm just wondering. I suspect if we need to support multiple loading mechanisms for user code, we might as well for compiler code, too. The core issue is how we enable a macro to do version-specific dependency label manipulation, e.g., how do I depend on
I have two observations, but this remains a WIP:
An alternative to the second choice is to use ...
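One way to read the "how does the toolchain depend on the jars" question above is to give the toolchain ordinary label attributes for its compiler classpath, so whatever mechanism a user prefers (rules_jvm_external, bazel-deps, or an in-house one) can supply the jars without bind. A minimal sketch, with entirely hypothetical names, assuming nothing about rules_scala's actual toolchain rule:

```python
# Hypothetical sketch only: a toolchain that owns its compiler jars as
# ordinary label attributes instead of reaching them through bind().

def _scala_toolchain_impl(ctx):
    return [platform_common.ToolchainInfo(
        scala_version = ctx.attr.scala_version,
        compiler_classpath = ctx.files.compiler_classpath,
    )]

scala_toolchain = rule(
    implementation = _scala_toolchain_impl,
    attrs = {
        "scala_version": attr.string(mandatory = True),
        # Whoever defines the toolchain decides where these jars come from
        # (rules_jvm_external, bazel-deps, an in-house mechanism, ...).
        "compiler_classpath": attr.label_list(allow_files = True),
    },
)
```

A BUILD file would then wrap an instance of this rule in a `toolchain()` declaration against a `toolchain_type`, registered via `register_toolchains()`, one per Scala version or one overall depending on which toolchain-type design above wins out.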
I fleshed out more WIP thoughts on repositories in https://github.com/bazelbuild/rules_scala/blob/ccb535e21bec3fd26e1d83a718b1544f06057234/unstable/multiscala/ExternalRepositories.md
Hey @smparkes. Or from Wix, trying to catch up on the amazing work you did here. So I just did a very shallow reading and I also cloned your branch and looked at the diffs. I hope I understood sufficiently.

In general - I understand why you want to remove the ...

I see all your points regarding how to translate external scala deps to the right version. I need to keep thinking about it... I had a lot of naive thoughts but it seems like your insights are more advanced at this point.

Also - I wonder how this would behave in the IDE (with single / multiple scala versions). Will do some research and hopefully post some more questions / insights throughout the weekend. Cheers! 🌷
👋
Agreed. I actually started working on an independent PR to factor out/cleanup the stuff related to toolchains. I can add the binding into that easily. I'll have a draft of that this morning.
I haven't explored this. It's a good question. We use IJ but I haven't looked at how IJ hooks into bazel to know what this affects.
Thanks!
FWIW, ran into this today. This is a little bit challenging for multiscala but I'm pretty sure it's feasible. I understand the concept of ... I'm not sure how the plugin works. I assume some combination of parsing build files and using the bazel query command ...
There are, basically, two approaches here. One is to use macros (or clever rules) to take the single target graph and split it into the parallel subgraphs, one per version. The other option is to use starlark configuration with a split transition:

```
scala_library(
    name = "top",
    scala_versions = [ "1.11", "1.12", "2.0", ],
    srcs = ...
    deps = [ ":dep_a", ...],
)

scala_library(name = "dep_a", ...)
```

With the split transition, "top", "dep_a" (and other dependencies) can be built multiple times, once per scala_version, with the graphs automatically fanning out as needed. This can be confusing with queries, however, since the targets will be shown multiple times with different configs. Also, this will cause deeper non-scala targets to be duplicated (although actions will still be cached), until we can get configuration trimming to work.

Ultimately, which approach to take depends on the needs of your users. We're happy to consult and assist whichever way you want to go.
Thanks for the comments.
> The other option is to use starlark configuration with a split transition:
>
>     scala_library(
>         name = "top",
>         scala_versions = [ "1.11", "1.12", "2.0", ],
>         srcs = ...
>         deps = [ ":dep_a", ...],
>     )
>
>     scala_library(name = "dep_a", ...)
>
> With the split transition, "top", "dep_a" (and other dependencies) can be built multiple times, once per scala_version, with the graphs automatically fanning out as needed.
I had looked a bit at this but got blocked ... but then I haven't used transitions so I may not totally understand. We are talking [User Defined Transitions](https://docs.bazel.build/versions/master/skylark/config.html#user-defined-transitions)? If so:

With respect to "top", this is an incoming edge transition, so 1:1? And not configurable from the scala_versions attr?

With respect to both, if you use transitions here, how are the targets named? Is that fully implemented? I saw some writing about issues around output path naming and build settings, but that may have been resolved or I may just not have understood it ...

Appreciate the help!
You can do self-transitions as well as dependency transitions. So it looks like:

top (target config) ...
The targets will keep the same name, but outputs end up in separate output roots under blaze-out, one per configuration.
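For concreteness, here is one way the fan-out described above could be wired with user-defined Starlark transitions. This is a sketch under several assumptions: `//scala:version` is an imagined string build setting, the wrapper-rule shape (a split on a dependency attribute rather than a self-transition) is only one possible encoding of the idea, and the transition allowlist wiring varies by Bazel version.

```python
# multi_version.bzl -- hypothetical sketch of a Scala-version split transition.

def _scala_version_split_impl(settings, attr):
    # Fan out into one configuration per requested Scala version.
    return {
        version: {"//scala:version": version}
        for version in attr.scala_versions
    }

scala_version_split = transition(
    implementation = _scala_version_split_impl,
    inputs = [],
    outputs = ["//scala:version"],
)

def _multi_version_impl(ctx):
    # ctx.split_attr.target maps each split key (a Scala version) to the
    # target built in that configuration.
    files = []
    for version, dep in ctx.split_attr.target.items():
        files.extend(dep[DefaultInfo].files.to_list())
    return [DefaultInfo(files = depset(files))]

multi_version = rule(
    implementation = _multi_version_impl,
    attrs = {
        "scala_versions": attr.string_list(),
        # The split transition applies on this edge, so the same dependency
        # (and its transitive graph) is built once per version.
        "target": attr.label(cfg = scala_version_split),
        "_allowlist_function_transition": attr.label(
            default = "@bazel_tools//tools/allowlists/function_transition_allowlist",
        ),
    },
)
```

As the comment above notes, each of those builds lands in its own output root, so target names stay the same while outputs are separated per configuration.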
Hi @smparkes,
Understood. Thank you.
I'm pretty interested in seeing how a solution using transitions would work. It seems like it'd be a cutting-edge solution, and consequently something worth exploring but not necessarily guaranteed mergeable to rules_scala. That being said, I'd be happy to review these efforts and follow along during development.

For context, supporting multiple versions of Scala is not a priority for Stripe. It is a nice-to-have, as it would potentially help future Scala migrations. I'd be participating out of my own interest, simply because I believe this feature makes rules_scala more appealing for building large Scala projects.
cc @katre and others: My original idea on how the toolchains attr worked was wrong. Followed up in https://groups.google.com/d/msg/bazel-discuss/aDABPYoywWs/glGBA5XSAwAJ
Thanks @smparkes for the extensive deep dive into the problem and implementation. I want to share a few items I found essential for us here at Wix:
Thank you so much for working on this. Is there some way in which I can be of help without overlapping with existing work? |
At this point, for us, this is on hold. We've had to switch to other efforts for the time being. So for now, I have no pointers.

For what it's worth, for the short term, we've gone the direction that others have mentioned of using a symlink to configure the bazel build (primarily ...). We'll re-evaluate when we get more time.

It's unclear to us right now if we can allocate the time to do a clean approach that can accommodate the complexity of all the variations out there and, if we can't, that becomes a blocker to a merge. If we don't think we can make something acceptable for merging and as a result would have to maintain a fork, the approach likely changes, because the priority changes from something that is clean to something that is least likely to incur merge conflicts going forward.
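For anyone curious what the symlink workaround can look like in practice, here is a purely hypothetical illustration: a version-selection .bzl file that is really a symlink, repointed outside of Bazel before the build runs. The file names, variable names, and versions are made up and not taken from the comment above.

```python
# scala_version.bzl -- in this hypothetical setup, this file is a symlink to
# one of scala_version_2_11.bzl / scala_version_2_12.bzl, switched by a
# wrapper script before `bazel build` is invoked.

SCALA_VERSION = "2.12.11"
SCALA_SUFFIX = "2_12"

# Maven coordinates that the rest of the WORKSPACE/BUILD files derive from
# the selected version.
SCALA_COMPILER_ARTIFACTS = [
    "org.scala-lang:scala-compiler:" + SCALA_VERSION,
    "org.scala-lang:scala-library:" + SCALA_VERSION,
    "org.scala-lang:scala-reflect:" + SCALA_VERSION,
]
```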
At this point, there's not enough need/support for this (including myself) to continue forward. |
I'm sorry we were unable to converge here and I want to thank you for all your effort. I hope that we will tackle this issue when the bazel infra is much more friendly to it.
No worries at all. Hope my wording didn't sound negative. Didn't mean it to be. I agree with the outcome at this point in time.
👍🏽 Your wording had a minor effect (if any). It's mostly my desire to have this rule set be as inviting as possible, and when an effort doesn't converge I feel a bit bad about it.
PRs
There are a number of PRs in flight. Because they interact, they're sometimes based on one another, so a diff against master will show all changes up to that point. That means it's probably easiest to review them in order.
Core PRs: these are cleanup/refactoring PRs for core that are blockers for multiscala
Multiscala PRs: these actually implement multiscala support. They generally require the PRs above.
tl;dr
I'm creating a new issue for this.
Many exist and I'm happy to close this in favor of another but there are (see below) so many with some amount of overlap and various comment history, personally I prefer to start with a clean slate.
I'm actively working on this.
I'd like to get buy-in early that we want to do this and on my current approach. If there is buy-in, I'd like to get feedback as I go. There are lots of choices and I'd love to get feedback/reviews early so there's not a lot of rework.
If there's not sufficient consensus, I'm happy to close this (we don't need yet another issue) and I'll do the work independently.
Requirements:
Stretch goals:
A `scala_library` call will, by default, build (suffixed) targets (optional alias for the default scala version)

Key implementation choices
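As one concrete (and entirely hypothetical) illustration of the suffixed-target stretch goal above, a user-facing BUILD file might look like the following; the macro name, load path, generated target names, and default alias are all made up for this sketch.

```python
# BUILD -- hypothetical user-facing view of the suffixed-target idea.
load("//scala:multi_scala.bzl", "multi_scala_library")  # hypothetical macro

multi_scala_library(
    name = "core",
    srcs = glob(["*.scala"]),
    scala_versions = ["2.11", "2.12"],
)

# The macro would generate, for example:
#   //lib:core_2_11   built against the registered Scala 2.11 toolchain
#   //lib:core_2_12   built against the registered Scala 2.12 toolchain
#   //lib:core        alias for the default Scala version
```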
Dev Approach:
Example:
unstable/multiscala/private/example
Related Issues
The hope is that these issues will be addressed, completely or in large part, by this work.
#14: Support different versions of scala
#80: Cross building
#147: repeating downloads of scala library
#173: How to make rules_scala use scala-2.10?
#250: enable_dependency_analyzer attr is a feature toggle that needs to be removed at an opportune time
#252: use new inliner flag in 2.12.3
#263: Reasonable scala
#275: scalatest 3.x.x support?
#302: 'io_bazel_rules_scala/dependency/thrift/scrooge_core' not declared in package 'external'
#393: Project with multiple scala versions
#400: create a scala_test toolchain
#401: create a specs2 toolchain
#402: create a scrooge toolchain
#403: create a scala protobuf toolchain
#432: Insulate toolchain options from breaking rules_scala bootstrap code
#546: scala_maven_import_external and scala_version
#679: Dependency checkers do not detect scala macro dependencies
#703: Default Scala version should be 2.12 (instead of 2.11)
#713: Upstream toolchain fixes mean we can remove some crud around the scalac provider
#768: no matching toolchains found for types @io_bazel_rules_scala//scala_proto:toolchain_type
#809: Support Scala 2.13
#940: Tracking issue- move from deps to toolchains
Status
I've done a good portion of this work once but wasn't happy with the result. Among other things, it wasn't sufficiently "opt-in". I'm planning on taking a more incremental approach and focusing on making all the extensions opt-in to start. I also want to get feedback on choices early on.
FAQ
Why not wait/use bazel platforms/toolchains/transitions?
The fundamental toolchain work is entirely usable. There may be some issues with toolchain resolution but they can be worked around.
The bazel concepts that potentially overlap with this are platforms and transitions. There are at least a few broad issues: time, fit, and functionality.
Time: That work is not complete and looks to be pretty early, still. It's been on the roadmap for a while but there are lots of open issues. For example, AFAICT, they haven't resolved how file paths are going to be handled. I suspect this work is going to take a fair amount of time and my team can't wait. The current approach should not conflict with any of that work. Migration to that work when it's complete remains to be clarified (next).
Fit: It's not really possible to judge how that work is going to fit this use case since so much is yet to be defined. Most of the relevant work in that effort is around different target platforms but scala versions are not exactly different target platforms: from a bazel perspective, our platform is the jvm, not the scala runtime. I haven't seen any examples extending platforms in bazel that match the scala use case.
Functionality: it's not actually clear how we would use this support for scala versions. Transitions in bazel are applied at the rule level, so it's unclear how we would use them when we generally want to vary the version at the target level. It's also not clear how the scala pattern of version suffixes in artifact names would map onto the work going on in bazel.
Why rules_jvm_external?

This is obviously debatable. We use rules_jvm_external extensively, including artifact pinning. This has been very useful in tracking common dependency versions and noting conflicts. I'm sort of hoping we can standardize on this but I'm not sure what open issues may exist.