Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flip --experimental_sibling_repository_layout to true #23576

Open
lberki opened this issue Sep 10, 2024 · 35 comments
Open

Flip --experimental_sibling_repository_layout to true #23576

lberki opened this issue Sep 10, 2024 · 35 comments
Assignees
Labels
incompatible-change Incompatible/breaking change P2 We'll consider working on this in future. (Assignee optional) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file.

Comments

@lberki
Copy link
Contributor

lberki commented Sep 10, 2024

Motivation

To make the location of a file in the execroot the same regardless of whether it's in an external repository or the main one.

Description

Before the change, the source in the external repositories are located in the execroot at $EXECROOT/<main repo>/external/<external repo>/<repo-relative path>. This change turns that location into $EXECROOT/<external repo>/<repo-relative-path>.

Similarly, before the change, output files are located at $EXECROOT/<main repo>/bazel-out/<config>/bin/external/<external repo>/<repo-relative path> and this flag flip will turn that into $EXECROOT/<main repo>/bazel-out/<external repo>/<config>/bin/<repo-relative path>.

Incompatible Flag

--experimental_sibling_repository_layout

Migration Guide

This should mostly be a no-op change since everyone should refer to files using file.path in Starlark, $(location <label) in genrules and getting their names from the BEP when outside of Bazel.

If there are hard-coded paths in any of these places, they will need to be updated: preferably, to use one of these sanctioned approaches to learn the path or else hard-coding the new one, if that is not possible for some reason.

In which Bazel LTS version will this incompatible change be enabled?

Bazel 8

Additional Context

No response

TODO List

No response

@lberki lberki added untriaged incompatible-change Incompatible/breaking change labels Sep 10, 2024
@meteorcloudy meteorcloudy added P1 I'll work on this now. (Assignee required) breaking-change-8.0 team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. and removed untriaged labels Sep 10, 2024
@meteorcloudy meteorcloudy changed the title Flip --experimental_sibling_repository_layout Flip --experimental_sibling_repository_layout to true Sep 10, 2024
@fmeum
Copy link
Collaborator

fmeum commented Sep 11, 2024

While I would really like to see this flipped, I think that we are not yet ready to do so:

  1. We lack a clear migration path for include directories in cc_* rules. It is very common for folks to hardcode -Iexternal/my_repo copts due to the lack of a local_includes attribute. I think that this needs to be addressed in a reusable way that projects can adopt while still remaining compatible with older Bazel versions.
  2. With the current sibling layout, the output directory part of exec paths will still have three segments for main repo files, but four segments for files in external repos, which introduces a new difference and breaks the long-standing assumption that you can strip the first three segments, starting with bazel-out, to get the root-relative path. This is used in wrappers and tools that need to compute relative paths between generated files and source files.
  3. The namespace collision between the config segment ("mnemonic") and external repo names makes it very difficult to reliably parse the repo name out of the exec path, which is something that a variety of tools and libraries (think runfiles libraries) need to do. This would become simpler after WORKSPACE is gone as we could somewhat recognize repo names by containing a +, which we could prohibit in config segments. Currently, this is the simplest regex for extracting repo name and root-relative path I could come up with - it's pretty hacky and relies on implementation details (mnemonics always contain a -, output directory basenames do not unless they are coverage-metadata) and lookahead:
(?:bazel|blaze)-out/(?:(?<repo>[^/]+(?=/[^/]+-[^/]+/)(?!/coverage-metadata/))/)?[^/]+/[^/]+/(?<path>.+)")
  1. We already expect users to finally fully migrate to Bzlmod for Bazel 8, which entails a number of changes to path layouts, and may be overstretching the complexity budget. However, this argument can also be turned around: If we provide proper solutions to the first three problems, we could ask users to migrate to supported abstractions over hardcoded paths and have that deal with two migrations at once.

@lberki
Copy link
Contributor Author

lberki commented Sep 11, 2024

@fmeum yes, I'm aware of all these downsides, especially the impending migration to bzlmod. Addressing these concerns in detail:

  1. Yes, this is in fact a problem, but I see no way of providing a transition period where the files are available at both locations without either doubling the number of input files to actions (by providing them at both places) or teaching all the ways actions can be executed about sibling repository layout (and that includes all RBE implementations). So yes, it's painful, but it's a band-aid that should eventually be ripped off (unless, of course, we decide that it should never be done...)

  2. Do you have evidence that the number of path segments is actually used? I would not be very surprised, but I was hoping that the use of $(location) and friends is prevalent enough that this would not be a big problem and our downstream projects pipeline seems to agree.

  3. My game plan for this was that we'd move the repository-specific output directories to ../<external repo>/bazel-out/<config>/bin eventually. I do realize that it's yet another incompatible change, but I thought that the value of reclaiming the external directory warrants it (the implementation of this would also not be simple, but that's our problem)

  4. Yeah, I could argue both ways so I count this as an argument neither for nor against :)

@meteorcloudy mind weighing in?

@fmeum
Copy link
Collaborator

fmeum commented Sep 11, 2024

  1. Yeah, we definitely shouldn't stage files in two locations. I think that we just need a well-designed replacement for hardcoded includes paths. Whether that's a new local_includes attribute, support for $(execdir ...) or $(dirname ...) expansions or just a macro in rules_cc that conditionally returns -I., -Iexternal/foobar or -I../foobar for the current package or a given label doesn't matter. There is still time to do this for Bazel 7.4.0 and Bazel 8, but I'm convinced we need to deliver some official and documented solution in coordination with C++ rules owners. Bonus points if it also works with Bazel 6 and earlier.
  2. I know that I have written a number of ctx.bin_dir.path.count("/") already, but only because I knew that --experimental_sibling_repository_layout exists. For someone who doesn't know this, it may appear to just be a constant. I don't have concrete evidence for this though.
  3. But what's the plan for the meantime? Whether a given runfiles tree uses the new or old layout isn't encoded anywhere in .runfiles, so runfiles libraries would need to support the legacy layout, the current version of the sibling layout as well as the future version, all just based on regexes, e.g. in
    local -r repository=$(echo "$normalized_caller_path" | __runfiles_maybe_grep -E -o '(^|/)(bazel-out/[^/]+/bin|bazel-bin)/external/[^/]+/' | tail -1 | awk -F/ '{print $(NF-1)}')

    and
    https://github.com/bazelbuild/rules_go/blob/1fa9a54177de7ad59fc5e39e3fd52a3340a82fa2/go/runfiles/global.go#L53
    At this point I'm not even sure whether that's possible unambiguously and I'm worried of the complexity leading to e.g. top-level bin or coverage-metadata directories in projects ending up causing weird runfiles issues.
  4. I may even consider it a plus if not for the "eventually we will migrate the sibling repo layout to some v2 with consistent output dirs" part ;-)

What I wonder about in general: How important is reclaiming the external directory going to be for users? Bzlmod brings a huge improvement in maintainability and thus justifies a larger migration, but wouldn't external being available as a top-level directory mostly matter for large OSS projects that want to adopt Bazel but can't due to existing directory structures? Most companies would probably just rename the directory when adopting Bazel.

@tjgq
Copy link
Contributor

tjgq commented Sep 11, 2024

Would #23518 completely address (1), or is there something else still missing?

@fmeum
Copy link
Collaborator

fmeum commented Sep 11, 2024

Would #23518 completely address (1), or is there something else still missing?

Not in its current form as it doesn't work for labels that produce multiple files. We may also need a way to get the relevant path for the current package without an explicit dependency on something (as required by location expansion).

There are a few related issues files on the Bazel repo that ideally would be addressed holistically.

@meteorcloudy
Copy link
Member

@fmeum Thanks for the comprehensive analysis, it wasn't clear to me that all the breakages it would cause (most worried about breaking the runfiles library).

What I wonder about in general: How important is reclaiming the external directory going to be for users?

I think this is a very important question to answer. @lberki IIRC, there is currently no large project being blocked on this? At this point, I feel the cost of this change might significantly outweigh the benefits.

If we do want to do it, we should probably come up with a better migration plan for all mentioned breakages, I'm not sure if we have enough time before Bazel 8 cut.

@fmeum
Copy link
Collaborator

fmeum commented Sep 11, 2024

A benefit could be that it simplifies the --legacy_external_runfiles flip, but that flip may be possible independently, even for Bazel 8.

@meteorcloudy
Copy link
Member

And is there any benefit from flipping --legacy_external_runfiles without flipping --experimental_sibling_repository_layout? It cannot free up the "external" directory alone, right.

@fmeum
Copy link
Collaborator

fmeum commented Sep 11, 2024

It can drastically cut down the number of symlinks in the sandbox, which is a performance win.

@lberki
Copy link
Contributor Author

lberki commented Sep 12, 2024

After discussing it with @meteorcloudy, I now think it's best to give up on this flag.

I do realize this is reminiscent of the discussion we had on #20500, where we came to the conclusion that it's best to flip the flag, but then, we thought that making the downstream projects pipeline green is good enough, which is in theory the case, but we can't ignore all the incompatibilities this change would cause that we don't know about :(

In addition, the new layout has shortcomings as @fmeum discussed above (there are good reasons why those shortcomings exist but they are shortcomings nevertheless) and those can only be solved by changing the layout, which would amount to an almost complete rewrite of the flag.

Further, whereas the fact that one cannot have a directory called external/ in Bazel repositories is very embarrassing, I don't know anything that hurts a lot because of this limitation, so it's not that the high costs associated with the flag flip would bring a lot of benefit (other than removing an embarrassment).

So for this flag, I propose that we give up on it, make it a no-op, then, if we do find a good reason to try again, we do so with a better layout (probably ../<external repo>/blaze-out/<config>/bin)

However, if we accept that this flag is best abandoned, it also calls into question the reason for --legacy_external_runfiles: for that, I propose that the right approach is to abandon the new locations instead of the old ones so that the layout of the execroot and the runfiles tree is consisten. Bazel cannot do that at the moment but it's trivial to teach it to. @meteorcloudy kindly agreed to run an experiment as to how harmful that would be.

I won't commit to this plan right now to give some room for protests, but given that the cost of flipping the flag is high (if done right) and that it's not perfect, it's hard to imagine an argument...

@meteorcloudy
Copy link
Member

meteorcloudy commented Sep 12, 2024

I propose that the right approach is to abandon the new locations instead of the old ones so that the layout of the execroot and the runfiles tree is consisten.

In case the new layout is already widely used (I do suspect so), I wonder if we can keep both layouts and reduce symlinks by doing:

bin.runfiles/
├── _main
│   ├── external
│   │   ├── repo1 -> ../../repo1
│   │   └── repo2 -> ../../repo2
│   └── pkg
│       └── bin
├── repo1
│   └── pkg
│       └── file
└── repo2
    └── pkg
        └── file

or

$ tree bin.runfiles/
bin.runfiles/
├── _main
│   ├── external
│   │   ├── repo1
│   │   │   └── pkg
│   │   │       └── file
│   │   ├── repo2
│   │   │   └── pkg
│   │   │       └── file
│   │   └── repo3
│   │       └── file
│   └── pkg
│       └── bin
├── repo1 -> _main/external/repo1
├── repo2 -> _main/external/repo2
└── repo3 -> _main/external/repo3

This will hopefully reduce the number of runfiles but keep things compatible.

@fmeum
Copy link
Collaborator

fmeum commented Sep 12, 2024

I propose that the right approach is to abandon the new locations instead of the old ones so that the layout of the execroot and the runfiles tree is consisten.

I don't think that consistency between execroot and runfiles tree matters much to users. In fact, I actually consider seeming consistency to be dangerous: runfiles paths would only look like exec paths for source files, but would still deviate for generated files. With --nolegacy_external_runfiles, it's at least obvious that whatever works for source files will also work for generated files, which provides some kind of "type safety".

The non-external runfiles layout is also much nicer to work with in isolation as ../<other_repo> gets you from any repo to any other. This is more relevant for runfiles, which are often used to run non-Bazelized tools that care about the exact layout of files rather than receiving their paths as arguments.

Even with --experimental_sibling_repository_layout abandoned in its current form, I think that flipping --legacy_external_runfiles to False is still the best option for runfiles and also results in a mostly manageable migration. It also makes sense to have this migration in parallel with the one to Bzlmod as there already are useful abstrations ($(rlocationpath ...), runfiles libraries) that you can migrate to that help with both.

@lberki
Copy link
Contributor Author

lberki commented Sep 12, 2024

@fmeum I could be convinced either way (I'm not sure if the old or new locations are hard-wired more frequently), but I thought you specifically warned against flipping one flag but not the other at #23574 (comment) ?

I think the non-external layout is by far nicer to work with (just like for repositories), the question is just how much cost the flag flip incurs vs. how much nicer it is.

@fmeum
Copy link
Collaborator

fmeum commented Sep 12, 2024

I just wanted to call out additional breakage points. But these are already weak points (e.g. a generated Java runtime won't be found in runfiles with $(JAVA) even today). We could just add a new variable with the runfiles path (preferably even rlocationpath), add this to the migration notes and call it a day.

Just based on my experience, I would consider hardcoding of the new runfiles layout to be far more prevalent than of the old layout.

Many folks have the flag flipped already to speed up their large builds. It's mostly newcomers that pay the performance cost at this point, so I'm definitely in favor of flipping it.

@lberki
Copy link
Contributor Author

lberki commented Sep 12, 2024

Ack, I thought that you think it's an especially bad breakage. I don't have data as to which approach would be less breaking, so I propose that we take a look at the Bazel downstream pipeline and make a decision based on the number of breakages there. @meteorcloudy WDYT?

@peakschris
Copy link

My reading is that if we did this flag flip some paths would become shorter, not longer. Just mentioning this as a consideration; on windows we have significant difficulties with tools unable to process files inside paths that are too long. Sad that these tools exist, but true. For us, these issues impact us more in the main workspace rather than external repos, but we certainly should try not to make paths longer.

@meteorcloudy
Copy link
Member

For us, these issues impact us more in the main workspace rather than external repos, but we certainly should try not to make paths longer.

Totally understand that every character on Windows is very precious. But I think that mostly apply to the MSVC compiler and you can find workaround for most cases? As @fmeum pointed out in previous discussions, the new sibling layout has a few drawbacks that are hard to deal with, not mention the migration cost. So I don't think this is a strong argument to push --experimental_sibling_repository_layout through.

@yuzhy8701
Copy link
Contributor

One example that this flag is still useful is the "android-emulator-next" project (link), where we switched from cmake to bazel and rely on this flag to free up the //external/... namespace.

Generally speaking, this is the only benefit of the flag but an important one for projects where an external directory is already there and contains the majority of the codebase.

If this flag has to die, I would suggest providing some other means to access the external directory, preferably not having to declare everything as new_local_repositorys?

@meteorcloudy
Copy link
Member

@lberki How easy would it be to provide an option to rename the "external" directory to something like "bazel-external" or "_bazel_external"?

@lberki
Copy link
Contributor Author

lberki commented Sep 13, 2024

@meteorcloudy annoying, but doable. The main problem with that is that it would necessarily have to be the decision made in the main repository, but one that would affect every other: say, if the main repository set external to be called fluffykitten, every project would have to say -Ifluffykitten/project/include instead of -Iexternal/project/include.

Sure, they should not hard-wire the string external in the first place, but if people did that, the migration to sibling repositories would be much easier.

@yuzhy8701
Copy link
Contributor

yuzhy8701 commented Sep 13, 2024

Sure, they should not hard-wire the string external in the first place, but if people did that, the migration to sibling repositories would be much easier.

One major reason why sibling repository breaks rules and builds is that, people rely on the documented directory layout when writing rules (e.g. extract a certain path fragment, assemble a path to a file, etc), and sibling repository changed that with per-repo bazel-bin, etc. The changes needed to make them work again with sibling repository is sometimes non-trivial (e.g. da9ef5b).

Simply changing the directory name will in a lot of cases not breaking that assumption unless they hardcode external - which is easy to fix by just s/external/bazel_external_dirname()/g (assume that builtin API returns the external directory name). For cases like "-Iexternal/project/include" you just do "-I{}/project/include".format(bazel_external_dirname()) instead, which I suppose is a trivial fix?

Actually, you can even add a flag to set the name of that directory as user needs it - no conflicts ever again.

@lberki any thoughts?

@lberki
Copy link
Contributor Author

lberki commented Sep 16, 2024

@yuzhy8701 this is regrettable because this is the exact reason why I was originally so keen on flipping this flag: I couldn't have imagined that no one has a project that requires using the external/ subdirectory :(

I don't think it's good idea to provide two mechanisms to piece together Bazel workspaces from multiple repositories because there is just too much room for adverse interaction between them and too much extra user interface.

FWIW, a large part of the problem with sibling repository layout is that people do hardcode the external/ path segment and it's a word that's way too common to just simply search-and-replace for it. You'd also need to do this in your transitive dependencies...

So I'm afraid I don't have a very good alternative to you; in fact, --experimental_sibling_repository_layout originally came from the desire to support files under that directory.

@meteorcloudy @fmeum maybe you have a genius idea?

@fmeum
Copy link
Collaborator

fmeum commented Sep 16, 2024

No genius idea, but if we want to eventually support projects with an external directory, why don't we do a survey of the places in which users feel the need to hardcore the string? Then we can work on abstractions that resolve these use cases in other ways and eventually rename the directory or take another stab at consolidating the layout.

@lberki
Copy link
Contributor Author

lberki commented Sep 16, 2024

My model of the state of the world is that there aren't many reasons to harcode these things, people just do. I admit, though, that this is not based on any sort of data, it's only a suspicion.

@fmeum
Copy link
Collaborator

fmeum commented Sep 16, 2024

How about starting a discussion where we collect these cases? C++ rules are definitely a case where we need to offer a convenient alternative, but there could be more.

@peakschris
Copy link

Actually, you can even add a flag to set the name of that directory as user needs it - no conflicts ever again

I like this idea. It seems flexible. With this flag, the experimental_sibling_repository_layout flag could be left unflipped and removed. Is there ever a reason to support sibling layout if the 'external' name can be changed to avoid conflict?

@lberki
Copy link
Contributor Author

lberki commented Sep 16, 2024

@peakschris see #23576 (comment) -- this is doable, but then every repository in the build would need to know what value the top-level project sets this string to. And if that's acceptable, we could just as well flip the flag as it is.

@meteorcloudy
Copy link
Member

but then every repository in the build would need to know what value the top-level project sets this string to.

Or the root project just have to patch their dependencies for this renaming? I think for a project wants to work around this limitation, that's a fair price they need to pay? @yuzhy8701 Do you think this is acceptable?

@yuzhy8701
Copy link
Contributor

@meteorcloudy Fair point.

FWIW, a large part of the problem with sibling repository layout is that people do hardcode the external/ path segment

That was my original point - if a project hardcodes external/ it won't work with sibling repo and needs a fix anyways (if you flip the flag). I've made several fixes to some of our dependencies (rules_rust, grpc, and the bazel-internal cc rules). Compared with the fix for sibling repo, simply replacing external with something else is typically easier. This gives you a higher chance that a fix will ever be made, and that the main repo patching all its dependencies sound more viable when the fixes are easy.

@meteorcloudy
Copy link
Member

@lberki OK, sounds like what we should do is

  • Add a flag for renaming the "external" directory (but never flip it).
  • Remove --experimental_sibling_repository_layout and its relevant logic.

@meteorcloudy meteorcloudy added P2 We'll consider working on this in future. (Assignee optional) and removed P1 I'll work on this now. (Assignee required) labels Sep 17, 2024
@yuzhy8701
Copy link
Contributor

SG - and also,

  1. the API to access this value from where needed (in a backward compatible way)
  2. the corresponding documentation (e.g. update https://bazel.build/remote/output-directories#layout-diagram to make people aware of the variable name).

@peakschris
Copy link

The renaming flag seems like a good approach. To elaborate on this one piece:

  1. the API to access this value from where needed (in a backward compatible way)

We currently have 'external' hardcoded in two places:

a) In a run_binary (from aspect) rule, in both args and env sections. A variable substitution would be needed
b) Within a custom tool called by a bazel rule, the 'external' value would need to be passed to this as a command line input

@lberki
Copy link
Contributor Author

lberki commented Sep 18, 2024

@meteorcloudy if that's doable -- that string is embedded pretty deeply into Bazel and I'd not be surprised if some people even depended on the fact that the directory where external repositories are located in the output base ($OUTPUT_BASE/external) has the same basename as the directory under which they are located under the execroot.

I'll give it a stab as time allows to see how intrusive it would be and if it's even possible. I hope I can make it a regular command line option and not a startup one.

@fmeum
Copy link
Collaborator

fmeum commented Oct 30, 2024

Since #24126, non-main repos can now contain a top-level external directory or package even without this flag enabled.

Have projects that currently use the flag tried to bring in their external package as a separate repository via local_repository? Other than having to use @external//pkg:foo instead of //external/pkg:foo, I don't see any downside. Is there anything that makes this not a good idea in practice?

@lberki
Copy link
Contributor Author

lberki commented Oct 31, 2024

That approach has worked since ages immemorial, it's just that it's surprising and finicky.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
incompatible-change Incompatible/breaking change P2 We'll consider working on this in future. (Assignee optional) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file.
Projects
None yet
Development

No branches or pull requests

9 participants