Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add experimental flag to make cpp compile actions shareable #16863

Closed
wants to merge 1 commit into from

Conversation

torgil
Copy link
Contributor

@torgil torgil commented Nov 28, 2022

See discussion in #14294

@lberki
Copy link
Contributor

lberki commented Nov 28, 2022

cc @oquenchil @joeleba @sdtwigg

I have only superficially followed the discussion on #14294 , but this seems to be really dangerous because C++ compilation actions are of the rare kind of actions that have state, and therefore Action.getInputs() could potentially return different things depending on which instance of a shared action is called which looks dangerous.

In addition, this may cause trouble for Joe's effort in merging the execution and the analysis phases.

Unless we are confident that these are not going to cause problems, I'd rather not merge this pull request.

@torgil
Copy link
Contributor Author

torgil commented Nov 28, 2022

In addition, this may cause trouble for Joe's effort in merging the execution and the analysis phases.

#14236 is the main reason for my pull request and in there @gregestren explains that the main issue with shareable cpp actionbs is with the include scanning (whose flag is already called "--experimental_unsupported_and_brittle_include_scanning"). Should we issue a warning if both flags are set? From the include scanning thread we can read that use of implementation_deps/interface_deps is preferred over include scanning.

Are there any other problems arising from merging the execution and the analysis phases?

@torgil torgil force-pushed the cpp-actions-shareable branch from db65251 to d178082 Compare November 28, 2022 14:05
@sgowroji sgowroji added team-Configurability platforms, toolchains, cquery, select(), config transitions awaiting-review PR is awaiting review from an assigned reviewer labels Nov 28, 2022
@lberki
Copy link
Contributor

lberki commented Nov 29, 2022

In addition to include scanning, there is also .d file parsing which also mutates the action graph (by removing inputs from the C++ compile action) which is in widespread use in Bazel.

Include scanning is an experimental feature in Bazel but used widely at Google, so we can't just cavalierly ignore its adverse interactions with other features like the one proposed in this change.

However, this issue with action conflicts and --trim_test_configuration looks like one that we should also be experiencing at Google; @sdtwigg is that so? If so, what do with it there? All in all, I would much prefer solving this in the same way everywhere.

@oquenchil
Copy link
Contributor

Drive by comment without understanding what is going on with configurations at all and why this is necessary. As far as I understand shared actions are not desirable (see BazelCon 2022 presentation) and they are something we'd ideally remove. Taking that into account I think it's not right to rely on spreading shared actions further to fix a problem caused by something else. So even though I don't understand the details here I'm with Lukacs that ideally Stephen would suggest an alternative solution.

@gregestren
Copy link
Contributor

gregestren commented Nov 29, 2022

@oquenchil and others: the basic issue #14294 highlights is --trim_test_configuration removes test options from the build configuration after you leave top-level tests. This is normally a good thing for performance. But if for some reason a non-test depends on a cc_test and you call $ bazel build //:all, that includes two versions of the cc_test: the top-level one has test options in its config, and the variant that's a dep of some non-test target that removes those options. So you get two instances of the same cc_test writing outputs with technically different configurations, hence action conflicts.

@sdtwigg has various good workarounds for that issue. But I notice in #14294 (comment) that's not the scenario @torgil is experiencing - @torgil 's scenario is building a single target. The above shouldn't apply when building a single target. @torgil - how does your build trigger this problem?

@torgil
Copy link
Contributor Author

torgil commented Nov 30, 2022

@torgil 's scenario is building a single target. The above shouldn't apply when building a single target. @torgil - how does your build trigger this problem?

@gregestren By a single target I meant a higher level target like a QA report, which potentially could have multiple paths to a single library (through both tests and tools) with various transitions along the way ("do not optimize", "add sanitizer",. etc).

@torgil
Copy link
Contributor Author

torgil commented Nov 30, 2022

Taking that into account I think it's not right to rely on spreading shared actions further to fix a problem caused by something else.

@oquenchil The problem in this case is inherent to the design and is currently very hard to workaround without the shareable cpp actions patch (both for upstream and downstream situations). We use more patches on this line, most notably #13587 and a "test_output_directory_suffix" option where we get conflicts on the test-action outputs even when using bazel build command.

@oquenchil
Copy link
Contributor

The problem in this case is inherent to the design

To the design of what? Of trim_test_configuration?

@torgil
Copy link
Contributor Author

torgil commented Nov 30, 2022

The problem in this case is inherent to the design

To the design of what? Of trim_test_configuration?

To the design related to issue #14236 where Bazel needs to figure out a non-conflicting output directory name without knowledge of which actions that are affected by which build settings.

#14294 may have other solutions, I'm not directly affected by that issue.

@gregestren
Copy link
Contributor

I think the #13587 patch you're using is crucial to action conflicts arising here. Otherwise transitions, particularly with --experimental_output_directory_naming_scheme=diff_against_baseline should guarantee differently configured actions have unique paths.

Note that even aside from Bazel, linking the same C++ library twice in different configurations could cause bad ODR errors. I know you're trying to mitigate that by ensuring the actions are identical.

I think the largest concern is that Bazel / Skyframe themselves make the wrong choice because of the problems with mutating C++ actions lmentioned above. This could affect you even if you as a build manager are doing everything right.

@lberki from

Action.getInputs() could potentially return different things depending on which instance of a shared action is called which looks dangerous

do you remember the precise failure mode here? Is this only possible with incremental builds?

@lberki
Copy link
Contributor

lberki commented Dec 1, 2022

Given that core developers agree that this change is bad idea (see @gregestren's excellent writeup at #14236 (comment)) , I'll close this pull request without merging it. Let's continue the discussion on #14236.

@lberki lberki closed this Dec 1, 2022
@torgil
Copy link
Contributor Author

torgil commented Dec 2, 2022

I think the #13587 patch you're using is crucial to action conflicts arising here.

No. That patch have nothing to do with these action conflicts. It's remaining function is only to fix issues where the ST-hash is included in the output path (not output directory name) for artifacts produced by some native actions (for instance middle-man).

Cherry-pick #16844 (updated on master-branch of today) and run case 2 in #14236 to reproduce one of the action conficts this patch resolves.

@lberki please reopen this pull request or suggest an alternate solution

@torgil
Copy link
Contributor Author

torgil commented Dec 2, 2022

Note that even aside from Bazel, linking the same C++ library twice in different configurations could cause bad ODR errors. I know you're trying to mitigate that by ensuring the actions are identical.

It's the action hash that makes sure the actions are identical, if that is different the actions will not be shared. If we can't rely on that we will need to send both the action hash and the configuration hash to remote cache / remote execution servers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-review PR is awaiting review from an assigned reviewer team-Configurability platforms, toolchains, cquery, select(), config transitions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants