
[WIP] Combine run-pass tests into a single crate #49142

Closed
wants to merge 11 commits from the run-fast-pass branch

Conversation

@Zoxc (Contributor) commented Mar 18, 2018

This combines run-pass tests into larger crates.

Some timings:

| System | Before | After |
|---|---|---|
| Windows | 4.9 min | 2.4 min |
| Fedora VM | 3 min | 1.2 min |
| Travis (probably unreliable numbers) | 9.3 min | 5.3 min |
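The combining idea can be pictured with a small sketch (hypothetical code, not the PR's actual compiletest changes; `combine` and its module-per-test scheme are purely illustrative): each test file becomes a module of one big crate, and a generated `main` calls each test's entry point in turn.

```rust
// Illustrative sketch only: build one combined crate source out of several
// run-pass test sources. `tests` pairs a test name with its source text.
fn combine(tests: &[(&str, &str)]) -> String {
    let mut out = String::new();
    for &(name, source) in tests {
        // Wrapping each test in its own module keeps item names from clashing,
        // which a naive source concatenation would not do.
        out.push_str(&format!("mod {} {{\n{}\n}}\n", name, source));
    }
    // A single generated `main` runs every test's entry point in sequence.
    out.push_str("fn main() {\n");
    for &(name, _) in tests {
        out.push_str(&format!("    {}::main();\n", name));
    }
    out.push_str("}\n");
    out
}
```

The win comes from paying rustc's startup, parsing, and codegen overhead once per group instead of once per test file.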

@Zoxc force-pushed the run-fast-pass branch 2 times, most recently from 31ccebb to f646fc3 on March 18, 2018 19:16
@kennytm added the S-waiting-on-author label (Status: This is awaiting some action (such as code changes or more information) from the author.) on Mar 18, 2018
@Zoxc force-pushed the run-fast-pass branch 10 times, most recently from 9f17513 to 6d7c17b on March 20, 2018 03:04
@Zoxc (Contributor Author) commented Mar 20, 2018

cc @rust-lang/compiler
cc @rust-lang/infra

@kennytm (Member) commented Mar 20, 2018

To future contributors: when will `// no-combine` be needed? To future reviewers: how can you tell when a run-pass test should require `// no-combine`, and will there be any silent breakage rendering the test invalid if `// no-combine` is missing?
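For illustration, one could imagine a heuristic along these lines flagging tests that need the directive (the `needs_no_combine` helper and its rules are hypothetical, not from the PR):

```rust
// Hypothetical heuristic: a run-pass test probably cannot be combined when it
// depends on crate-level state that a combined crate cannot reproduce, such
// as inner attributes or per-test compiler flags.
fn needs_no_combine(source: &str) -> bool {
    source.lines().any(|line| {
        let line = line.trim();
        // `#![...]` is a crate-level (inner) attribute, e.g. `#![feature(...)]`.
        line.starts_with("#![")
            // Per-test rustc flags apply to the whole compilation.
            || line.starts_with("// compile-flags:")
    })
}
```

A real implementation would work on the parsed AST rather than raw lines, but the shape of the question, "does this test assume it owns the whole crate?", is the same.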

BTW I think the same strategy could be applied to doc tests.

@kennytm added the T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.) and T-infra (Relevant to the infrastructure team, which will review and decide on the PR/issue.) labels Mar 20, 2018
@alexcrichton (Member)

I'm personally not totally convinced we should do this. We did this a long time ago, and it quickly became untenable: tests were never written with such a combined compilation in mind, and it caused conflicts across various tests. Even here, for example, a lot of tests are tagged with `// no-combine`, which adds friction when writing a test case. It's already a pain to have to throw `// ignore-emscripten` on half the tests, and this seems like "yet another thing" that is only discovered on CI because it's not enabled locally.

The wins here are nice, but modest enough that they're not necessarily worth the increased friction in the test suite and the maintenance burden on our end.

@nikomatsakis (Contributor)

Hmm. This combines run-pass tests eagerly unless we annotate them with `// no-combine`? Indeed, as @alexcrichton suggests, this does sound like it will lead to a lot of confusing failures. :/

If we are going to do this, why would we not do it locally?

@Zoxc (Contributor Author) commented Mar 20, 2018

> we did this a long time ago and it became untenable quickly as tests were never written for such a compilation and it would cause conflicts on various tests

I am curious as to why it didn't work out previously. Do you remember any concrete problems?

We could infer `no-combine` on more tests. We could also suggest adding `no-combine` if a test fails when combined but passes otherwise.

> The wins here are nice but seem modest enough that it's not necessarily worth the increased friction in the test suite and maintenance on our end

100% faster tests is not a modest gain. Incremental compilation will reduce the time spent on compilation, making testing the overall bottleneck, so faster tests matter. On AppVeyor, 2/3 of the time is spent on testing.

@nikomatsakis (Contributor)

We discussed this in the @rust-lang/compiler meeting. It seems like the gains on tests are potentially substantial, if @Zoxc's numbers are representative. On the other hand, this involves a fair amount of complexity in the compiler itself (changes to feature gating system etc), and causes our testing code to stray further from the "real thing" that users will use.

Some concerns that were raised:

  • How likely are we to miss bugs that only occur with no-combine (and hence don't get tested)?
  • When a test failure occurs, how clear is it from travis why it failed?
  • How often will // no-combine tags be required and how easy is it to tell?
    • @Zoxc proposes re-running failing tests in isolation.
  • To what extent would we get the same gains from a "long-lived" compiler process, such as what the RLS could probably use? (We have no idea how easy that would be to do, though.)

@pietroalbini (Member)

Just for reference, while discussing this on IRC I did some analysis on splitting tests based on the feature flags they use (grouping tests with the same set of feature flags).

I personally think grouping based on the sets of features the tests use (without an explicit blacklist) is better than grouping as many tests as possible together with all the feature flags enabled.
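That grouping idea can be sketched roughly as follows (hypothetical helper names; a real implementation would use rustc's own attribute parsing rather than string matching, and this simplification only handles one gate per `#![feature(...)]` line):

```rust
use std::collections::{BTreeSet, HashMap};

// Collect the set of feature gates a test source enables.
fn feature_set(source: &str) -> BTreeSet<String> {
    source
        .lines()
        .filter_map(|line| {
            let line = line.trim();
            // Turn `#![feature(nll)]` into "nll"; skip every other line.
            let inner = line.strip_prefix("#![feature(")?.strip_suffix(")]")?;
            Some(inner.to_string())
        })
        .collect()
}

// Bucket tests by their exact feature set, so each combined crate enables
// only the features its member tests actually use.
fn group_by_features<'a>(
    tests: &[(&'a str, &str)],
) -> HashMap<BTreeSet<String>, Vec<&'a str>> {
    let mut groups: HashMap<BTreeSet<String>, Vec<&'a str>> = HashMap::new();
    for &(name, source) in tests {
        groups.entry(feature_set(source)).or_default().push(name);
    }
    groups
}
```

Using a `BTreeSet` as the key makes two tests with the same gates land in the same bucket regardless of the order their `#![feature]` lines appear in.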

@alexcrichton (Member)

> I am curious as to why it didn't work out previously. Do you remember any concrete problems?

This was way, way long ago (years before 1.0), so my memory is a little hazy. If I recall correctly, though, some issues were:

  • Back then we concatenated on Windows but not on other platforms. This meant you would often hit weird failures on Windows that you couldn't debug without picking up a Windows machine.
  • Back then it was a source concatenation rather than a compiler-level concatenation: a script generated one large source file which rustc compiled. This had tons of bugs related to name resolution and the like.
  • Tests quickly became very difficult to write in a way that would actually pass on all platforms. Most were unidiomatic or bent over backwards to be testable in all configurations.

Some of these are solved by the strategy taken here (compiler support). That said, I don't think the pros outweigh the cons: we already have a systemic problem of "my PR fails on CI but not locally", and changing the testing strategy on CI further inevitably widens that gap and makes tests harder and harder to write.

@nikomatsakis (Contributor)

> I personally think grouping based on the sets of features the tests use (without an explicit blacklist) is better than grouping as many tests as possible together with all the feature flags enabled.

That makes a lot of sense, and would certainly be less intrusive.

@Zoxc force-pushed the run-fast-pass branch 2 times, most recently from 338d469 to d6c335e on March 23, 2018 06:19
@bors (Contributor) commented Mar 23, 2018

☔ The latest upstream changes (presumably #49308) made this pull request unmergeable. Please resolve the merge conflicts.

@Zoxc force-pushed the run-fast-pass branch 2 times, most recently from b286ae3 to 15a2f7e on March 24, 2018 17:15
@Zoxc force-pushed the run-fast-pass branch 2 times, most recently from ab20fff to 563e8ca on March 24, 2018 19:44
@Zoxc (Contributor Author) commented Mar 24, 2018

I've made some changes. I now group tests by features, run all tests with the same feature set together, and no longer rely on rustc supporting modular features. rustc now asserts that the feature set inside a test module is equal to the feature set enabled by the crate. It also errors on attributes applied to test modules. I've fixed my path rewriting to work with macro expansion, so macros no longer break tests. I also run tests in individual processes, which avoids tests messing up global state and allows rerunning them individually. This doesn't seem to slow tests down by much; on Windows I'm up to 2.5 minutes for run-pass.
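The process-per-test part of this scheme can be sketched with a hypothetical helper (not the PR's actual harness code): in the combined binary, `program` would be `std::env::current_exe()` and the argument a test name the binary dispatches on, so a test that corrupts global state (environment variables, working directory, signal handlers) cannot poison the tests that run after it.

```rust
use std::process::Command;

// Run one test in a child process and report whether it exited successfully.
// Isolation comes from the OS process boundary, not from the test itself.
fn run_isolated(program: &str, args: &[&str]) -> bool {
    Command::new(program)
        .args(args)
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}
```

Since the combined crate is compiled once and only process spawning is repeated, the per-test overhead stays small compared to invoking rustc per test.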

@bors (Contributor) commented Mar 24, 2018

☔ The latest upstream changes (presumably #49337) made this pull request unmergeable. Please resolve the merge conflicts.

@Zoxc (Contributor Author) commented Mar 25, 2018

I've added a whitelist of attributes allowed in combined tests, to catch attributes which could have a crate-level effect.
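Such a check might look roughly like this (the whitelist contents and helper name are illustrative, not the PR's actual list): attributes outside the list might have crate-level effects, so a test carrying one is rejected from combining instead of silently changing behavior.

```rust
// Hypothetical whitelist of attribute names considered safe inside a
// combined test; everything else is rejected up front.
const ALLOWED_ATTRS: &[&str] = &["allow", "warn", "deny", "cfg", "derive", "inline", "repr"];

// Check a single attribute name, e.g. "allow" from `#[allow(dead_code)]`.
fn check_attr(name: &str) -> Result<(), String> {
    if ALLOWED_ATTRS.contains(&name) {
        Ok(())
    } else {
        Err(format!("attribute `{}` is not allowed in a combined test", name))
    }
}
```

A whitelist fails closed: a newly introduced attribute is rejected until someone decides it is safe, which matches the goal of catching crate-level effects rather than discovering them on CI.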

@shepmaster (Member)

Ping from triage, @Zoxc ! Looks like you have failing tests; will you have time to address them sometime soon?

@bors (Contributor) commented Apr 2, 2018

☔ The latest upstream changes (presumably #49252) made this pull request unmergeable. Please resolve the merge conflicts.

@nikomatsakis (Contributor)

@Zoxc

> I've made some changes.

These changes definitely seem to be going in the right direction, from my POV. I'm much happier with "asserting" that things are equal vs changing the behavior...

@alexcrichton (Member)

I'm personally still against changing the tests in such a blanket fashion. I do agree that this will result in a faster test suite, but the wins here are only a handful of minutes, and I feel that the tradeoff isn't worth it given the amount of new code to maintain and the possible impact on the test suite.

@Zoxc (Contributor Author) commented Apr 10, 2018

This also applies the same strategy to documentation examples in rustdoc. Using it there seems less risky, as language edge cases aren't being tested. The test suite seems to spend a lot of time building those examples; I don't have any numbers for that yet, though.

Even if we don't end up enabling this for run-pass/compiletest, I'd still like the code for it to land, as it makes it much easier to change compiletest to compile multiple tests in a single rustc process once rustc gets proper support for that.

@alexcrichton (Member)

It's true that rustdoc may be a better target here, but those are also notoriously flaky tests whose generation is very difficult to tweak. That means we'd definitely want this off by default to start with, and we may want to take it slow on enabling it for CI builds.

@pietroalbini (Member)

Ping from triage @Zoxc! What's the status of this PR?


@Zoxc (Contributor Author) commented Apr 23, 2018

This PR needs some more work. I'll just close it for now.

@Zoxc closed this Apr 23, 2018
@pietroalbini added the S-inactive label (Status: Inactive and waiting on the author. This is often applied to closed PRs.) and removed the S-waiting-on-author label Apr 23, 2018