Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor plumbing for test goal to support batched tests. #17044

Closed

Conversation

danxmoran
Copy link
Contributor

@danxmoran danxmoran commented Sep 28, 2022

Pre-work for #14941

The new structure is heavily inspired by @thejcannon's recent work to refactor the way lint and fmt batch inputs. I don't expect any change in end behavior from this diff - all plugins providing a test implementation are coded to return single-element partitions. Once the plumbing has merged to main I plan to follow-up with an implementation of pytest batching.

[ci skip-rust]

[ci skip-build-wheels]

The new structure is heavily inspired by recent work to refactor the way
`lint` and `fmt` batch inputs. I don't expect any behavior change from
this diff - all plugins providing a `test` implementation are coded to
return single-element partitions. Once the plumbing has merged to `main`
I plan to follow-up with an implementation of `pytest` batching.

Known TODOs:
  - Restore the warning/error raised if no applicable targets are found
    for the input specs
  - Restore use of environments in this flow

[ci skip-rust]

[ci skip-build-wheels]
@danxmoran danxmoran added category:internal CI, fixes for not-yet-released features, etc. category:plugin api change labels Sep 28, 2022
@danxmoran danxmoran removed the category:internal CI, fixes for not-yet-released features, etc. label Sep 28, 2022
I deleted a handful of tests when I was getting started because I
thought they would no longer be relevant. Turns out they continue to
work with a few tweaks, so we can restore them to avoid losing coverage.

[ci skip-rust]

[ci skip-build-wheels]
[ci skip-rust]

[ci skip-build-wheels]
Tracking the name of the test tool didn't really give us much - delete
it, and simplify the code that used to use it.

[ci skip-rust]

[ci skip-build-wheels]
Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good!

src/python/pants/core/goals/test.py Outdated Show resolved Hide resolved
src/python/pants/core/goals/test.py Outdated Show resolved Hide resolved
src/python/pants/core/goals/test.py Outdated Show resolved Hide resolved
src/python/pants/core/goals/test.py Outdated Show resolved Hide resolved
[ci skip-rust]

[ci skip-build-wheels]
Hard-code a value of 1 for now where it was being used.

[ci skip-rust]

[ci skip-build-wheels]
Look up the environment name for each element in a test partition, and
assert that everything in a partition matches.

[ci skip-rust]

[ci skip-build-wheels]
@danxmoran danxmoran marked this pull request as ready for review September 29, 2022 21:35
Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

src/python/pants/backend/go/goals/test.py Outdated Show resolved Hide resolved
src/python/pants/core/goals/test.py Outdated Show resolved Hide resolved
src/python/pants/core/goals/test.py Outdated Show resolved Hide resolved
src/python/pants/core/goals/test.py Outdated Show resolved Hide resolved
src/python/pants/core/goals/test.py Outdated Show resolved Hide resolved
name = SuperTesterSubsystem.options_scope
field_set_type = SuperTesterFieldSet

Then, define 4 `@rule`s:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should live in our docs guide rather than here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, lint.py has a similar comment which is why I added this here, but it'd be better to copy/move to one of the .md files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #17070

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with that with the caveat we add a link here. In general though because of the lost locality I fear the docs getting stale

src/python/pants/core/goals/test.py Outdated Show resolved Hide resolved
src/python/pants/core/goals/test.py Outdated Show resolved Hide resolved
src/python/pants/core/goals/test.py Outdated Show resolved Hide resolved
[ci skip-rust]

[ci skip-build-wheels]
Copy link
Member

@thejcannon thejcannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Only looked at test.py. So far so good!)

src/python/pants/core/goals/test.py Outdated Show resolved Hide resolved
src/python/pants/core/goals/test.py Outdated Show resolved Hide resolved
src/python/pants/core/goals/test.py Show resolved Hide resolved
@thejcannon
Copy link
Member

check seems to be the holdout for the new style. @stuhood should we change it to be consistent? Do we think it's see similar benefits?

Add a separate field for `extra_output_prefix` for clarity

[ci skip-rust]

[ci skip-build-wheels]
[ci skip-rust]

[ci skip-build-wheels]
This is currently only used by `pytest`, and we have a couple other
fields with the same problem (de-duplicating between test runs) that we
currently expect plugin implementers to handle. Incidentally, deleting
the field flushed out some invalid `SubPartition`s built in tests that
snuck through the previous cleanup commit.

[ci skip-rust]

[ci skip-build-wheels]
@Eric-Arellano
Copy link
Contributor

LGTM. @thejcannon can you please give a final sign off before merging?

Copy link
Member

@thejcannon thejcannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will review tomorrow! 👍

@distinct_union_type_per_subclass(in_scope_types=[EnvironmentName])
@dataclass(frozen=True)
@runtime_ignore_subscripts
class SubPartition(Generic[_FieldSetT]):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After hacking on the batched/grouped pytest implementation, I find myself not loving the schema of this type:

  • Most plugins will (or at least currently) run with a single field-set per partition, so you end up with a lot of asserts on len(field_sets) at the beginning of plugin rules to ensure only 1 thing is passed in. Feels like something the type system should be checking for us.
  • In the pytest case I find myself writing a decent amount of partition.field_sets[0] to get at fields like extra_env_vars, with the invariant that every field_set in the partition has the same values for those fields. Again feels like something I wish I could model in the type system by defining a new class with a tuple[Address, ...] and then non-tuple fields for values that should be constant within a partition.

I am considering this refactor:

  • Introduce a new generic type var for _PartitionElementT
  • Change Partitions to Partitions(FrozenDict[Any, _PartitionElementT])
  • Change SubPartition to SubPartition(Generic[_PartitionElementT])

Then for plugins that test one-file-per-process, _PartitionElementT = PluginTestFieldSet. But in the pytest case I can create whatever class I want to conveniently model the metadata.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(numbering your first 2 points)

For 1, don't forget that the types in question are registered via class lookup and instatiated using positional args. What actuallt gets union-registered and instantiated is technically opaque and can be type-punned. All this to say if the TestRequest subclass defined its own SubPartition, we'd register and use that. So we could easily provide a (ugly name nonwithstanding) SingleSubPartition class which subclasses of TestResult can declare as their SubPartition classvar. Inside it it could do the asserting and provide "singular" types (e.g. request.field_set)

For 2,for fmt/lint I put those in the partition key which is provided to SubPartition.

I'm thinking more and more we should land the "one generic partition schema" change before this one. We can agree on the "generic" impl of these things there.

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to block this on the generic design, both because I want to keep this moving forward and because I generally find more success refactoring code into a generic form if there are multiple sufficiently-different examples to pull from (vs. designing the generic thing first based on 1 (or a few very similar) examples, and then finding all the spots you missed when you try to extend it to a slightly-different use case).

That said I'm not opposed to letting this PR hang out a bit longer while I iterate on the APIs/schema. I'll take another pass at using key as the mechanism for passing along the values that are identical across the batch.

I'm also wondering if we can/should be more flexible about forcing plugins to declare all the partitioning boilerplate if all they want to do is test 1 field-set per process. Could we instead say:

  • For one input per process, define @rule(TestFieldSetSubclass) -> TestResult
  • For grouped testing, define @rule(TestRequestSubclass.PartitionRequest) -> Partitions and @rule(TestRequestSubclass.SubPartition) -> TestResult
    And then have the core test logic decide which rules to invoke based on which TestFieldSubclass instances are covered by the registered TestRequestSubclass.field_set_types?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case I think we can use the unions and user-declarations of types to do the legwork for us. In fact we're basically "there" in terms of what you need in fmt/lint.

For lint/fmt I'm about to PR having the core type provide an opt-in/out "default" partitioner rule making one giant partition. You want the same thing, but flipped (N partitions of one element). So that handles the rule boilerplate.

Then there's the single-v-many boilerplate. I think given the above, we can easily play type-punny games such that the single rule users define is TestRequestSubclass.SingleFieldSetSubPartition -> TestResult (name for exposition).

That way the core code is blissfully generic, but customization and default-ization is possible. I can also demo if this is all greek 😛

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to block this on the generic design, both because I want to keep this moving forward and because I generally find more success refactoring code into a generic form if there are multiple sufficiently-different examples to pull from (vs. designing the generic thing first based on 1 (or a few very similar) examples, and then finding all the spots you missed when you try to extend it to a slightly-different use case).

+1 to not blocking. A big difference with test is we only currently support target-aware implementations, and no one is (yet) asking for target-less test, which would lose benefits like precise caching; we'd have to copy the whole repo into the sandbox!

@benjyw
Copy link
Contributor

benjyw commented Oct 5, 2022

Sorry, haven't had a chance to go deep into this yet, will try to get to it today. But one question - I notice the batching functionality is very specific to test. Yet we have similar functionality for lint/fmt, and it could be useful in other cases (e.g., dep inference). So I'm wondering if and how to generalize?

@stuhood
Copy link
Member

stuhood commented Oct 5, 2022

But one question - I notice the batching functionality is very specific to test. Yet we have similar functionality for lint/fmt, and it could be useful in other cases (e.g., dep inference). So I'm wondering if and how to generalize?

IMO: #16967 ... but I don't want to block @danxmoran's progress here, since I don't expect that the APIs for any of lint/fmt/test are "finished".

@danxmoran danxmoran marked this pull request as draft October 5, 2022 17:30
@danxmoran
Copy link
Contributor Author

I worry that trying to squeeze everything into the lint/fmt API structure at this stage will be more pain than gain - in the past I've had more luck with that sort of refactor if I have a few concrete, sufficiently-different examples to pull from. That said, @thejcannon already has some ideas for a more uniform structure + helpers to avoid boilerplate for plugin authors, so maybe I will soon be proven wrong 😄

Unified with lint/fmt or not, based on the diff in this PR and working ahead on the pytest partitioning impl I am not super happy with the shape of the new APIs here. I am personally questioning whether we can avoid forcing plugins that test one-file-per-process to think about partitioning at all (see #17044 (comment)). I've converted this PR to draft while I poke at alternate ideas.

@thejcannon
Copy link
Member

thejcannon commented Oct 5, 2022

To tack on, fmt and lint are rounding up soon. Although I'm happy to re-mold for a generic "base" which includes test. It wouldn't be hand and I'll probably do it (with this PR in mind) as a nice logical boundary separation (pre-#16967).

Copy link
Contributor

@benjyw benjyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this heavy lifting!

I had a couple of quick questions/comments I wanted to throw out sooner than I can do a full deep-dive review.

@@ -481,28 +625,49 @@ class TestExtraEnvVarsField(StringSequenceField, metaclass=ABCMeta):
)


@rule_helper
async def _get_partitions_by_request_type(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the return type dict's value a list[Partitions] and not just a Partitions? AFAICT from the implementation it will always contain exactly one Partitions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ask because the multiple levels of plurality are a mite confusing...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so then right now those concepts seem mingled. E.g., partition_per_input() is on Partition, not SubPartition.

We had discussed the terminology on Slack I believe? Should that be introduced here?

I do think it makes sense to have separate rules for the two different concepts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The terminology discussion on Slack was mainly around what field names to expose to end users, which (purposefully) isn't touched in this PR - the intent here is to get the generic plumbing in place.

One of my take-aways from this PR is that it's hard to effectively design/review generic plumbing without a concrete implementation to look at 😅

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. The terminology, and functionality, for plugin authors and end users should probably be consistent though?

Right now it's not clear to me where the logic for sub-batching-by-size would go, for example.

@distinct_union_type_per_subclass(in_scope_types=[EnvironmentName])
@dataclass(frozen=True)
@runtime_ignore_subscripts
class SubPartition(Generic[_FieldSetT]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Terminology-wise, I'm not clear on the distinction between a Partition and a SubPartition. Can you elaborate in the comments here?

Copy link
Member

@thejcannon thejcannon Oct 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that's stolen from the Lint changes to some degree (which doesn't have a Partition object, but might if we align).

The reason I named it SubPartition over there was because I wanted to convey to the plugin author that the partition they gave us can (and likely will) be sliced into sub-partitions. Type-wise they're identical.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So the idea is that Partition captures "tests that are allowed to run together" and SubPartition captures "tests that will run together, based on some batch size"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is how it started, yes. Then I removed the top-level batch size limit so now they are effectively the same thing - I've since decided that the top-level limit probably needs re-added but I haven't gotten around to the implementation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't help I keep chiming in here and in Slack with the gospel of re-use between this and fmt/lint.

@danxmoran
Copy link
Contributor Author

I've opened #17134 as a "clean slate" redo of this with hopefully less boilerplate / migration pain, based on @thejcannon's recent work. Closing this one.

@danxmoran danxmoran closed this Oct 6, 2022
@danxmoran danxmoran deleted the danxmoran/batched-test-plumbing branch October 26, 2022 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants