-
-
Notifications
You must be signed in to change notification settings - Fork 636
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor plumbing for test
goal to support batched tests.
#17044
Refactor plumbing for test
goal to support batched tests.
#17044
Conversation
The new structure is heavily inspired by recent work to refactor the way `lint` and `fmt` batch inputs. I don't expect any behavior change from this diff - all plugins providing a `test` implementation are coded to return single-element partitions. Once the plumbing has merged to `main` I plan to follow-up with an implementation of `pytest` batching. Known TODOs: - Restore the warning/error raised if no applicable targets are found for the input specs - Restore use of environments in this flow [ci skip-rust] [ci skip-build-wheels]
I deleted a handful of tests when I was getting started because I thought they would no longer be relevant. Turns out they continue to work with a few tweaks, so we can restore them to avoid losing coverage. [ci skip-rust] [ci skip-build-wheels]
[ci skip-rust] [ci skip-build-wheels]
Tracking the name of the test tool didn't really give us much - delete it, and simplify the code that used to use it. [ci skip-rust] [ci skip-build-wheels]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good!
[ci skip-rust] [ci skip-build-wheels]
Hard-code a value of 1 for now where it was being used. [ci skip-rust] [ci skip-build-wheels]
Look up the environment name for each element in a test partition, and assert that everything in a partition matches. [ci skip-rust] [ci skip-build-wheels]
[ci skip-rust] [ci skip-build-wheels]
[ci skip-rust] [ci skip-build-wheels]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
name = SuperTesterSubsystem.options_scope | ||
field_set_type = SuperTesterFieldSet | ||
|
||
Then, define 4 `@rule`s: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should live in our docs guide rather than here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, lint.py
has a similar comment which is why I added this here, but it'd be better to copy/move to one of the .md
files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #17070
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with that with the caveat we add a link here. In general though because of the lost locality I fear the docs getting stale
[ci skip-rust] [ci skip-build-wheels]
[ci skip-rust] [ci skip-build-wheels]
[ci skip-rust] [ci skip-build-wheels]
[ci skip-rust] [ci skip-build-wheels]
[ci skip-build-wheels]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Only looked at test.py
. So far so good!)
|
[ci skip-build-wheels]
Add a separate field for `extra_output_prefix` for clarity [ci skip-rust] [ci skip-build-wheels]
[ci skip-rust] [ci skip-build-wheels]
This is currently only used by `pytest`, and we have a couple other fields with the same problem (de-duplicating between test runs) that we currently expect plugin implementers to handle. Incidentally, deleting the field flushed out some invalid `SubPartition`s built in tests that snuck through the previous cleanup commit. [ci skip-rust] [ci skip-build-wheels]
LGTM. @thejcannon can you please give a final sign off before merging? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will review tomorrow! 👍
@distinct_union_type_per_subclass(in_scope_types=[EnvironmentName]) | ||
@dataclass(frozen=True) | ||
@runtime_ignore_subscripts | ||
class SubPartition(Generic[_FieldSetT]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After hacking on the batched/grouped pytest
implementation, I find myself not loving the schema of this type:
- Most plugins will (or at least currently) run with a single field-set per partition, so you end up with a lot of
asserts
onlen(field_sets)
at the beginning of plugin rules to ensure only 1 thing is passed in. Feels like something the type system should be checking for us. - In the
pytest
case I find myself writing a decent amount ofpartition.field_sets[0]
to get at fields likeextra_env_vars
, with the invariant that everyfield_set
in the partition has the same values for those fields. Again feels like something I wish I could model in the type system by defining a newclass
with atuple[Address, ...]
and then non-tuple fields for values that should be constant within a partition.
I am considering this refactor:
- Introduce a new generic type var for
_PartitionElementT
- Change
Partitions
toPartitions(FrozenDict[Any, _PartitionElementT])
- Change
SubPartition
toSubPartition(Generic[_PartitionElementT])
Then for plugins that test one-file-per-process, _PartitionElementT
= PluginTestFieldSet
. But in the pytest
case I can create whatever class I want to conveniently model the metadata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(numbering your first 2 points)
For 1, don't forget that the types in question are registered via class lookup and instatiated using positional args. What actuallt gets union-registered and instantiated is technically opaque and can be type-punned. All this to say if the TestRequest
subclass defined its own SubPartition
, we'd register and use that. So we could easily provide a (ugly name nonwithstanding) SingleSubPartition
class which subclasses of TestResult
can declare as their SubPartition
classvar. Inside it it could do the asserting and provide "singular" types (e.g. request.field_set
)
For 2,for fmt
/lint
I put those in the partition key which is provided to SubPartition
.
I'm thinking more and more we should land the "one generic partition schema" change before this one. We can agree on the "generic" impl of these things there.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer not to block this on the generic design, both because I want to keep this moving forward and because I generally find more success refactoring code into a generic form if there are multiple sufficiently-different examples to pull from (vs. designing the generic thing first based on 1 (or a few very similar) examples, and then finding all the spots you missed when you try to extend it to a slightly-different use case).
That said I'm not opposed to letting this PR hang out a bit longer while I iterate on the APIs/schema. I'll take another pass at using key
as the mechanism for passing along the values that are identical across the batch.
I'm also wondering if we can/should be more flexible about forcing plugins to declare all the partitioning boilerplate if all they want to do is test 1 field-set per process. Could we instead say:
- For one input per process, define
@rule(TestFieldSetSubclass) -> TestResult
- For grouped testing, define
@rule(TestRequestSubclass.PartitionRequest) -> Partitions
and@rule(TestRequestSubclass.SubPartition) -> TestResult
And then have the coretest
logic decide which rules to invoke based on whichTestFieldSubclass
instances are covered by the registeredTestRequestSubclass.field_set_type
s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case I think we can use the unions and user-declarations of types to do the legwork for us. In fact we're basically "there" in terms of what you need in fmt/lint.
For lint/fmt I'm about to PR having the core type provide an opt-in/out "default" partitioner rule making one giant partition. You want the same thing, but flipped (N partitions of one element). So that handles the rule boilerplate.
Then there's the single-v-many boilerplate. I think given the above, we can easily play type-punny games such that the single rule users define is TestRequestSubclass.SingleFieldSetSubPartition -> TestResult
(name for exposition).
That way the core code is blissfully generic, but customization and default-ization is possible. I can also demo if this is all greek 😛
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer not to block this on the generic design, both because I want to keep this moving forward and because I generally find more success refactoring code into a generic form if there are multiple sufficiently-different examples to pull from (vs. designing the generic thing first based on 1 (or a few very similar) examples, and then finding all the spots you missed when you try to extend it to a slightly-different use case).
+1 to not blocking. A big difference with test
is we only currently support target-aware implementations, and no one is (yet) asking for target-less test
, which would lose benefits like precise caching; we'd have to copy the whole repo into the sandbox!
Sorry, haven't had a chance to go deep into this yet, will try to get to it today. But one question - I notice the batching functionality is very specific to test. Yet we have similar functionality for lint/fmt, and it could be useful in other cases (e.g., dep inference). So I'm wondering if and how to generalize? |
IMO: #16967 ... but I don't want to block @danxmoran's progress here, since I don't expect that the APIs for any of |
I worry that trying to squeeze everything into the lint/fmt API structure at this stage will be more pain than gain - in the past I've had more luck with that sort of refactor if I have a few concrete, sufficiently-different examples to pull from. That said, @thejcannon already has some ideas for a more uniform structure + helpers to avoid boilerplate for plugin authors, so maybe I will soon be proven wrong 😄 Unified with lint/fmt or not, based on the diff in this PR and working ahead on the |
To tack on, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this heavy lifting!
I had a couple of quick questions/comments I wanted to throw out sooner than I can do a full deep-dive review.
@@ -481,28 +625,49 @@ class TestExtraEnvVarsField(StringSequenceField, metaclass=ABCMeta): | |||
) | |||
|
|||
|
|||
@rule_helper | |||
async def _get_partitions_by_request_type( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the return type dict's value a list[Partitions]
and not just a Partitions
? AFAICT from the implementation it will always contain exactly one Partitions
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ask because the multiple levels of plurality are a mite confusing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If so then right now those concepts seem mingled. E.g., partition_per_input()
is on Partition, not SubPartition.
We had discussed the terminology on Slack I believe? Should that be introduced here?
I do think it makes sense to have separate rules for the two different concepts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The terminology discussion on Slack was mainly around what field names to expose to end users, which (purposefully) isn't touched in this PR - the intent here is to get the generic plumbing in place.
One of my take-aways from this PR is that it's hard to effectively design/review generic plumbing without a concrete implementation to look at 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. The terminology, and functionality, for plugin authors and end users should probably be consistent though?
Right now it's not clear to me where the logic for sub-batching-by-size would go, for example.
@distinct_union_type_per_subclass(in_scope_types=[EnvironmentName]) | ||
@dataclass(frozen=True) | ||
@runtime_ignore_subscripts | ||
class SubPartition(Generic[_FieldSetT]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Terminology-wise, I'm not clear on the distinction between a Partition and a SubPartition. Can you elaborate in the comments here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that's stolen from the Lint changes to some degree (which doesn't have a Partition
object, but might if we align).
The reason I named it SubPartition
over there was because I wanted to convey to the plugin author that the partition they gave us can (and likely will) be sliced into sub-partitions. Type-wise they're identical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. So the idea is that Partition captures "tests that are allowed to run together" and SubPartition captures "tests that will run together, based on some batch size"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is how it started, yes. Then I removed the top-level batch size limit so now they are effectively the same thing - I've since decided that the top-level limit probably needs re-added but I haven't gotten around to the implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't help I keep chiming in here and in Slack with the gospel of re-use between this and fmt/lint.
I've opened #17134 as a "clean slate" redo of this with hopefully less boilerplate / migration pain, based on @thejcannon's recent work. Closing this one. |
Pre-work for #14941
The new structure is heavily inspired by @thejcannon's recent work to refactor the way
lint
andfmt
batch inputs. I don't expect any change in end behavior from this diff - all plugins providing atest
implementation are coded to return single-element partitions. Once the plumbing has merged tomain
I plan to follow-up with an implementation ofpytest
batching.[ci skip-rust]
[ci skip-build-wheels]