-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bugs with HostIr integration into FusionExecutorCache #4136
base: main
Are you sure you want to change the base?
Conversation
3ef189d
to
7c7b48e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's great to hear many tests are passing behind the scene. Is it possible to add some tests to this PR? For example, it could be some existing tests but with host IR lowering also enabled (cf. TEST_P) or could be some new tests to test_host_ir_integration to catch potential regression?
58d6217
to
1381a3c
Compare
Updated with TEST_P on tests in |
39e427a
to
d62498b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
01bf472
to
bc886fd
Compare
@wujingyue I removed the host ir lowering code and replaced it with a TODO. I also updated the tests to just pass false for enable hostir lowering. The next PR i can re-enable them, and re-add the hostir lowering code. |
bc886fd
to
f08db67
Compare
!test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with comments
public testing::WithParamInterface<InOutMesh> {}; | ||
class LowerGatherTest | ||
: public MultiDeviceTest, | ||
public testing::WithParamInterface<std::tuple<InOutMesh, bool>> {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public testing::WithParamInterface<std::tuple<InOutMesh, bool>> {}; | |
public testing::WithParamInterface<std::tuple<DeviceMesh, DeviceMesh, bool>> {}; |
I don't see a good reason for nesting. This will probably simplify the structured bindings below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is here because we want certain pairs of DeviceMesh, and not every pair that Combine(InMesh, OutMesh) would give
5983947
to
36c8804
Compare
Co-authored-by: samnordmann <[email protected]>
c3a186e
to
b9abb73
Compare
This PR fixes some bugs with HostIr integration into FusionExecutorCache. The list of fixes are:
Expressions within the SegmentedGroup were out of order. They get sorted during gpu lowering. So, I topologically sorted them after adding them to the HostIrContainer
CompileFusionParallel launches compileKernel with multiple threads from a pool. However, the main thread does not wait on completion before continuing. I added fix so that it will wait. (Edit: Jingyue pointed out the wait was there already. I moved the HostIr additions from previous PRs to after the wait)
During the preseg pass stage, Sets are added between expressions with a different sharding. These sets needed lowering into a communication. I added the missing lowering. (Edit: this is going into a separate PR)
With these changes, many (most?) of the MultiDevice tests and also DistributedTransformer tests are passing. The failing tests seem to be because of an issue with loop splitting. Here is an example:
After cloning exprs into the hostir container, splits are not propagated.