@IvanKobzarev IvanKobzarev commented Oct 6, 2025

Stacked PRs:


[asynctp] Optimize agmm lastdim via addmm_

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 6, 2025
@IvanKobzarev IvanKobzarev mentioned this pull request Oct 6, 2025
@IvanKobzarev IvanKobzarev requested a review from fmassa October 6, 2025 08:39
stack-info: PR: #190, branch: IvanKobzarev/stack/7

@fmassa fmassa left a comment

I don't have all the context on this file yet, but changes LGTM in general.

outputs[idx] += output_partials[idx]
out = outputs[idx]
if first:
    torch.ops.aten.mm.out(shard, B_shards[idx][rank], **kwargs, out=out)
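The snippet under review is shown without its surrounding loop. As a rough illustration of the idea in the PR title, here is a minimal sketch, assuming the operands are sharded along the contraction dimension so that the full product is a sum of per-shard products. The names `accumulate_matmul`, `a_shards`, and `b_shards` are hypothetical and do not come from the PR; only `torch.mm(..., out=...)` and `Tensor.addmm_` are real PyTorch calls.

```python
import torch


def accumulate_matmul(a_shards, b_shards):
    """Compute sum_k a_shards[k] @ b_shards[k] into a single buffer.

    Hypothetical helper for illustration only, not the PR's code.
    """
    rows = a_shards[0].shape[0]
    cols = b_shards[0].shape[1]
    out = torch.empty(rows, cols, dtype=a_shards[0].dtype)
    first = True
    for a_k, b_k in zip(a_shards, b_shards):
        if first:
            # First partial product: write directly into the buffer,
            # so no zero-initialization is needed.
            torch.mm(a_k, b_k, out=out)
            first = False
        else:
            # Later partial products: out = out + a_k @ b_k, in place,
            # without materializing a separate partial tensor.
            out.addmm_(a_k, b_k)
    return out


torch.manual_seed(0)
A = torch.randn(4, 6, dtype=torch.float64)
B = torch.randn(6, 3, dtype=torch.float64)

# Split A along its last dim and B along its first dim (assumed layout).
a_shards = A.chunk(3, dim=1)
b_shards = B.chunk(3, dim=0)

result = accumulate_matmul(a_shards, b_shards)
reference = A @ B
```

Under these assumptions, `result` should match `reference` up to floating-point rounding, and the loop avoids the extra temporary that a separate `mm` followed by `+=` would allocate.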

Should we prefer using the torch.mm version instead of the torch.ops.aten.mm version? I'm not sure there is effectively a difference, but maybe for consistency?

IvanKobzarev (Contributor Author) replied:

Yeah, I think there should not be much difference; we can use torch.mm.
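A quick way to check the point made in this exchange is to run both spellings on the same inputs. This is only a sketch for plain CPU tensors in eager mode; it does not say anything about how either call behaves under tracing or inside the PR's code path. The tensor names and shapes are made up for the example.

```python
import torch

torch.manual_seed(0)
a = torch.randn(4, 5)
b = torch.randn(5, 3)

out_public = torch.empty(4, 3)
out_aten = torch.empty(4, 3)

# Public API with an explicit output buffer.
torch.mm(a, b, out=out_public)

# The same operator reached through the aten namespace (the `out` overload).
torch.ops.aten.mm.out(a, b, out=out_aten)

same = torch.allclose(out_public, out_aten)
```

In this setting `same` is expected to be true, which is consistent with preferring `torch.mm` for readability and consistency.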

@eellison eellison left a comment

test?

@IvanKobzarev IvanKobzarev merged commit bd31bea into main Oct 8, 2025
5 of 6 checks passed
IvanKobzarev commented Oct 8, 2025

> test?

Oh, yeah, I want to add an e2e test, but on torchtitan/autoparallel with the asynctp/bucketing/overlap configs, once those configs have landed (pytorch/torchtitan#1838).

eellison commented Oct 8, 2025

Yeah, I thought this was in the pytorch repo at first and more standalone. It's less easy here.
