Support computing parameter count in ModelSummary for FSDP models #20151

Closed
awaelchli opened this issue Aug 2, 2024 · 0 comments · Fixed by #20163
Labels
callback: model summary, feature (Is an improvement or enhancement), strategy: fsdp (Fully Sharded Data Parallel)
Milestone

Comments

Contributor

awaelchli commented Aug 2, 2024

Description & Motivation

Models that are set up with FSDP (or DTensor) do not show the total parameter count in the ModelSummary.

Pitch

Compute the parameter shapes correctly for these models (similar to the DeepSpeed summary), so that the total parameter count is reported.
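
To illustrate the idea, here is a minimal sketch in plain Python of counting parameters by their unsharded shape rather than by the local shard. Everything in it is hypothetical: `FakeParam`, `full_shape`, `full_numel`, and `total_parameters` are illustrative names, not Lightning or PyTorch APIs. It assumes the sharding wrapper records the unsharded shape somewhere on the parameter, which, as far as I understand, is what DeepSpeed ZeRO stage 3 does through attributes such as `ds_shape` and `ds_numel`. How FSDP or DTensor expose this is not covered here.

```python
from dataclasses import dataclass
from math import prod
from typing import Iterable, Optional, Tuple


@dataclass
class FakeParam:
    """Hypothetical stand-in for a parameter; not a real torch or Lightning class."""

    local_shape: Tuple[int, ...]                  # shape of the shard held on this rank
    full_shape: Optional[Tuple[int, ...]] = None  # unsharded shape, if the wrapper records it

    def numel(self) -> int:
        # What a naive summary would see: only the elements of the local shard.
        return prod(self.local_shape)


def full_numel(param: FakeParam) -> int:
    """Prefer the recorded unsharded shape; fall back to the local element count."""
    if param.full_shape is not None:
        return prod(param.full_shape)
    return param.numel()


def total_parameters(params: Iterable[FakeParam]) -> int:
    """Sum the unsharded element counts over all parameters."""
    return sum(full_numel(p) for p in params)


# Made-up example values, chosen only to show the difference.
params = [
    FakeParam(local_shape=(256, 64), full_shape=(1024, 64)),  # weight split across 4 ranks
    FakeParam(local_shape=(0,), full_shape=(1024,)),          # bias with an empty local shard
    FakeParam(local_shape=(10,)),                             # parameter that is not sharded
]

naive_total = sum(p.numel() for p in params)
corrected_total = total_parameters(params)
print(f"naive: {naive_total}, corrected: {corrected_total}")
```

The fallback to `numel()` keeps the count unchanged for models that are not sharded, so the same code path could serve both cases.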

Alternatives

No response

Additional context

No response

cc @Borda @awaelchli @carmocca

@awaelchli added the labels feature, needs triage, callback: model summary, and strategy: fsdp, and removed the label needs triage on Aug 2, 2024
@awaelchli added this to the future milestone on Aug 2, 2024