[Feature Request]: Support logging MFU as a metrics #2100

Open
xingyaoww opened this issue Dec 2, 2024 · 0 comments

Comments

@xingyaoww
Contributor

It would be helpful if torchtune could log MFU (Model FLOPs Utilization) as a metric, so that users can measure how much of the GPU's peak compute a training run actually uses.

xingyaoww pushed a commit to xingyaoww/torchtune that referenced this issue Dec 3, 2024
- Add MFU utility module with functions to calculate:
  - Theoretical peak FLOPS for GPU
  - Actual MFU percentage
  - Model FLOPs for one forward pass
- Modify LoRAFinetuneRecipeDistributed to:
  - Calculate model FLOPs after initialization
  - Log MFU alongside other metrics in training loop
- Add fvcore as a dev dependency for FLOPs calculation

Closes pytorch#2100
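
For readers unfamiliar with the metric, here is a minimal sketch of how an MFU percentage is commonly derived from the quantities listed in the commit message. It is not the code from the referenced commit: the function name `compute_mfu`, its parameters, and the default `backward_multiplier` of 2.0 (the usual rule of thumb that a backward pass costs about twice the forward pass) are all assumptions made for illustration. With LoRA, where most weights are frozen, the real backward cost may be lower.

```python
def compute_mfu(
    forward_flops_per_step: float,
    step_time_s: float,
    peak_flops_per_gpu: float,
    num_gpus: int = 1,
    backward_multiplier: float = 2.0,  # assumption: backward ~= 2x forward
) -> float:
    """Return MFU as a percentage (hypothetical helper, not torchtune API).

    forward_flops_per_step: FLOPs for one forward pass over the global batch.
    step_time_s: measured wall-clock time of one training step, in seconds.
    peak_flops_per_gpu: theoretical peak FLOPS of one GPU at the training dtype.
    """
    if step_time_s <= 0:
        raise ValueError("step_time_s must be positive")
    if peak_flops_per_gpu <= 0 or num_gpus <= 0:
        raise ValueError("peak_flops_per_gpu and num_gpus must be positive")

    # Total model FLOPs for the step: forward plus estimated backward.
    flops_per_step = forward_flops_per_step * (1.0 + backward_multiplier)

    # FLOPs the job actually sustained per second.
    achieved_flops_per_s = flops_per_step / step_time_s

    # Aggregate theoretical peak across all participating GPUs.
    peak_flops_per_s = peak_flops_per_gpu * num_gpus

    return 100.0 * achieved_flops_per_s / peak_flops_per_s


if __name__ == "__main__":
    # Example values only: 312e12 is the published dense BF16 peak of an
    # NVIDIA A100; the forward FLOPs and step time are made up.
    mfu = compute_mfu(
        forward_flops_per_step=4.0e13,
        step_time_s=1.0,
        peak_flops_per_gpu=312e12,
    )
    print(f"MFU: {mfu:.1f}%")
```

In a recipe, the forward FLOPs would come from a FLOP counter (the commit mentions fvcore) run once after model initialization, and the step time from the timer already used for throughput logging.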