
Add MFU (Model FLOPs Utilization) logging #2

Closed
Conversation

xingyaoww
Owner

This PR implements MFU (Model FLOPs Utilization) logging as requested in pytorch#2100.

Changes

  1. Added an MFU utility module (`torchtune/utils/mfu.py`) with functions to:

    • Calculate the GPU's theoretical peak FLOPS
    • Calculate the achieved MFU percentage
    • Calculate the model's FLOPs for one forward pass
  2. Modified `LoRAFinetuneRecipeDistributed` to:

    • Calculate model FLOPs after initialization
    • Log MFU alongside other metrics in the training loop
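The utility functions above could be sketched roughly as follows. Function names, the peak-FLOPS table, and the 2-FLOPs-per-parameter-per-token forward-pass estimate are illustrative assumptions, not torchtune's actual implementation:

```python
# Hypothetical sketch of mfu.py-style helpers (names and values assumed).

def get_peak_flops(device_name: str) -> float:
    """Theoretical peak bf16 FLOPS for a few common GPUs (dense, from vendor specs)."""
    peak = {
        "a100": 312e12,  # NVIDIA A100 bf16
        "h100": 989e12,  # NVIDIA H100 SXM bf16
    }
    for key, flops in peak.items():
        if key in device_name.lower():
            return flops
    raise ValueError(f"unknown device: {device_name}")

def model_forward_flops(num_params: int, num_tokens: int) -> float:
    """Approximate FLOPs for one forward pass: ~2 FLOPs per parameter per token."""
    return 2.0 * num_params * num_tokens

def compute_mfu(flops_per_step: float, step_time_s: float, peak_flops: float) -> float:
    """Achieved FLOPS as a percentage of the hardware's theoretical peak."""
    return 100.0 * (flops_per_step / step_time_s) / peak_flops
```

For example, a step that achieves half of an A100's bf16 peak would report `compute_mfu(156e12, 1.0, 312e12) == 50.0`.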

The MFU metric is now logged with the same frequency as other metrics (controlled by `log_every_n_steps`) and is available in all supported logging backends (terminal, disk, WandB, TensorBoard).
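The training-loop integration might look like the excerpt below. This is a stand-alone sketch: the model size, token count, peak-FLOPS value, and the commented-out logger call are all assumptions, not the recipe's real API, and the `time.sleep` stands in for the forward/backward/optimizer work:

```python
# Hypothetical excerpt of the training-loop change: MFU is computed from the
# measured step time and logged at the log_every_n_steps cadence.
import time

PEAK_FLOPS = 312e12                      # assumed A100 bf16 peak
FLOPS_PER_STEP = 6.0 * 7e9 * 2048        # ~6 FLOPs/param/token (fwd+bwd), 7B params, 2048 tokens

def train(num_steps: int = 3, log_every_n_steps: int = 1) -> list:
    logged_mfu = []
    for step in range(1, num_steps + 1):
        t0 = time.perf_counter()
        time.sleep(0.01)                 # stand-in for forward/backward/optimizer work
        step_time = time.perf_counter() - t0
        if step % log_every_n_steps == 0:
            mfu = 100.0 * (FLOPS_PER_STEP / step_time) / PEAK_FLOPS
            logged_mfu.append(mfu)       # real recipe would call the metric logger here
    return logged_mfu
```

The point of gating on `log_every_n_steps` is that MFU rides along with the existing metric dictionary, so every configured backend receives it without backend-specific changes.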

Closes pytorch#2100

Successfully merging this pull request may close these issues.

[Feature Request]: Support logging MFU as a metrics