
compute quantiles for memory usage #187

Merged
merged 1 commit into linkedin:main on Sep 1, 2024

Conversation

kvignesh1420
Collaborator

@kvignesh1420 kvignesh1420 commented Sep 1, 2024

Summary

This PR adds quantile computations for memory usage during benchmarking.

Details

Currently, the _test_memory function only returns the mean memory usage and does not return quantile data. This PR adds the necessary functionality and keeps the memory stats in sync with the speed-based stats that Triton benchmarks compute.

Env:

torch==2.4.0+cu118
triton==3.0.0
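
As a rough illustration of the change, a benchmark helper that previously returned only the mean can also report quantiles over the per-iteration memory samples. The sketch below is hypothetical (function and key names are illustrative, not the actual liger-kernel code); it mirrors the linear-interpolation quantile behavior that `torch.quantile` uses by default, with the 0.2/0.5/0.8 quantiles Triton's speed benchmarks report.

```python
# Hypothetical sketch: summarize memory samples with mean + quantiles.
# Names (summarize_memory, mem_samples_mb) are illustrative only.

def _compute_quantiles(samples, quantiles):
    """Linear-interpolation quantiles (matches torch.quantile's default)."""
    xs = sorted(samples)
    n = len(xs)
    out = []
    for q in quantiles:
        pos = q * (n - 1)        # fractional index into the sorted samples
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(xs[lo] + (xs[hi] - xs[lo]) * frac)
    return out


def summarize_memory(mem_samples_mb, quantiles=(0.2, 0.5, 0.8)):
    """Return mean plus the requested quantiles of memory usage (MB)."""
    mean = sum(mem_samples_mb) / len(mem_samples_mb)
    q20, q50, q80 = _compute_quantiles(mem_samples_mb, quantiles)
    return {"mean": mean, "p20": q20, "p50": q50, "p80": q80}


if __name__ == "__main__":
    print(summarize_memory([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Reporting the 20th/80th percentiles alongside the median makes it easy to see whether memory usage is stable across iterations or spread out.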

Testing Done

  • Hardware Type: A100-80G-PCIe
  • run make test to ensure correctness
  • run make checkstyle to ensure code style
  • run make test-convergence to ensure convergence tests pass

@kvignesh1420 kvignesh1420 changed the title from "refactor quantile variable usage" to "compute quantiles for memory usage" on Sep 1, 2024
update CE and swiglu benchmarks

add embedding benchmark results

add flce benchmarks

add geglu benchmarks

add layernorm benchmarks

add rms norm benchmarks

add rope benchmarks

lint fixes
@kvignesh1420 kvignesh1420 marked this pull request as ready for review September 1, 2024 20:21
Collaborator

@ByronHsu ByronHsu left a comment


Great! But I believe memory usage doesn't often oscillate.

@kvignesh1420 kvignesh1420 merged commit e8f9d08 into linkedin:main Sep 1, 2024
2 checks passed
@kvignesh1420 kvignesh1420 deleted the add-quartile-mem-data branch September 1, 2024 21:25