Support llama3 #64
Conversation
Thanks for adding llama3 support! Accuracy is critical — can you share the output from run_interactive for both llama2 and llama3?
This is the output for both models: https://gist.github.com/bhavya01/40a344e671a2e5dde980f163141545db
Can you use the output without this PR as the baseline for comparison (it's hard to know whether quality dropped without a baseline)? If possible, can you run both the base and test versions without quantization?
Yes, that makes sense. I did the comparison for LLAMA2 without quantization and both results look odd: https://gist.github.com/bhavya01/660cd636d678f42a01501d093d63c2b1 With quantization, they look very similar. I added the llama2_before output to this gist: https://gist.github.com/bhavya01/40a344e671a2e5dde980f163141545db
Thanks for adding the base vs. test comparison for bfloat16 and int8 quantization. Looks good to me now.
Please fix the check errors, and feel free to merge after that.
The unit tests are failing because we test against JetStream v0.2.0. A new JetStream release is expected this week, after which these tests will pass.
@JoeZijunZhou Hi Zijun, could you let us know when you plan to create a new JetStream release? @bhavya01 Given the current test status, we need to tag the latest JetStream release before submitting this PR.
Thank you @bhavya01 and @FanhaiLu1 ! Here is the release for JetStream: AI-Hypercomputer/JetStream#72
Tested with run_interactive.py
Also ran the benchmark for llama-2 on TPU v4-8 and got the following numbers:
Still need to run the benchmark script in the JetStream repo to get the metrics for llama-3.