Enable fused rmsnorm in bf16 for llama by puneeshkhanna · Pull Request #621 · huggingface/optimum-habana

puneeshkhanna · 2024-01-03T10:16:56Z

What does this PR do?

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

puneeshkhanna · 2024-01-03T10:19:54Z

@regisss - please review. Fused rmsnorm can be enabled in lower precision (bf16) as well, and doing so gives a performance boost.
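For readers unfamiliar with the operation, here is a minimal, unfused reference for what RMSNorm computes on a single hidden-state vector. It is a pure-Python sketch for illustration only, not the Habana fused kernel or the code changed in this PR; the function name `rms_norm` and the default `eps` are assumptions chosen for the example.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Unfused reference RMSNorm over one hidden-state vector.

    `eps` here is an illustrative default, not a value taken from this PR.
    """
    # Mean of squares over the hidden dimension.
    mean_sq = sum(v * v for v in x) / len(x)
    # Reciprocal root-mean-square, with eps added for numerical stability.
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    # Normalize each element, then apply the learned per-channel scale.
    return [v * inv_rms * w for v, w in zip(x, weight)]


if __name__ == "__main__":
    print(rms_norm([3.0, 4.0], [1.0, 1.0], eps=0.0))
```

A fused kernel performs the square, mean, rsqrt and scale steps in one device op instead of several, which is presumably where the speedup reported in this PR comes from.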

Command (--max_new_tokens and --batch_size were varied per run, as listed in the table below): python ../gaudi_spawn.py --use_deepspeed --world_size 8 run_generation.py --model_name_or_path /software/data/llama_inference/Llama-2-70b-hf/ --max_new_tokens ?? --bf16 --n_iterations 3 --use_hpu_graphs --use_kv_cache --batch_size ?? --reuse_cache --limit_hpu_graphs --trim_logits --warmup 2 --attn_softmax_bf16

See the table below for the improved perf results:

Model	BS	Max new tokens	Default perf	Perf with rmsnorm fix	% improvement over default perf
70B 8x	1	100	55.7	57.995	3.68
70B 8x	40	100	1893.22	1969.95	3.5
70B 8x	1	2048	60.53	62.19	2.58
70B 8x	40	2048	1686.35	1726.47	2.5
70B 8x	60	2048	2207.2	2256.79	2.2
70B 8x	1	4096	59.825	61.4	2.48
70B 8x	40	4096	1366.72	1393.36	1.95
70B 8x	60	4096	1688.21	1716.64	1.74
7B 1x	1	4096	124.147	126.8	2.15
7B 1x	4	4096	354.48	357.28	0.8
13B 1x	1	4096	68.86	69.79	1.37
13B 1x	4	4096	203.77	204.89	0.56
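
The improvement column can be cross-checked from the two perf columns. The sketch below assumes the percentage is the plain relative gain, (new - default) / default * 100; the PR does not state the formula used, and the helper name `pct_improvement` is hypothetical.

```python
def pct_improvement(default_perf, new_perf):
    """Relative gain of new_perf over default_perf, in percent."""
    return (new_perf - default_perf) / default_perf * 100.0


# Two rows copied from the table above:
# (model, batch size, max new tokens, default perf, perf with rmsnorm fix)
rows = [
    ("70B 8x", 40, 4096, 1366.72, 1393.36),
    ("7B 1x", 4, 4096, 354.48, 357.28),
]

if __name__ == "__main__":
    for model, bs, max_new_tokens, default_perf, new_perf in rows:
        gain = pct_improvement(default_perf, new_perf)
        print(f"{model} BS={bs} max_new_tokens={max_new_tokens}: {gain:.2f}%")
```

For these two rows the recomputed gain agrees with the reported column to within rounding. Not every row does: the first 70B row, for example, recomputes to roughly 4.1% against a reported 3.68, so the reported percentages may have been derived from different runs or averages than the perf values shown.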

bgoldberg-habana · 2024-01-03T10:23:52Z

LGTM, was also verified in FP8 runs.

HuggingFaceDocBuilderDev · 2024-01-03T10:26:20Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

regisss

Nice! Does it generate the same outputs as before?
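
One way to answer this kind of question is to save the generated token IDs from a run before the change and a run after it, then diff them per sample. The following is a minimal sketch under the assumption that decoding is deterministic (for example greedy) and that both runs used the same prompts; `first_mismatch` and `compare_runs` are hypothetical helpers, not part of run_generation.py, and the token IDs are placeholders.

```python
def first_mismatch(before, after):
    """Return the index of the first differing token, or None if identical."""
    for i, (a, b) in enumerate(zip(before, after)):
        if a != b:
            return i
    # One sequence is a strict prefix of the other.
    if len(before) != len(after):
        return min(len(before), len(after))
    return None


def compare_runs(before_runs, after_runs):
    """Map sample index -> first mismatch position, for samples that differ."""
    diffs = {}
    for idx, (before, after) in enumerate(zip(before_runs, after_runs)):
        pos = first_mismatch(before, after)
        if pos is not None:
            diffs[idx] = pos
    return diffs


if __name__ == "__main__":
    # Placeholder token IDs, not real model output.
    baseline = [[1, 2, 3], [4, 5, 6]]
    candidate = [[1, 2, 3], [4, 9, 6]]
    print(compare_runs(baseline, candidate))
```

An empty result means the two runs produced identical sequences. With lower-precision kernels, small numerical differences can change a token late in a long generation, so the mismatch position is often more informative than a plain equal/not-equal check.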

mandy-li · 2024-01-03T15:08:57Z

@regisss, let me check finetuning as well, for both perf and accuracy.

puneeshkhanna · 2024-01-04T04:29:24Z

@mandy-li - thanks for checking finetuning. I assume there were no issues there either. Did perf improve in finetuning as well?
@regisss - thanks for merging.

Enable fused rmsnorm in bf16

9217a4d

puneeshkhanna requested review from libinta and mandy-li as code owners January 3, 2024 10:16

puneeshkhanna requested a review from a user January 3, 2024 10:16

regisss added the run-test label (Run CI for PRs from external contributors) Jan 3, 2024

puneeshkhanna changed the title from "Enable fused rmsnorm in bf16" to "Enable fused rmsnorm in bf16 for llama" Jan 3, 2024

bgoldberg-habana self-requested a review January 3, 2024 10:24

bgoldberg-habana approved these changes Jan 3, 2024

ghost approved these changes Jan 3, 2024

regisss approved these changes Jan 3, 2024

mandy-li approved these changes Jan 3, 2024

regisss merged commit b72d8ea into huggingface:main Jan 3, 2024

puneeshkhanna deleted the rmsnorm_bf16 branch January 4, 2024 04:29

MrGeva pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Feb 4, 2024

Revert "Enable fused rmsnorm in bf16 for llama (huggingface#621)"

c6f086d

This reverts commit b72d8ea.

jychen21 pushed a commit to jychen21/optimum-habana that referenced this pull request Feb 27, 2024

Enable fused rmsnorm in bf16 for llama (huggingface#621)

9ab63ce
