Adding memory and graph stats (#156) by jaygala223 · Pull Request #1858 · huggingface/optimum-habana

jaygala223 · 2025-03-18T02:48:22Z

Add memory, graph stats
fix import formatting issues
sort imports
sort imports

What does this PR do?

Prints stats such as graph compilation duration, number of HPU graphs, and memory usage at the end of the run.

vidyasiv

Thanks for your PR @jaygala223 !
Could you run the 1 HPU and 8 HPU inference examples from https://github.com/huggingface/optimum-habana/tree/main/examples/image-to-text#inference-with-mixed-precision-bf16 with your changes and paste the results, please?

jaygala223 · 2025-03-21T05:35:57Z

Hi @vidyasiv, thanks for reviewing my PR. I will attach the screenshots and address your comments.

jaygala223 · 2025-03-21T06:32:50Z

Here is a screenshot of what it looks like

vidyasiv · 2025-03-28T17:31:47Z

1 HPU README testing

python3 examples/image-to-text/run_pipeline.py \
    --model_name_or_path meta-llama/Llama-3.2-11B-Vision-Instruct \
    --use_hpu_graphs \
    --bf16 \
    --sdp_on_bf16

Output

============================= HABANA PT BRIDGE CONFIGURATION =========================== 
 PT_HPU_LAZY_MODE = 1
 PT_HPU_RECIPE_CACHE_CONFIG = ,false,1024
 PT_HPU_MAX_COMPOUND_OP_SIZE = 9223372036854775807
 PT_HPU_LAZY_ACC_PAR_MODE = 1
 PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES = 0
 PT_HPU_EAGER_PIPELINE_ENABLE = 1
 PT_HPU_EAGER_COLLECTIVE_PIPELINE_ENABLE = 1
 PT_HPU_ENABLE_LAZY_COLLECTIVES = 0
---------------------------: System Configuration :---------------------------
Num CPU Cores : 160
CPU RAM       : 1056398140 KB
------------------------------------------------------------------------------
The model 'GaudiMllamaForConditionalGeneration' is not supported for image-to-text. Supported models are ['BlipForConditionalGeneration', 'Blip2ForConditionalGeneration', 'ChameleonForConditionalGeneration', 'GitForCausalLM', 'Idefics2ForConditionalGeneration', 'InstructBlipForConditionalGeneration', 'InstructBlipVideoForConditionalGeneration', 'Kosmos2ForConditionalGeneration', 'LlavaForConditionalGeneration', 'LlavaNextForConditionalGeneration', 'LlavaNextVideoForConditionalGeneration', 'LlavaOnevisionForConditionalGeneration', 'MllamaForConditionalGeneration', 'PaliGemmaForConditionalGeneration', 'Pix2StructForConditionalGeneration', 'Qwen2VLForConditionalGeneration', 'VideoLlavaForConditionalGeneration', 'VipLlavaForConditionalGeneration', 'VisionEncoderDecoderModel'].
/usr/local/lib/python3.10/dist-packages/transformers/generation/configuration_utils.py:601: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.6` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/transformers/generation/configuration_utils.py:606: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.9` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
  warnings.warn(
03/28/2025 17:09:19 - INFO - __main__ - result = [[{'generated_text': 'user\n\nWhat is shown in this image?assistant\n\nThe image depicts a serene lake scene, featuring a long wooden dock extending into the water, surrounded by lush trees and mountains in the background. The dock is made of weathered wooden planks and stretches out into the calm, reflective water, creating a sense of depth and tranquility. The surrounding landscape is characterized by dense green trees and rolling hills, with a majestic mountain range visible in the distance. The sky above is overcast, adding to the peaceful ambiance of the scene. Overall, the image'}]]
03/28/2025 17:09:19 - INFO - __main__ - time = 2097.766735998448ms, Throughput (including tokenization) = 44.33286046731785 tokens/second

Stats:
--------------------------------------------------------------------------------------------------------------

Throughput (including tokenization) = 44.33286046731785 tokens/second
Number of HPU graphs                = 0
Memory allocated                    = 23.42 GB
Max memory allocated                = 23.45 GB
Total memory available              = 94.62 GB
Graph compilation duration          = 10.795492552977521 seconds
--------------------------------------------------------------------------------------------------------------

8 HPU README testing

PT_HPU_ENABLE_LAZY_COLLECTIVES=true python examples/gaudi_spawn.py --use_deepspeed --world_size 8 examples/image-to-text/run_pipeline.py \
    --model_name_or_path meta-llama/Llama-3.2-90B-Vision-Instruct \
    --image_path "https://llava-vl.github.io/static/images/view.jpg" \
    --use_hpu_graphs \
    --bf16 \
    --use_flash_attention \
    --flash_attention_recompute

Output

<snip>
============================= HABANA PT BRIDGE CONFIGURATION =========================== 
 PT_HPU_LAZY_MODE = 1
 PT_HPU_RECIPE_CACHE_CONFIG = ,false,1024
 PT_HPU_MAX_COMPOUND_OP_SIZE = 9223372036854775807
 PT_HPU_LAZY_ACC_PAR_MODE = 0
 PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES = 0
 PT_HPU_EAGER_PIPELINE_ENABLE = 1
 PT_HPU_EAGER_COLLECTIVE_PIPELINE_ENABLE = 1
 PT_HPU_ENABLE_LAZY_COLLECTIVES = 1
---------------------------: System Configuration :---------------------------
Num CPU Cores : 160
CPU RAM       : 1056398140 KB
------------------------------------------------------------------------------

<snip>
The model 'GaudiMllamaForConditionalGeneration' is not supported for image-to-text. Supported models are ['BlipForConditionalGeneration', 'Blip2ForConditionalGeneration', 'ChameleonForConditionalGeneration', 'GitForCausalLM', 'Idefics2ForConditionalGeneration', 'InstructBlipForConditionalGeneration', 'InstructBlipVideoForConditionalGeneration', 'Kosmos2ForConditionalGeneration', 'LlavaForConditionalGeneration', 'LlavaNextForConditionalGeneration', 'LlavaNextVideoForConditionalGeneration', 'LlavaOnevisionForConditionalGeneration', 'MllamaForConditionalGeneration', 'PaliGemmaForConditionalGeneration', 'Pix2StructForConditionalGeneration', 'Qwen2VLForConditionalGeneration', 'VideoLlavaForConditionalGeneration', 'VipLlavaForConditionalGeneration', 'VisionEncoderDecoderModel'].
The model 'GaudiMllamaForConditionalGeneration' is not supported for image-to-text. Supported models are ['BlipForConditionalGeneration', 'Blip2ForConditionalGeneration', 'ChameleonForConditionalGeneration', 'GitForCausalLM', 'Idefics2ForConditionalGeneration', 'InstructBlipForConditionalGeneration', 'InstructBlipVideoForConditionalGeneration', 'Kosmos2ForConditionalGeneration', 'LlavaForConditionalGeneration', 'LlavaNextForConditionalGeneration', 'LlavaNextVideoForConditionalGeneration', 'LlavaOnevisionForConditionalGeneration', 'MllamaForConditionalGeneration', 'PaliGemmaForConditionalGeneration', 'Pix2StructForConditionalGeneration', 'Qwen2VLForConditionalGeneration', 'VideoLlavaForConditionalGeneration', 'VipLlavaForConditionalGeneration', 'VisionEncoderDecoderModel'].
<snip>

/usr/local/lib/python3.10/dist-packages/transformers/generation/configuration_utils.py:606: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.9` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
  warnings.warn(
....
<snip>
03/28/2025 17:24:19 - INFO - __main__ - result = [[{'generated_text': 'user\n\nWhat is shown in this image?assistant\n\nThe image depicts a serene lake scene, with a wooden dock extending into the water. The dock is made of light-colored wood and features a railing on either side, although it appears to be missing in some areas. It stretches out from the foreground towards the background, where it meets a larger platform or dock.\n\nIn the background, there are trees lining the shore, and a mountain range can be seen in the distance. The sky above is overcast, with clouds covering most of the sun. The'}]]
03/28/2025 17:24:19 - INFO - __main__ - time = 4694.892791402526ms, Throughput (including tokenization) = 19.8087590349891 tokens/second

Stats:
-------------------------------------------------------------------------------------------------------------

Throughput (including tokenization) = 19.8087590349891 tokens/second
Number of HPU graphs                = 0
Memory allocated                    = 27.34 GB
Max memory allocated                = 28.22 GB
Total memory available              = 94.62 GB
Graph compilation duration          = 20.29410206899047 seconds
-------------------------------------------------------------------------------------------------------------

vidyasiv

@regisss please review if output is acceptable.

libinta · 2025-04-16T16:21:02Z

+    stats = ""
+    stats = stats + f"\nThroughput (including tokenization) = {throughput} tokens/second"
+    stats = stats + f"\nNumber of HPU graphs                = {count_hpu_graphs()}"
+    separator = "-" * len(stats)


@jaygala223 why don't we use https://docs.habana.ai/en/latest/PyTorch/Reference/Python_Packages.html?highlight=memory%20stat#metric-apis?

jaygala223 · 2025-04-16

Hi @libinta, thanks for the review. I have not used this before; for this PR I took the following as a reference:

optimum-habana/examples/text-generation/run_generation.py

Line 805 in 231e923

stats = f"Throughput (including tokenization) = {throughput} tokens/second"
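
For illustration, the stats-block pattern discussed here can be sketched in plain Python. This is a minimal sketch, not the code merged in this PR: `format_stats` and the sample rows are hypothetical, with values copied from the 1 HPU output above. The diff sizes the separator from the length of the stats string accumulated so far. This version instead sizes it from the longest rendered line and pads every label to a common width.

```python
def format_stats(rows):
    """Render (label, value) pairs as an aligned block framed by separator rules.

    Hypothetical helper for illustration only; not part of optimum-habana.
    """
    # Pad every label to the width of the longest one so the "=" signs line up.
    width = max(len(label) for label, _ in rows)
    lines = [f"{label:<{width}} = {value}" for label, value in rows]
    # Size the rule from the longest rendered line, not the whole string.
    separator = "-" * max(len(line) for line in lines)
    return "\n".join(["Stats:", separator, *lines, separator])


if __name__ == "__main__":
    # Sample values copied from the 1 HPU log in this thread.
    rows = [
        ("Throughput (including tokenization)", "44.33286046731785 tokens/second"),
        ("Number of HPU graphs", "0"),
        ("Memory allocated", "23.42 GB"),
        ("Max memory allocated", "23.45 GB"),
        ("Total memory available", "94.62 GB"),
    ]
    print(format_stats(rows))
```

Collecting the rows first and formatting them in one place keeps the alignment independent of how many stats are reported.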

regisss · 2025-04-17T15:52:31Z

@jaygala223 Can you merge the main branch into yours to make sure it's up to date, please? The doc build workflow failed because of this.

jaygala223 · 2025-04-17T15:55:23Z

Sure

HuggingFaceDocBuilderDev · 2025-04-17T16:25:53Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

regisss

LGTM

Adding memory and graph stats (#156)

4acbee4

jaygala223 requested a review from regisss as a code owner March 18, 2025 02:48

libinta added the synapse 1.21 label Mar 19, 2025

vidyasiv suggested changes Mar 20, 2025

Comment thread examples/image-to-text/run_pipeline.py

Comment thread examples/image-to-text/run_pipeline.py Outdated

remove unnecessary if check

f5181e5

vidyasiv approved these changes Mar 28, 2025

libinta reviewed Apr 16, 2025

Merge branch 'huggingface:main' into auto-pr-1a05f01

8bc9fc5

regisss approved these changes Apr 17, 2025

regisss merged commit 2e30261 into huggingface:main Apr 17, 2025

regisss merged 3 commits into huggingface:main from HabanaAI:auto-pr-1a05f01
