[TorchAcc] Update padding strategy when using persistent cache #2464
Conversation
swift/torchacc_utils.py (outdated)
@@ -30,6 +30,15 @@ def get_bucket_sizes(max_length: int) -> List[int]:
     if os.getenv('TORCHACC_DATA_BUCKETS') is not None:
         bucket_sizes = [int(x) for x in os.getenv('TORCHACC_DATA_BUCKETS').split(',')]
         bucket_sizes.append(max_length)
+    elif os.getenv('TORCHACC_CACHE_PATH') is not None:  # padding strategy when persistent cache is enabled
+        p = 1.4
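For context, the added branch appears to derive bucket sizes as a geometric progression with ratio `p` (1.4 in this patch) up to `max_length`. The sketch below illustrates that kind of geometric bucketing under stated assumptions; the helper name and the starting bucket size of 16 are illustrative, not the PR's actual code:

```python
import math
from typing import List


def geometric_bucket_sizes(max_length: int, p: float = 1.4) -> List[int]:
    """Illustrative sketch: bucket sizes growing by factor p up to max_length.

    Padding every batch to one of these sizes bounds the number of distinct
    tensor shapes the compiler sees, so a persistent compilation cache
    stays small and keeps hitting.
    """
    sizes: List[int] = []
    size = 16  # assumed starting bucket; the actual patch may use a different base
    while size < max_length:
        sizes.append(size)
        size = math.ceil(size * p)
    sizes.append(max_length)
    return sizes
```

With `max_length=100` this yields a short, strictly increasing list ending in 100, which is the shape-bounding property the PR relies on.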
Maybe we could add an environment variable for `p`.
Fixed.
swift/torchacc_utils.py (outdated)
@@ -30,6 +30,15 @@ def get_bucket_sizes(max_length: int) -> List[int]:
     if os.getenv('TORCHACC_DATA_BUCKETS') is not None:
         bucket_sizes = [int(x) for x in os.getenv('TORCHACC_DATA_BUCKETS').split(',')]
         bucket_sizes.append(max_length)
+    elif os.getenv('TORCHACC_CACHE_PATH') is not None:  # padding strategy when persistent cache is enabled
Could we replace the `else` block with this logic as the default bucketing strategy?
Fixed.
Force-pushed from 65df450 to 7b22094
Force-pushed from 8f81c68 to ebf50a2
LGTM
Force-pushed from ebf50a2 to 18c1674
…actor3

* commit '2bbc325ca789592197d2004bb0ffc47cc39c0317': (140 commits)
  fix
  fix
  update safe_ddp_context
  fix
  fix
  update row_processor
  support glm-edge & glm-edge-v (#2526)
  fix open-o1
  support qwq-32b-preview (#2520)
  support mPLUG-Owl3 241101 (#2515)
  fix latex-ocr (#2510)
  support batch flattening collator (#2499)
  fix eval_dataset no (#2497)
  Support marco o1 (#2496)
  Fix preprocess num proc (#2492)
  fix awq quant device_map (#2488)
  Update Common QA (#2475)
  fix kto (#2478)
  update padding strategy for persistent cache (#2464)
  fix qwen2vl pt infer (#2463)
  ...

# Conflicts:
#   docs/source/Instruction/命令行参数.md
#   docs/source/LLM/人类偏好对齐训练文档.md
#   docs/source/Multi-Modal/index.md
#   docs/source/Multi-Modal/qwen2-vl最佳实践.md
#   docs/source/Multi-Modal/人类偏好对齐训练文档.md
#   docs/source_en/Instruction/Command-line-parameters.md
#   docs/source_en/Instruction/Common-QA.md
#   docs/source_en/LLM/Human-Preference-Alignment-Training-Documentation.md
#   docs/source_en/Multi-Modal/qwen2-vl-best-practice.md
PR type
PR information
Optimize the padding strategy when the persistent cache is enabled, so we gain a performance boost with little extra compilation.
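The gain presumably comes from padding each batch to the smallest bucket that fits it: the compiler then sees only a bounded set of shapes, so programs in the persistent cache are reused instead of recompiled. A minimal illustration of that lookup (the function name and bucket values here are hypothetical, not taken from the PR):

```python
from typing import List


def pad_to_bucket(seq_len: int, bucket_sizes: List[int]) -> int:
    """Return the smallest bucket that fits seq_len.

    Padding to this length keeps the set of compiled shapes small,
    so a persistent compilation cache keeps hitting.
    """
    for b in sorted(bucket_sizes):
        if b >= seq_len:
            return b
    return max(bucket_sizes)  # fall back to the largest bucket


# Example: varying batch lengths collapse onto a few fixed shapes.
buckets = [16, 23, 33, 47, 66, 93, 128]
padded = [pad_to_bucket(n, buckets) for n in (10, 20, 50, 100)]  # [16, 23, 66, 128]
```

Four different sequence lengths map onto four cached shapes here; without bucketing, each distinct length would trigger its own compilation.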
Experiment results
Around a 10% end-to-end performance improvement for the TorchAcc backend.