
Onboarding LLAMA3 70B LoRa to B300 and B200 chips#2397

Merged
malay-nagda merged 8 commits into main from rmukundan/onboard-llama3-lora-b200-b300
Feb 27, 2026

Conversation

@rhmukundan
Contributor

@rhmukundan rhmukundan commented Feb 16, 2026

Summary by CodeRabbit

New Features

  • Added Llama3 70B LoRA configuration support for B200 and B300 GPUs
  • Available precision variants: BF16, FP8 Compressed Scaling, and FP8 MX formats
  • Configurations include optimized parallelism and transformer engine settings

Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
@copy-pr-bot

copy-pr-bot bot commented Feb 16, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@rhmukundan rhmukundan self-assigned this Feb 16, 2026
@rhmukundan rhmukundan requested a review from erhoo82 February 17, 2026 00:13
@rhmukundan rhmukundan marked this pull request as ready for review February 17, 2026 00:13
@coderabbitai
Contributor

coderabbitai bot commented Feb 17, 2026

📝 Walkthrough

Walkthrough

Adds B200 and B300 GPU variant support for Llama3 70B LoRA fine-tuning configurations. Introduces two new configuration builder functions, base configuration constants with multiple precision variants (BF16, FP8 CS, FP8 MX), and corresponding public API exports across the configuration module hierarchy.

Changes

  • Workload Base Configurations — scripts/performance/configs/llama/llama3_workload_base_configs.py: adds private base configs (_LLAMA3_70B_LORA_CONFIG_B300, _LLAMA3_70B_LORA_CONFIG_B200) with LoRA and transformer_engine settings, plus six public variants (three each for B300 and B200) covering the BF16, FP8_CS, and FP8_MX precision types.
  • Finetune Configuration Functions — scripts/performance/configs/llama/llama3_llm_finetune.py: implements two new configuration builder functions (llama3_70b_lora_config_b300, llama3_70b_lora_config_b200) that construct ConfigContainer objects with LoRA settings, packed sequence support, and target_modules set to ["linear_qkv"].
  • Module Exports — scripts/performance/configs/llama/__init__.py: exposes the new B200 LoRA function and eight new public constants (B300 and B200 variants) on the public API surface when Megatron Bridge is available.
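To make the shape of the builder functions concrete, here is a minimal sketch of what a config builder like llama3_70b_lora_config_b200 might look like. This is illustrative only: ConfigContainer, LoRASettings, the field names, and the default rank are assumptions standing in for the real Megatron-Bridge types; only target_modules=["linear_qkv"] and packed-sequence support come from the PR description.

```python
# Illustrative sketch only: ConfigContainer and LoRASettings are stand-ins
# for the real Megatron-Bridge types; field names and defaults are assumed.
from dataclasses import dataclass, field
from typing import List


@dataclass
class LoRASettings:
    # Modules to which LoRA adapters are attached; per the PR, only the
    # fused query/key/value projection is targeted.
    target_modules: List[str] = field(default_factory=lambda: ["linear_qkv"])
    rank: int = 8  # hypothetical default, not from the PR


@dataclass
class ConfigContainer:
    model_name: str
    gpu: str
    precision: str
    lora: LoRASettings
    packed_sequences: bool


def llama3_70b_lora_config_b200(precision: str = "bf16") -> ConfigContainer:
    """Build a B200 LoRA fine-tuning config (shape only; values are placeholders)."""
    return ConfigContainer(
        model_name="llama3_70b",
        gpu="b200",
        precision=precision,
        lora=LoRASettings(),
        packed_sequences=True,
    )


cfg = llama3_70b_lora_config_b200("fp8_cs")
print(cfg.lora.target_modules)  # ['linear_qkv']
```

The per-GPU builders then differ mainly in parallelism and transformer-engine settings, while sharing the same LoRA targeting.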

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

performance

Suggested reviewers

  • erhoo82
  • malay-nagda
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Test Results For Major Changes — ⚠️ Warning: the PR introduces B200/B300 LoRA configurations without test results, performance benchmarks, or convergence validation, and the B300 constants are missing from the __init__.py imports. Resolution: include test results validating the B200/B300 LoRA configurations and add the missing B300 constant imports to __init__.py.

✅ Passed checks (4 passed)

  • Description Check — ✅ Passed: check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check — ✅ Passed: the title accurately reflects the main changes: adding Llama3 70B LoRA configurations for B300 and B200 chips across multiple configuration files.
  • Docstring Coverage — ✅ Passed: docstring coverage is 100.00%, above the required threshold of 80.00%.
  • Merge Conflict Detection — ✅ Passed: no merge conflicts detected when merging into main.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
scripts/performance/configs/llama/__init__.py (3)

208-219: ⚠️ Potential issue | 🟠 Major

Missing B300 LoRA constants in __all__.

Only the B200 LoRA constants are added to __all__ (lines 212-214). The B300 LoRA constants should also be listed here for a complete public API surface.

Proposed fix
     "LLAMA3_70B_LORA_CONFIG_B200_BF16_V1",
     "LLAMA3_70B_LORA_CONFIG_B200_FP8_CS_V1",
     "LLAMA3_70B_LORA_CONFIG_B200_FP8_MX_V1",
+    "LLAMA3_70B_LORA_CONFIG_B300_BF16_V1",
+    "LLAMA3_70B_LORA_CONFIG_B300_FP8_CS_V1",
+    "LLAMA3_70B_LORA_CONFIG_B300_FP8_MX_V1",
     "LLAMA3_70B_LORA_CONFIG_GB300_BF16_V1",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/performance/configs/llama/__init__.py` around lines 208 - 219, The
__all__ list is missing the B300 LoRA constant names; update the module's export
list (the __all__ variable) to include the B300 LoRA symbols corresponding to
the GB300 entries—add "LLAMA3_70B_LORA_CONFIG_B300_BF16_V1",
"LLAMA3_70B_LORA_CONFIG_B300_FP8_CS_V1", and
"LLAMA3_70B_LORA_CONFIG_B300_FP8_MX_V1" to __all__ alongside the existing B200
and GB300 entries so the B300 configurations are part of the public API.

269-296: ⚠️ Potential issue | 🟠 Major

Missing llama3_70b_lora_config_b300 in the dynamic __all__ extension.

llama3_70b_lora_config_b200 is added at line 286, but llama3_70b_lora_config_b300 is missing from this list.

Proposed fix
             "llama3_70b_lora_config_b200",
+            "llama3_70b_lora_config_b300",
             "llama3_70b_lora_config_gb200",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/performance/configs/llama/__init__.py` around lines 269 - 296, The
__all__ extension guarded by HAVE_MEGATRON_BRIDGE is missing the symbol
"llama3_70b_lora_config_b300"; update the list in the block that extends __all__
(the array containing "llama3_70b_lora_config_b200",
"llama3_70b_lora_config_gb200", "llama3_70b_lora_config_gb300",
"llama3_70b_lora_config_h100") to include "llama3_70b_lora_config_b300" so the
exported names match the defined configs.

9-19: ⚠️ Potential issue | 🟠 Major

Missing import and export of llama3_70b_lora_config_b300.

The B300 LoRA config function is defined in llama3_llm_finetune.py but is not imported here. Only llama3_70b_lora_config_b200 was added (line 12), while llama3_70b_lora_config_b300 was omitted. This means B300 LoRA won't be accessible via this package.

Proposed fix
     from .llama3_llm_finetune import (
         llama3_8b_sft_config_gb200,
         llama3_8b_sft_config_h100,
         llama3_70b_lora_config_b200,
+        llama3_70b_lora_config_b300,
         llama3_70b_lora_config_gb200,
         llama3_70b_lora_config_gb300,
         llama3_70b_lora_config_h100,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/performance/configs/llama/__init__.py` around lines 9 - 19, The
package export is missing the llama3_70b_lora_config_b300 symbol; add
llama3_70b_lora_config_b300 to the import list in __init__.py (alongside
llama3_70b_lora_config_b200) so it is exported by the package; if there is an
__all__ or any exported names list in this module, include
llama3_70b_lora_config_b300 there as well to make the B300 LoRA config
accessible.
🤖 Fix all issues with AI agents
Verify each finding against the current code and only fix it if needed.


In `@scripts/performance/configs/llama/__init__.py`:
- Around line 64-72: The file is missing imports for the B300 LoRA constants
used in the module; add imports for LLAMA3_70B_LORA_CONFIG_B300_BF16_V1,
LLAMA3_70B_LORA_CONFIG_B300_FP8_CS_V1, and LLAMA3_70B_LORA_CONFIG_B300_FP8_MX_V1
from .llama3_workload_base_configs so the constants referenced in the exported
list (e.g., LLAMA3_70B_LORA_CONFIG_GB300_BF16_V1,
LLAMA3_70B_LORA_CONFIG_B300_FP8_MX_V1) are actually defined; update the same
import statement that currently imports the B200 constants to include these
three B300 symbols.

Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
@rhmukundan rhmukundan added the r0.3.0 Cherry-pick label for r0.3.0 release branch label Feb 17, 2026
malay-nagda
malay-nagda previously approved these changes Feb 17, 2026
yaoyu-33
yaoyu-33 previously approved these changes Feb 23, 2026
@ko3n1g ko3n1g mentioned this pull request Feb 24, 2026
@malay-nagda
Contributor

/ok to test 13ee9f6

Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
@rhmukundan rhmukundan dismissed stale reviews from yaoyu-33 and malay-nagda via dd660c4 February 27, 2026 06:20
@rhmukundan rhmukundan force-pushed the rmukundan/onboard-llama3-lora-b200-b300 branch from 13ee9f6 to dd660c4 on February 27, 2026 06:20
@malay-nagda
Contributor

/ok to test dd660c4

@malay-nagda malay-nagda self-requested a review February 27, 2026 06:40
malay-nagda pushed a commit that referenced this pull request Feb 27, 2026
Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
malay-nagda added a commit that referenced this pull request Feb 27, 2026
Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
Co-authored-by: Raghav Hrishikeshan Mukundan <102543536+rhmukundan@users.noreply.github.com>
kevalmorabia97 added a commit that referenced this pull request Feb 27, 2026
  • b162358 merged yesterday and removed llama3_70b_finetune_config from that file.
  • #2397 merged today but came from a stale branch that still assumed llama3_70b_finetune_config was imported in that file.

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
copy-pr-bot bot pushed a commit that referenced this pull request Mar 19, 2026
Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
