
Onboarding LLAMA3 70B LoRa to B300 and B200 chips#2397

Merged
malay-nagda merged 8 commits into main from rmukundan/onboard-llama3-lora-b200-b300
Feb 27, 2026

Conversation

@rhmukundan
Contributor

@rhmukundan rhmukundan commented Feb 16, 2026

Summary by CodeRabbit

New Features

  • Added Llama3 70B LoRA configuration support for B200 and B300 GPUs
  • Available precision variants: BF16, FP8 Compressed Scaling, and FP8 MX formats
  • Configurations include optimized parallelism and transformer engine settings

Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
@copy-pr-bot

copy-pr-bot bot commented Feb 16, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@rhmukundan rhmukundan self-assigned this Feb 16, 2026
@rhmukundan rhmukundan requested a review from erhoo82 February 17, 2026 00:13
@rhmukundan rhmukundan marked this pull request as ready for review February 17, 2026 00:13
@coderabbitai
Contributor

coderabbitai bot commented Feb 17, 2026

📝 Walkthrough

Walkthrough

Adds B200 and B300 GPU variant support for Llama3 70B LoRA fine-tuning configurations. Introduces two new configuration builder functions, base configuration constants with multiple precision variants (BF16, FP8 CS, FP8 MX), and corresponding public API exports across the configuration module hierarchy.

Changes

  • Workload Base Configurations — scripts/performance/configs/llama/llama3_workload_base_configs.py: adds private base configs (_LLAMA3_70B_LORA_CONFIG_B300, _LLAMA3_70B_LORA_CONFIG_B200) with LoRA and transformer_engine settings, plus six public variants (three each for B300 and B200) covering the BF16, FP8_CS, and FP8_MX precision types.
  • Finetune Configuration Functions — scripts/performance/configs/llama/llama3_llm_finetune.py: implements two new configuration builder functions (llama3_70b_lora_config_b300, llama3_70b_lora_config_b200) that construct ConfigContainer objects with LoRA settings, packed sequence support, and target_modules set to ["linear_qkv"].
  • Module Exports — scripts/performance/configs/llama/__init__.py: exposes the new B200 LoRA function and eight new public constants (B300 and B200 variants) on the public API surface when Megatron Bridge is available.
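To make the shape of the builder functions concrete, here is a minimal sketch of what a config builder like llama3_70b_lora_config_b200 might look like. This is illustrative only: ConfigContainer, LoRASettings, the field names, and the default rank are assumptions standing in for the real Megatron-Bridge types; only target_modules=["linear_qkv"] and packed-sequence support come from the PR description.

```python
# Illustrative sketch only: ConfigContainer and LoRASettings are stand-ins
# for the real Megatron-Bridge types; field names and defaults are assumed.
from dataclasses import dataclass, field
from typing import List


@dataclass
class LoRASettings:
    # Modules to which LoRA adapters are attached; per the PR, only the
    # fused query/key/value projection is targeted.
    target_modules: List[str] = field(default_factory=lambda: ["linear_qkv"])
    rank: int = 8  # hypothetical default, not from the PR


@dataclass
class ConfigContainer:
    model_name: str
    gpu: str
    precision: str
    lora: LoRASettings
    packed_sequences: bool


def llama3_70b_lora_config_b200(precision: str = "bf16") -> ConfigContainer:
    """Build a B200 LoRA fine-tuning config (shape only; values are placeholders)."""
    return ConfigContainer(
        model_name="llama3_70b",
        gpu="b200",
        precision=precision,
        lora=LoRASettings(),
        packed_sequences=True,
    )


cfg = llama3_70b_lora_config_b200("fp8_cs")
print(cfg.lora.target_modules)  # ['linear_qkv']
```

The per-GPU builders then differ mainly in parallelism and transformer-engine settings, while sharing the same LoRA targeting.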

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

performance

Suggested reviewers

  • erhoo82
  • malay-nagda
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Test Results For Major Changes — ⚠️ Warning: the PR introduces B200/B300 LoRA configurations without test results, performance benchmarks, or convergence validation, and the B300 constants are missing from the __init__.py imports. Resolution: include test results validating the B200/B300 LoRA configurations and add the missing B300 constant imports to __init__.py.

✅ Passed checks (4 passed)

  • Description Check — ✅ Passed: check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check — ✅ Passed: the title accurately reflects the main changes: adding Llama3 70B LoRA configurations for B300 and B200 chips across multiple configuration files.
  • Docstring Coverage — ✅ Passed: docstring coverage is 100.00%, above the required threshold of 80.00%.
  • Merge Conflict Detection — ✅ Passed: no merge conflicts detected when merging into main.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
scripts/performance/configs/llama/__init__.py (3)

208-219: ⚠️ Potential issue | 🟠 Major

Missing B300 LoRA constants in __all__.

Only the B200 LoRA constants are added to __all__ (lines 212-214). The B300 LoRA constants should also be listed here for a complete public API surface.

Proposed fix
     "LLAMA3_70B_LORA_CONFIG_B200_BF16_V1",
     "LLAMA3_70B_LORA_CONFIG_B200_FP8_CS_V1",
     "LLAMA3_70B_LORA_CONFIG_B200_FP8_MX_V1",
+    "LLAMA3_70B_LORA_CONFIG_B300_BF16_V1",
+    "LLAMA3_70B_LORA_CONFIG_B300_FP8_CS_V1",
+    "LLAMA3_70B_LORA_CONFIG_B300_FP8_MX_V1",
     "LLAMA3_70B_LORA_CONFIG_GB300_BF16_V1",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/performance/configs/llama/__init__.py` around lines 208 - 219, The
__all__ list is missing the B300 LoRA constant names; update the module's export
list (the __all__ variable) to include the B300 LoRA symbols corresponding to
the GB300 entries—add "LLAMA3_70B_LORA_CONFIG_B300_BF16_V1",
"LLAMA3_70B_LORA_CONFIG_B300_FP8_CS_V1", and
"LLAMA3_70B_LORA_CONFIG_B300_FP8_MX_V1" to __all__ alongside the existing B200
and GB300 entries so the B300 configurations are part of the public API.

269-296: ⚠️ Potential issue | 🟠 Major

Missing llama3_70b_lora_config_b300 in the dynamic __all__ extension.

llama3_70b_lora_config_b200 is added at line 286, but llama3_70b_lora_config_b300 is missing from this list.

Proposed fix
             "llama3_70b_lora_config_b200",
+            "llama3_70b_lora_config_b300",
             "llama3_70b_lora_config_gb200",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/performance/configs/llama/__init__.py` around lines 269 - 296, The
__all__ extension guarded by HAVE_MEGATRON_BRIDGE is missing the symbol
"llama3_70b_lora_config_b300"; update the list in the block that extends __all__
(the array containing "llama3_70b_lora_config_b200",
"llama3_70b_lora_config_gb200", "llama3_70b_lora_config_gb300",
"llama3_70b_lora_config_h100") to include "llama3_70b_lora_config_b300" so the
exported names match the defined configs.

9-19: ⚠️ Potential issue | 🟠 Major

Missing import and export of llama3_70b_lora_config_b300.

The B300 LoRA config function is defined in llama3_llm_finetune.py but is not imported here. Only llama3_70b_lora_config_b200 was added (line 12), while llama3_70b_lora_config_b300 was omitted. This means B300 LoRA won't be accessible via this package.

Proposed fix
     from .llama3_llm_finetune import (
         llama3_8b_sft_config_gb200,
         llama3_8b_sft_config_h100,
         llama3_70b_lora_config_b200,
+        llama3_70b_lora_config_b300,
         llama3_70b_lora_config_gb200,
         llama3_70b_lora_config_gb300,
         llama3_70b_lora_config_h100,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/performance/configs/llama/__init__.py` around lines 9 - 19, The
package export is missing the llama3_70b_lora_config_b300 symbol; add
llama3_70b_lora_config_b300 to the import list in __init__.py (alongside
llama3_70b_lora_config_b200) so it is exported by the package; if there is an
__all__ or any exported names list in this module, include
llama3_70b_lora_config_b300 there as well to make the B300 LoRA config
accessible.
🤖 Fix all issues with AI agents
Verify each finding against the current code and only fix it if needed.


In `@scripts/performance/configs/llama/__init__.py`:
- Around line 64-72: The file is missing imports for the B300 LoRA constants
used in the module; add imports for LLAMA3_70B_LORA_CONFIG_B300_BF16_V1,
LLAMA3_70B_LORA_CONFIG_B300_FP8_CS_V1, and LLAMA3_70B_LORA_CONFIG_B300_FP8_MX_V1
from .llama3_workload_base_configs so the constants referenced in the exported
list (e.g., LLAMA3_70B_LORA_CONFIG_GB300_BF16_V1,
LLAMA3_70B_LORA_CONFIG_B300_FP8_MX_V1) are actually defined; update the same
import statement that currently imports the B200 constants to include these
three B300 symbols.

Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
@rhmukundan rhmukundan added the r0.3.0 Cherry-pick label for r0.3.0 release branch label Feb 17, 2026
malay-nagda
malay-nagda previously approved these changes Feb 17, 2026
yaoyu-33
yaoyu-33 previously approved these changes Feb 23, 2026
@ko3n1g ko3n1g mentioned this pull request Feb 24, 2026
@malay-nagda
Contributor

/ok to test 13ee9f6

Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
@rhmukundan rhmukundan dismissed stale reviews from yaoyu-33 and malay-nagda via dd660c4 February 27, 2026 06:20
@rhmukundan rhmukundan force-pushed the rmukundan/onboard-llama3-lora-b200-b300 branch from 13ee9f6 to dd660c4 on February 27, 2026 06:20
@malay-nagda
Contributor

/ok to test dd660c4

@malay-nagda malay-nagda self-requested a review February 27, 2026 06:40
malay-nagda pushed a commit that referenced this pull request Feb 27, 2026
Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
malay-nagda added a commit that referenced this pull request Feb 27, 2026
Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
Co-authored-by: Raghav Hrishikeshan Mukundan <102543536+rhmukundan@users.noreply.github.com>
kevalmorabia97 added a commit that referenced this pull request Feb 27, 2026
  • b162358 merged yesterday and removed llama3_70b_finetune_config from that file.
  • #2397 merged today but came from a stale branch that still assumed llama3_70b_finetune_config was imported in that file.

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
copy-pr-bot bot pushed a commit that referenced this pull request Mar 19, 2026
Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
