
LLAMA3 70B: LoRa enabled in all modules instead of only LinearQKV #2181

Merged
ko3n1g merged 8 commits into main from rhmukundan/llama3_70b_qkv_only_lora
Feb 10, 2026

Conversation

@rhmukundan
Contributor

@rhmukundan rhmukundan commented Feb 2, 2026

Disable LoRA in linear_proj, linear_fc1, and linear_fc2, retaining it solely in linear_qkv.

Summary by CodeRabbit

  • Chores
    • Refined Llama 3 70B LoRA fine-tuning configuration to target specific model components, improving precision and efficiency during model adaptation.
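The change described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual diff: the real override lives in scripts/performance/configs/llama/llama3_llm_finetune.py, and the `LoRAConfig` field names used here are assumptions modeled on NeMo-style PEFT configs.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical PEFT config sketch; field names are assumptions modeled
# on NeMo-style LoRA configs, not the actual classes in
# scripts/performance/configs/llama/llama3_llm_finetune.py.
@dataclass
class LoRAConfig:
    dim: int = 16
    alpha: int = 32
    # Before this change: adapters on all four linear layers.
    target_modules: List[str] = field(
        default_factory=lambda: [
            "linear_qkv", "linear_proj", "linear_fc1", "linear_fc2",
        ]
    )

def restrict_to_qkv(cfg: LoRAConfig) -> LoRAConfig:
    # The override this PR applies: keep LoRA only on the fused
    # QKV projection, dropping the MLP and output projections.
    cfg.target_modules = ["linear_qkv"]
    return cfg

cfg = restrict_to_qkv(LoRAConfig())
print(cfg.target_modules)  # -> ['linear_qkv']
```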

Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
@rhmukundan rhmukundan self-assigned this Feb 2, 2026
@copy-pr-bot

copy-pr-bot bot commented Feb 2, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai bot commented Feb 2, 2026

📝 Walkthrough

Walkthrough

This change adds a LoRA configuration override to three Llama 3 70B model configuration functions, constraining PEFT target_modules to QKV (Query, Key, Value) modules. The modification is a straightforward configuration addition with no control flow changes.

Changes

LoRA Configuration Override
  • scripts/performance/configs/llama/llama3_llm_finetune.py — Adds target_modules constraint to QKV across the gb300, gb200, and h100 Llama 3 70B LoRA configuration functions, refining PEFT application scope.
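As a rough sketch of applying the same override across the three hardware-specific config functions: the function and field names below are illustrative assumptions, not the real identifiers in llama3_llm_finetune.py.

```python
# Illustrative only: applying the same target_modules override in each
# of the three hardware-specific Llama 3 70B LoRA config functions
# (gb300, gb200, h100). Real function and field names may differ.

def _with_qkv_only(recipe: dict) -> dict:
    # Shared helper so the override stays consistent across variants.
    recipe["peft"] = dict(recipe.get("peft", {}), target_modules=["linear_qkv"])
    return recipe

def llama3_70b_lora_gb300() -> dict:
    return _with_qkv_only({"model": "llama3-70b", "hw": "gb300"})

def llama3_70b_lora_gb200() -> dict:
    return _with_qkv_only({"model": "llama3-70b", "hw": "gb200"})

def llama3_70b_lora_h100() -> dict:
    return _with_qkv_only({"model": "llama3-70b", "hw": "h100"})

for fn in (llama3_70b_lora_gb300, llama3_70b_lora_gb200, llama3_70b_lora_h100):
    assert fn()["peft"]["target_modules"] == ["linear_qkv"]
```

Factoring the override into one helper keeps the three variants from drifting apart, which is the practical risk when the same constraint is pasted into each function.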

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 2
❌ Failed checks (2 warnings)
  • Title check ⚠️ Warning — The PR title claims LoRa is being enabled in all modules, but the description and changes clarify it's actually being constrained to only linear_qkv, contradicting the title's assertion. Resolution: revise the title to accurately reflect the change, such as "LLAMA3 70B: LoRa configuration constrained to only linear_qkv modules" or similar phrasing that matches the actual implementation.
  • Test Results For Major Changes ⚠️ Warning — PR contains a significant LoRA configuration change but lacks documentation of test results, convergence metrics, or performance validation. Resolution: add test results to the PR description, including convergence behavior, before-and-after loss curves, and performance impact with specific testing configuration details.
✅ Passed checks (2 passed)
  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.
  • Docstring Coverage ✅ Passed — Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.


@erhoo82 erhoo82 added the r0.3.0 Cherry-pick label for r0.3.0 release branch label Feb 3, 2026
@erhoo82 erhoo82 added this to the 26.02 milestone Feb 3, 2026
@rhmukundan rhmukundan enabled auto-merge (squash) February 4, 2026 17:11
@rhmukundan
Contributor Author

/ok to test b22d740

@rhmukundan
Contributor Author

/ok to test 04a9849

@rhmukundan
Contributor Author

/ok to test 71db322


Labels

r0.3.0 Cherry-pick label for r0.3.0 release branch


4 participants