Skip to content

update lora kernels docs#3186

Closed
djsaunde wants to merge 2 commits into
mainfrom
lora-fsdp2-doc
Closed

update lora kernels docs#3186
djsaunde wants to merge 2 commits into
mainfrom
lora-fsdp2-doc

Conversation

@djsaunde

@djsaunde djsaunde commented Sep 25, 2025

Copy link
Copy Markdown
Collaborator

Description

Motivation and Context

How has this been tested?

Screenshots (if appropriate)

Types of changes

Social Handles (Optional)

@djsaunde djsaunde self-assigned this Sep 25, 2025
@coderabbitai

coderabbitai Bot commented Sep 25, 2025

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

Updates documentation to reflect FSDP2 support and revised LoRA dropout/bias policy. Simplifies kernel patch eligibility to require only lora_dropout: 0. Refactors tests to validate the new condition and confirm patches apply with bias enabled. No public APIs changed.

Changes

Cohort / File(s) Summary
Docs update
docs/lora_optims.qmd
Reworded support scope to include FSDP2; updated LoRA dropout/bias policy; adjusted notes on adapters and future work to reflect dropout requirement and bias support.
LoRA kernel patch logic
src/axolotl/monkeypatch/lora_kernels.py
Simplified patch eligibility: require only lora_dropout == 0; updated docstring and warning messages; no API changes.
Tests for kernel patching
tests/e2e/patched/lora_kernels/test_lora_kernel_patching.py
Replaced combined condition test with two tests: one ensures patches don’t apply when dropout > 0; another ensures patches do apply with bias enabled; removed old aggregated test.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

ready to merge

Suggested reviewers

  • NanoCode012
  • winglian

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Title Check ⚠️ Warning The title only mentions documentation updates but the pull request also modifies the core LoRA kernel patching logic and tests to simplify dropout conditions and enable bias support, so it fails to capture the full scope of the changes. By focusing solely on docs, the title can mislead reviewers about the primary code changes. Consequently, it does not accurately summarize the main intent of the pull request. Please update the title to reflect both documentation and code changes, for example “Simplify LoRA kernel patching conditions and update documentation for dropout and bias support.”
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch lora-fsdp2-doc

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (7)
src/axolotl/monkeypatch/lora_kernels.py (4)

325-328: Clarify Note: also call out DoRA and multi‑adapter limitations.

The code skips patching when dropout is non‑zero, and per‑layer patching also rejects DoRA (lora_magnitude_vector). Document both, plus the single‑adapter constraint, to reduce confusion.

Apply this doc tweak:

-    Note:
-        The optimizations require LoRA adapters with no dropout. The function will skip
-        patching if that condition isn't met.
+    Note:
+        - The optimizations require LoRA adapters with no dropout (`lora_dropout: 0`);
+          otherwise, patching is skipped.
+        - DoRA (`lora_magnitude_vector`) is not currently supported; affected projections
+          will log a warning and remain unpatched.
+        - Multiple active adapters are not supported (single active adapter required).

343-349: Include actual dropout value in warning.

Improves debuggability when patches are skipped.

-    if not can_patch:
-        LOG.warning("Cannot patch layers - requires `lora_dropout: 0`")
+    if not can_patch:
+        LOG.warning(
+            f"Cannot patch layers - requires `lora_dropout: 0` (got {lora_config.lora_dropout})"
+        )
         LOG.warning("Please specify `lora_dropout: 0` in your axolotl config file")
         return model

350-354: Ensure log level is restored on exceptions.

Wrap the temporary log level change in a try/finally to avoid leaking INFO level on early errors (e.g., unsupported activation).

Example:

original_level = LOG.getEffectiveLevel()
try:
    LOG.setLevel(logging.INFO)
    # ... existing patching logic ...
finally:
    LOG.setLevel(original_level)

Also applies to: 440-442


236-246: Type hint nit: yielded type is nn.Module, not Tuple[nn.Module].

The generator yields a single module. Adjust the annotation.

-def find_self_attn_in_layer(
-    layer: nn.Module,
-) -> Generator[Tuple[nn.Module], None, None]:
+def find_self_attn_in_layer(
+    layer: nn.Module,
+) -> Generator[nn.Module, None, None]:
docs/lora_optims.qmd (1)

96-101: Document DoRA limitation and single‑adapter constraint.

Code rejects DoRA per‑layer and enforces a single active adapter. Add bullets so users don’t discover this at runtime.

 - Targeted LoRA adapters must disable dropout (`lora_dropout: 0`)
     - This may limit model expressivity
 - Adapters that already include bias terms are supported.
+ - DoRA (`lora_magnitude_vector`) is not supported; affected projections will remain unpatched.
+ - Multiple active adapters are not supported (only one active LoRA adapter).
tests/e2e/patched/lora_kernels/test_lora_kernel_patching.py (2)

224-247: Make the test offline and strengthen assertions.

Avoid network/model downloads by using the existing small LLaMA fixture. Also assert that attention methods weren’t injected when skipping patches.

-def test_kernel_patch_requires_zero_dropout():
+def test_kernel_patch_requires_zero_dropout(small_llama_model):
     """Kernel patching should be skipped when dropout is enabled."""
     config = {
         "peft_type": "LORA",
         "task_type": "CAUSAL_LM",
         "r": 8,
         "lora_alpha": 16,
         "target_modules": ["gate_proj", "up_proj", "down_proj"],
         "lora_dropout": 0.1,
         "bias": "none",
     }
 
-    model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")
-    peft_config = get_peft_config(config)
-    model = PeftModelForCausalLM(model, peft_config)
+    peft_config = get_peft_config(config)
+    model = PeftModelForCausalLM(small_llama_model, peft_config)
     cfg = DictDefault({"lora_mlp_kernel": True})
 
     patched_model = apply_lora_kernel_patches(model, cfg)
     layer = patched_model.model.model.layers[0].mlp
 
     # Verify no patches applied when dropout is non-zero
     assert layer.forward.__func__ is not apply_lora_mlp_swiglu
     assert layer.forward.__func__ is not apply_lora_mlp_geglu
+    # Attention should also remain unmodified (methods not injected)
+    self_attn = patched_model.model.model.layers[0].self_attn
+    assert not hasattr(self_attn, "apply_qkv")
+    assert not hasattr(self_attn, "apply_o")

249-271: Same: avoid downloads; rely on the small fixture.

Keeps the suite faster and more reliable while still validating bias support.

-def test_kernel_patch_with_bias_enabled():
+def test_kernel_patch_with_bias_enabled(small_llama_model):
     """Kernel patching should succeed when LoRA bias is enabled."""
     config = {
         "peft_type": "LORA",
         "task_type": "CAUSAL_LM",
         "r": 8,
         "lora_alpha": 16,
         "target_modules": ["gate_proj", "up_proj", "down_proj"],
         "lora_dropout": 0,
         "bias": "lora_only",
     }
 
-    model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")
-    peft_config = get_peft_config(config)
-    model = PeftModelForCausalLM(model, peft_config)
+    peft_config = get_peft_config(config)
+    model = PeftModelForCausalLM(small_llama_model, peft_config)
     cfg = DictDefault({"lora_mlp_kernel": True})
 
     patched_model = apply_lora_kernel_patches(model, cfg)
     layer = patched_model.model.model.layers[0].mlp
 
     # Verify patches applied when bias support is enabled
     assert layer.forward.__func__ is apply_lora_mlp_swiglu
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9748c4 and 3299f18.

📒 Files selected for processing (3)
  • docs/lora_optims.qmd (3 hunks)
  • src/axolotl/monkeypatch/lora_kernels.py (2 hunks)
  • tests/e2e/patched/lora_kernels/test_lora_kernel_patching.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/e2e/patched/lora_kernels/test_lora_kernel_patching.py (3)
src/axolotl/utils/dict.py (1)
  • DictDefault (6-38)
src/axolotl/monkeypatch/lora_kernels.py (1)
  • apply_lora_kernel_patches (303-442)
src/axolotl/kernels/lora.py (5)
  • forward (133-211)
  • forward (484-547)
  • forward (739-773)
  • apply_lora_mlp_swiglu (389-429)
  • apply_lora_mlp_geglu (432-471)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: PyTest from Source Dist (3.11, 2.8.0)
  • GitHub Check: PyTest (3.11, 2.8.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.6.0)
  • GitHub Check: preview
🔇 Additional comments (2)
docs/lora_optims.qmd (2)

8-12: LGTM: FSDP2 mention and clearer kernel overview.

The expanded scope and phrasing look good.


131-135: LGTM: “Support for dropout” future work aligns with code.

Matches the current gating in monkeypatch logic.

@github-actions

Copy link
Copy Markdown
Contributor

📖 Documentation Preview: https://68d5723b14499b64527dbcdf--resonant-treacle-0fd729.netlify.app

Deployed on Netlify from commit 3299f18

@codecov

codecov Bot commented Sep 25, 2025

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/axolotl/monkeypatch/lora_kernels.py 0.00% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

@djsaunde djsaunde changed the title Lora kernels bias ungate + docs update lora kernels docs Sep 25, 2025
@djsaunde djsaunde closed this Sep 25, 2025
@djsaunde

Copy link
Copy Markdown
Collaborator Author

ignore, not quite right

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant