
Set HF_HUB_OFFLINE=1 by default, enabled with '--hf_token' flag. #2086

Merged
nv-mollys merged 2 commits into NVIDIA-NeMo:main from sudostock:hf_hub_offline-main on Jan 28, 2026


Conversation


@sudostock sudostock commented Jan 27, 2026

This makes it consistent with the TRANSFORMERS_OFFLINE variable.

Many mbridge recipes require Hugging Face data for config information and/or the tokenizer. This causes problems both internally and at customer sites, since HF connections tend to get rate limited, especially when launching many jobs.

To enable offline mode, we need HF_HUB_OFFLINE=1 and the necessary files in the local cache. We've aligned HF_HUB_OFFLINE to operate like TRANSFORMERS_OFFLINE: default to offline unless '--hf_token' is specified. Not all models require an hf_token for access, but the rate limits are strict enough that enforcing this is worthwhile.
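A minimal sketch of the behavior described above. `PERF_ENV_VARS` matches the dict name from this PR, but `build_env` and its exact shape are illustrative stand-ins, not the actual code in scripts/performance/utils/executors.py:

```python
# Illustrative sketch of the offline-by-default logic (hypothetical helper;
# the real implementation lives in scripts/performance/utils/executors.py).
PERF_ENV_VARS = {
    "HF_HUB_OFFLINE": "1",        # default: never contact the HF Hub
    "TRANSFORMERS_OFFLINE": "1",  # transformers likewise defaults offline
}

def build_env(hf_token=None):
    """Return env vars for a job; an --hf_token value re-enables online access."""
    env = dict(PERF_ENV_VARS)  # per-call copy so the module default stays untouched
    if hf_token is not None:
        env["HF_TOKEN"] = hf_token
        env["TRANSFORMERS_OFFLINE"] = "0"
        env["HF_HUB_OFFLINE"] = "0"
    return env
```

With no token the job stays offline and reads only from the local HF cache; passing a token flips both offline flags back to "0".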

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved offline mode handling for Hugging Face Hub access in performance tests, ensuring online downloads are properly enabled when authentication credentials are available.
  • Chores

    • Refined environment variable configuration logic for enhanced clarity.


This makes it consistent with the TRANSFORMERS_OFFLINE variable.

Signed-off-by: Alex Filby <afilby@nvidia.com>
copy-pr-bot bot commented Jan 27, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Jan 27, 2026

📝 Walkthrough

Default environment variable HF_HUB_OFFLINE changed from "0" to "1" in PERF_ENV_VARS. When an HF token is provided to slurm_executor, environment variable assignments now use explicit statements instead of dict.update(), with HF_HUB_OFFLINE explicitly set to "0".

Changes

  • Environment configuration in slurm executor — scripts/performance/utils/executors.py: HF_HUB_OFFLINE default changed to "1"; when hf_token is provided, replaced dict.update() with explicit assignments for HF_TOKEN, TRANSFORMERS_OFFLINE, and HF_HUB_OFFLINE.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)
  • Test Results For Major Changes ⚠️ Warning — The PR makes a major behavioral change to Hugging Face model access without documenting test results, verification of offline-mode functionality, or convergence/performance validation. Resolution: document test results for offline mode with all affected recipes, confirm the HF token flag enables online mode, validate no regressions, and resolve the global state mutation issue in PERF_ENV_VARS.

✅ Passed checks (3 passed)
  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — The title clearly and specifically describes the main change: setting HF_HUB_OFFLINE=1 by default and enabling online access with the '--hf_token' flag, which aligns with the changeset.
  • Docstring Coverage ✅ Passed — Docstring coverage is 100.00%, above the required 80.00% threshold.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@scripts/performance/utils/executors.py`, around lines 109-112: the code mutates the global PERF_ENV_VARS in-place when hf_token is provided, causing HF_HUB_OFFLINE/TRANSFORMERS_OFFLINE to persist across calls/threads. Fix by creating a per-call copy (e.g., local_env = PERF_ENV_VARS.copy()) inside the executor before modifying it, and use local_env for subprocess/env injection instead of mutating PERF_ENV_VARS. Update the branch that checks hf_token to set values on local_env, and ensure any code that previously referenced PERF_ENV_VARS in this execution uses local_env.

Comment on lines 109 to 112

     if hf_token is not None:
-        PERF_ENV_VARS.update({"HF_TOKEN": hf_token, "TRANSFORMERS_OFFLINE": "0"})
+        PERF_ENV_VARS["HF_TOKEN"] = hf_token
+        PERF_ENV_VARS["TRANSFORMERS_OFFLINE"] = "0"
+        PERF_ENV_VARS["HF_HUB_OFFLINE"] = "0"

⚠️ Potential issue | 🟠 Major

Avoid global env var leakage across calls.
Mutating PERF_ENV_VARS in-place makes HF_HUB_OFFLINE/TRANSFORMERS_OFFLINE “sticky” across future invocations (and across threads). Use a per-call copy instead.

✅ Proposed fix (use a local copy)
 def slurm_executor(
@@
 ) -> run.SlurmExecutor:
@@
-    if wandb_key is not None:
-        PERF_ENV_VARS["WANDB_API_KEY"] = wandb_key
+    env_vars = PERF_ENV_VARS.copy()
+    if wandb_key is not None:
+        env_vars["WANDB_API_KEY"] = wandb_key
@@
-        PERF_ENV_VARS["NCCL_NET_GDR_LEVEL"] = "PHB"  # For NCCL 2.25
-        PERF_ENV_VARS["NCCL_NET_GDR_C2C"] = "1"  # For NCCL 2.26
+        env_vars["NCCL_NET_GDR_LEVEL"] = "PHB"  # For NCCL 2.25
+        env_vars["NCCL_NET_GDR_C2C"] = "1"  # For NCCL 2.26
@@
-        PERF_ENV_VARS["NEMO_HOME"] = nemo_home
+        env_vars["NEMO_HOME"] = nemo_home
@@
-        PERF_ENV_VARS["HF_TOKEN"] = hf_token
-        PERF_ENV_VARS["TRANSFORMERS_OFFLINE"] = "0"
-        PERF_ENV_VARS["HF_HUB_OFFLINE"] = "0"
+        env_vars["HF_TOKEN"] = hf_token
+        env_vars["TRANSFORMERS_OFFLINE"] = "0"
+        env_vars["HF_HUB_OFFLINE"] = "0"
@@
-    PERF_ENV_VARS.update(custom_env_vars)
+    env_vars.update(custom_env_vars)
@@
-        env_vars=PERF_ENV_VARS,
+        env_vars=env_vars,
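To see why the in-place mutation is "sticky", here is a minimal standalone demonstration of the leak and the per-call-copy fix. The function names are hypothetical; only the PERF_ENV_VARS pattern mirrors the PR:

```python
# Standalone demonstration of the leak flagged above: mutating a module-level
# dict persists across calls, while a per-call copy does not.
PERF_ENV_VARS = {"HF_HUB_OFFLINE": "1", "TRANSFORMERS_OFFLINE": "1"}

def leaky_executor_env(hf_token=None):
    # Bug: writes through to the shared module-level dict.
    if hf_token is not None:
        PERF_ENV_VARS["HF_HUB_OFFLINE"] = "0"
    return PERF_ENV_VARS

def safe_executor_env(hf_token=None):
    # Fix: copy first, so later token-less calls stay offline.
    env = PERF_ENV_VARS.copy()
    if hf_token is not None:
        env["HF_HUB_OFFLINE"] = "0"
    return env
```

After one call to the leaky version with a token, every subsequent token-less call also comes back online; the copying version keeps the module-level default at "1".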


@malay-nagda malay-nagda left a comment


Thanks for this fix!

@malay-nagda malay-nagda requested a review from ko3n1g January 28, 2026 07:36
@nv-mollys nv-mollys enabled auto-merge (squash) January 28, 2026 17:59
@nv-mollys

/ok to test 17b4744

@nv-mollys nv-mollys merged commit 8a937f6 into NVIDIA-NeMo:main Jan 28, 2026
48 checks passed
conver334 pushed a commit to conver334/Megatron-Bridge that referenced this pull request Jan 30, 2026
…DIA-NeMo#2086)

Signed-off-by: Alex Filby <afilby@nvidia.com>
Signed-off-by: conver334 <conver334@gmail.com>


4 participants