Set HF_HUB_OFFLINE=1 by default, enabled with '--hf_token' flag. #2086

nv-mollys merged 2 commits into NVIDIA-NeMo:main
Conversation
This makes it consistent with the TRANSFORMERS_OFFLINE variable.

Signed-off-by: Alex Filby <afilby@nvidia.com>
📝 Walkthrough

The default value of HF_HUB_OFFLINE in PERF_ENV_VARS changed from "0" to "1". When an HF token is provided to slurm_executor, environment variable assignments now use explicit statements instead of dict.update(), with HF_HUB_OFFLINE explicitly set to "0".
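The change can be sketched as a minimal standalone example. This is an illustrative sketch only: the dict contents and the `build_env` helper are simplified assumptions, not the actual contents of `scripts/performance/utils/executors.py`.

```python
# Simplified sketch of the new default behavior (illustrative names,
# not the real scripts/performance/utils/executors.py contents).
PERF_ENV_VARS = {
    "TRANSFORMERS_OFFLINE": "1",
    "HF_HUB_OFFLINE": "1",  # default flipped from "0" to "1" by this PR
}

def build_env(hf_token=None):
    """Return the env for one job; copy so the module-level dict stays untouched."""
    env = dict(PERF_ENV_VARS)
    if hf_token is not None:
        # Supplying a token implies intentional Hub access, so re-enable online mode.
        env["HF_TOKEN"] = hf_token
        env["TRANSFORMERS_OFFLINE"] = "0"
        env["HF_HUB_OFFLINE"] = "0"
    return env

print(build_env()["HF_HUB_OFFLINE"])          # -> 1 (offline by default)
print(build_env("hf_xxx")["HF_HUB_OFFLINE"])  # -> 0 (online when a token is given)
```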
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
Pre-merge checks: ✅ 3 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@scripts/performance/utils/executors.py`:
- Around line 109-112: The code mutates the global PERF_ENV_VARS in-place when
hf_token is provided, causing HF_HUB_OFFLINE/TRANSFORMERS_OFFLINE to persist
across calls/threads; fix by creating a per-call copy (e.g., local_env =
PERF_ENV_VARS.copy()) inside the executor before modifying and use local_env for
subprocess/env injection instead of mutating PERF_ENV_VARS; update the branch
that checks hf_token to set values on local_env and ensure any code that
previously referenced PERF_ENV_VARS in this execution uses local_env
(referencing PERF_ENV_VARS and hf_token to locate the change).
  if hf_token is not None:
-     PERF_ENV_VARS.update({"HF_TOKEN": hf_token, "TRANSFORMERS_OFFLINE": "0"})
+     PERF_ENV_VARS["HF_TOKEN"] = hf_token
+     PERF_ENV_VARS["TRANSFORMERS_OFFLINE"] = "0"
+     PERF_ENV_VARS["HF_HUB_OFFLINE"] = "0"
Avoid global env var leakage across calls.
Mutating PERF_ENV_VARS in-place makes HF_HUB_OFFLINE/TRANSFORMERS_OFFLINE “sticky” across future invocations (and across threads). Use a per-call copy instead.
✅ Proposed fix (use a local copy)
def slurm_executor(
@@
) -> run.SlurmExecutor:
@@
- if wandb_key is not None:
- PERF_ENV_VARS["WANDB_API_KEY"] = wandb_key
+ env_vars = PERF_ENV_VARS.copy()
+ if wandb_key is not None:
+ env_vars["WANDB_API_KEY"] = wandb_key
@@
- PERF_ENV_VARS["NCCL_NET_GDR_LEVEL"] = "PHB" # For NCCL 2.25
- PERF_ENV_VARS["NCCL_NET_GDR_C2C"] = "1" # For NCCL 2.26
+ env_vars["NCCL_NET_GDR_LEVEL"] = "PHB" # For NCCL 2.25
+ env_vars["NCCL_NET_GDR_C2C"] = "1" # For NCCL 2.26
@@
- PERF_ENV_VARS["NEMO_HOME"] = nemo_home
+ env_vars["NEMO_HOME"] = nemo_home
@@
- PERF_ENV_VARS["HF_TOKEN"] = hf_token
- PERF_ENV_VARS["TRANSFORMERS_OFFLINE"] = "0"
- PERF_ENV_VARS["HF_HUB_OFFLINE"] = "0"
+ env_vars["HF_TOKEN"] = hf_token
+ env_vars["TRANSFORMERS_OFFLINE"] = "0"
+ env_vars["HF_HUB_OFFLINE"] = "0"
@@
- PERF_ENV_VARS.update(custom_env_vars)
+ env_vars.update(custom_env_vars)
@@
- env_vars=PERF_ENV_VARS,
+ env_vars=env_vars,🤖 Prompt for AI Agents
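The reviewer's point can be demonstrated with a self-contained toy. The `make_env` function here is hypothetical, purely to show the difference between mutating a shared dict in place and taking a per-call copy:

```python
def make_env(defaults, hf_token=None, copy=True):
    """Build a job env from shared defaults, optionally copying first."""
    env = defaults.copy() if copy else defaults
    if hf_token is not None:
        env["HF_HUB_OFFLINE"] = "0"
    return env

shared = {"HF_HUB_OFFLINE": "1"}
make_env(shared, hf_token="secret", copy=False)  # mutates the shared dict
print(shared["HF_HUB_OFFLINE"])  # -> 0: the override leaked into later calls

shared = {"HF_HUB_OFFLINE": "1"}
make_env(shared, hf_token="secret", copy=True)   # per-call copy
print(shared["HF_HUB_OFFLINE"])  # -> 1: the shared defaults stay intact
```

This is why the proposed fix routes every assignment through a local `env_vars` copy rather than the module-level `PERF_ENV_VARS`.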
malay-nagda left a comment:

Thanks for this fix!

/ok to test 17b4744
…DIA-NeMo#2086)

Signed-off-by: Alex Filby <afilby@nvidia.com>
Signed-off-by: conver334 <conver334@gmail.com>
This makes it consistent with the TRANSFORMERS_OFFLINE variable.
Many mbridge recipes require HF data for config information and/or the tokenizer. This causes many issues both internally and at customer sites, since HF connections tend to get rate limited, especially when launching many jobs.
To enable offline mode, we need HF_HUB_OFFLINE=1 and the necessary files in the local cache. We've aligned HF_HUB_OFFLINE to operate like TRANSFORMERS_OFFLINE: default to offline unless '--hf_token' is specified. Not all models require an HF token to access, but the rate limits are strict enough that enforcing this is worthwhile.
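Offline mode only works when the needed files already sit in the local HF cache, so a typical workflow is to warm the cache once while online, then run jobs with the offline flags set. The sketch below illustrates this under stated assumptions: the repo ID is a placeholder, and the commented-out `snapshot_download` step needs network access.

```python
import os

# Step 1 (once, online): warm the local cache. Uncomment to actually download.
# from huggingface_hub import snapshot_download
# snapshot_download(repo_id="some-org/some-model", token=os.environ.get("HF_TOKEN"))

# Step 2 (every job): force cache-only lookups before any HF library import.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

# With these set, Hub lookups resolve from the local cache (or fail fast)
# instead of hitting huggingface.co and risking rate limits.
print(os.environ["HF_HUB_OFFLINE"])  # -> 1
```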