-
Notifications
You must be signed in to change notification settings - Fork 243
Clarify HF offline behavior with --offline flag #2847
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
sudostock
wants to merge
7
commits into
main
Choose a base branch
from
afilby/offline-mode-flag
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+276
−12
Open
Changes from all commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
d210329
restore container writable and clarify HF offline behavior (#2672)
sudostock 4ad16c9
Add test for new flags.
sudostock 65ef0f4
Tighten offline mode tests and add default-behaviour coverage.
sudostock 904e9a6
[perf] refactor: move --hf_token/--offline mutual exclusion to argpar…
sudostock f27c1d4
Improve some of the readme and comment wording.
sudostock f6c23d0
Simplify unit tests.
sudostock ec09583
Copy PERF_ENV_VARS dict before mutating them.
sudostock File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
203 changes: 203 additions & 0 deletions
203
tests/unit_tests/scripts/test_performance_offline_mode.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,203 @@ | ||
| # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| import importlib | ||
| import sys | ||
| import types | ||
| from types import SimpleNamespace | ||
| from unittest.mock import patch | ||
|
|
||
| import pytest | ||
|
|
||
|
|
||
| def _package_module(name: str) -> types.ModuleType: | ||
| """Create a minimal package-like module.""" | ||
| module = types.ModuleType(name) | ||
| module.__path__ = [] | ||
| return module | ||
|
|
||
|
|
||
| @pytest.fixture | ||
| def import_performance_module(): | ||
| """Import performance modules with only the nemo_run surface these tests need.""" | ||
| nemo_run = _package_module("nemo_run") | ||
| nemo_run_config = types.ModuleType("nemo_run.config") | ||
| nemo_run_core = _package_module("nemo_run.core") | ||
| nemo_run_core_execution = _package_module("nemo_run.core.execution") | ||
| nemo_run_core_execution_launcher = types.ModuleType("nemo_run.core.execution.launcher") | ||
|
|
||
| nemo_run.LocalTunnel = lambda **kwargs: SimpleNamespace(**kwargs) | ||
| nemo_run.GitArchivePackager = lambda **kwargs: SimpleNamespace(**kwargs) | ||
| nemo_run.SlurmExecutor = lambda **kwargs: SimpleNamespace(**kwargs) | ||
| nemo_run.DGXCloudExecutor = lambda **kwargs: SimpleNamespace(**kwargs) | ||
|
|
||
| nemo_run_config.get_nemorun_home = lambda: "/tmp/nemorun" | ||
| nemo_run_config.set_nemorun_home = lambda _path: None | ||
|
|
||
| nemo_run_core_execution_launcher.SlurmTemplate = lambda **kwargs: SimpleNamespace(**kwargs) | ||
|
|
||
| nemo_run.config = nemo_run_config | ||
| nemo_run.core = nemo_run_core | ||
| nemo_run_core.execution = nemo_run_core_execution | ||
| nemo_run_core_execution.launcher = nemo_run_core_execution_launcher | ||
|
|
||
| with patch.dict( | ||
| sys.modules, | ||
| { | ||
| "nemo_run": nemo_run, | ||
| "nemo_run.config": nemo_run_config, | ||
| "nemo_run.core": nemo_run_core, | ||
| "nemo_run.core.execution": nemo_run_core_execution, | ||
| "nemo_run.core.execution.launcher": nemo_run_core_execution_launcher, | ||
| }, | ||
| ): | ||
|
|
||
| def _import(module_name: str): | ||
| sys.modules.pop(module_name, None) | ||
| return importlib.import_module(module_name) | ||
|
|
||
| yield _import | ||
|
|
||
|
|
||
| def test_parse_cli_args_accepts_offline_flag(import_performance_module): | ||
| """The performance CLI should keep exposing the offline switch.""" | ||
| argument_parser = import_performance_module("scripts.performance.argument_parser") | ||
|
|
||
| parser = argument_parser.parse_cli_args() | ||
| args = parser.parse_args( | ||
| [ | ||
| "--model_family_name", | ||
| "llama", | ||
| "--model_recipe_name", | ||
| "llama3_8b", | ||
| "--num_gpus", | ||
| "8", | ||
| "--gpu", | ||
| "h100", | ||
| "--offline", | ||
| ] | ||
| ) | ||
|
|
||
| assert args.offline is True | ||
|
|
||
|
|
||
| def test_argparse_rejects_hf_token_with_offline(import_performance_module): | ||
| """argparse should reject --hf_token and --offline together at parse time.""" | ||
| argument_parser = import_performance_module("scripts.performance.argument_parser") | ||
|
|
||
| parser = argument_parser.parse_cli_args() | ||
| with pytest.raises(SystemExit): | ||
| parser.parse_args( | ||
| [ | ||
| "--model_family_name", | ||
| "llama", | ||
| "--model_recipe_name", | ||
| "llama3_8b", | ||
| "--num_gpus", | ||
| "8", | ||
| "--gpu", | ||
| "h100", | ||
| "--hf_token", | ||
| "hf_test_token", | ||
| "--offline", | ||
| ] | ||
| ) | ||
|
|
||
|
|
||
| def test_slurm_executor_sets_offline_env_and_container_writable(import_performance_module): | ||
| """Offline mode should set HF_HUB_OFFLINE and preserve the offline Transformers default.""" | ||
| executors = import_performance_module("scripts.performance.utils.executors") | ||
|
|
||
| executor = executors.slurm_executor( | ||
| gpu="h100", | ||
| account="test_account", | ||
| partition="test_partition", | ||
| log_dir="/tmp/log_dir", | ||
| nodes=1, | ||
| num_gpus_per_node=8, | ||
| offline=True, | ||
| ) | ||
|
|
||
| assert "--container-writable" in executor.srun_args | ||
| assert executor.env_vars["HF_HUB_OFFLINE"] == "1" | ||
| assert executor.env_vars["TRANSFORMERS_OFFLINE"] == "1" | ||
|
|
||
|
|
||
| def test_slurm_executor_default_has_container_writable_and_hub_online(import_performance_module): | ||
| """By default, --container-writable is always set and HF Hub access stays online.""" | ||
| executors = import_performance_module("scripts.performance.utils.executors") | ||
|
|
||
| executor = executors.slurm_executor( | ||
| gpu="h100", | ||
| account="test_account", | ||
| partition="test_partition", | ||
| log_dir="/tmp/log_dir", | ||
| nodes=1, | ||
| num_gpus_per_node=8, | ||
| ) | ||
|
|
||
| assert "--container-writable" in executor.srun_args | ||
| assert executor.env_vars["HF_HUB_OFFLINE"] == "0" | ||
| assert executor.env_vars["TRANSFORMERS_OFFLINE"] == "1" | ||
|
|
||
|
|
||
| def test_slurm_executor_hf_token_enables_online_transformers(import_performance_module): | ||
| """Providing an HF token should enable the online Transformers path.""" | ||
| executors = import_performance_module("scripts.performance.utils.executors") | ||
|
|
||
| executor = executors.slurm_executor( | ||
| gpu="h100", | ||
| account="test_account", | ||
| partition="test_partition", | ||
| log_dir="/tmp/log_dir", | ||
| nodes=1, | ||
| num_gpus_per_node=8, | ||
| hf_token="hf_test_token", | ||
| ) | ||
|
|
||
| assert executor.env_vars["HF_TOKEN"] == "hf_test_token" | ||
| assert executor.env_vars["TRANSFORMERS_OFFLINE"] == "0" | ||
| assert executor.env_vars["HF_HUB_OFFLINE"] == "0" | ||
|
|
||
|
|
||
| def test_slurm_executor_no_state_leakage_between_calls(import_performance_module): | ||
| """Calling slurm_executor twice must not leak env vars from the first call into the second.""" | ||
| executors = import_performance_module("scripts.performance.utils.executors") | ||
|
|
||
| # First call: with hf_token and wandb_key | ||
| first = executors.slurm_executor( | ||
| gpu="h100", | ||
| account="test_account", | ||
| partition="test_partition", | ||
| log_dir="/tmp/log_dir", | ||
| nodes=1, | ||
| num_gpus_per_node=8, | ||
| hf_token="hf_secret", | ||
| wandb_key="wandb_secret", | ||
| ) | ||
| assert first.env_vars["HF_TOKEN"] == "hf_secret" | ||
| assert first.env_vars["WANDB_API_KEY"] == "wandb_secret" | ||
|
|
||
| # Second call: no token, no wandb — should get clean defaults | ||
| second = executors.slurm_executor( | ||
| gpu="h100", | ||
| account="test_account", | ||
| partition="test_partition", | ||
| log_dir="/tmp/log_dir", | ||
| nodes=1, | ||
| num_gpus_per_node=8, | ||
| ) | ||
| assert "HF_TOKEN" not in second.env_vars | ||
| assert "WANDB_API_KEY" not in second.env_vars | ||
| assert second.env_vars["TRANSFORMERS_OFFLINE"] == "1" |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.