Use run.Script for generate pipeline #1052
Conversation
Signed-off-by: George Armstrong <georgea@nvidia.com>
📝 Walkthrough

This PR introduces multi-model generation support and refactors command execution from string-based commands to typed Script objects.
Sequence Diagram

```mermaid
sequenceDiagram
    actor User
    participant Gen as generate()
    participant Unify as _create_job_unified()
    participant Scripts as Script Factory
    participant Exec as Executor
    User->>Gen: Call with models[], server_types[], GPUs[]
    Gen->>Gen: Normalize per-model parameters
    Gen->>Unify: Pass normalized models + configs
    Unify->>Scripts: Create ServerScript per model
    activate Scripts
    Scripts->>Scripts: Allocate ports
    Scripts->>Scripts: Build server commands
    deactivate Scripts
    Unify->>Scripts: Create SandboxScript (optional)
    activate Scripts
    Scripts->>Scripts: Allocate sandbox port
    deactivate Scripts
    Unify->>Scripts: Create GenerationClientScript
    activate Scripts
    Scripts->>Scripts: Wire server references
    Scripts->>Scripts: Build lazy client command
    deactivate Scripts
    Unify->>Unify: Aggregate into CommandGroups[]
    Unify-->>Gen: Return groups list
    Gen->>Exec: Execute CommandGroups
    activate Exec
    Exec->>Scripts: Resolve script.inline (runtime)
    Scripts-->>Scripts: Substitute hostnames for het jobs
    Scripts-->>Exec: Final command
    deactivate Exec
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches: ❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Actionable comments posted: 2
🧹 Nitpick comments (8)
tests/test_generation.py (1)
22-25: New `test_server_metadata_from_num_tasks` correctly validates Script-based server wiring

The test's expectations on `_create_job_unified` (first command being a `ServerScript`, `num_tasks >= 1`, and group hardware mirroring `server_config["num_gpus"]`) match the new Script-based orchestration and will catch regressions in server-side task accounting. Use of fixed `/tmp/out` / `/tmp/logs` is acceptable here since the test doesn't actually execute jobs, but could be switched to `tmp_path`-derived locations if you ever see interference on shared CI hosts.

Also applies to: 156-194
nemo_skills/pipeline/utils/generation.py (1)
360-435: Multi-model argument handling in `get_generation_cmd` could validate list lengths

The multi-model branch assumes `server_types` and `server_addresses` are present and aligned with `model_names`:

```python
if server_addresses is not None and model_names is not None:
    num_models = len(model_names)
    if num_models > 1:
        model_names_arg = ",".join(model_names)
        server_types_arg = ",".join(server_types)
        server_addresses_arg = ",".join(server_addresses)
```

Given callers already normalize lists, this is likely fine, but a defensive assertion (e.g., checking `len(server_types) == len(model_names) == len(server_addresses)`) would make this helper safer when used outside the current generate path.

nemo_skills/pipeline/generate.py (3)
232-271: Typer parameterization for multi-model arguments is reasonable, but the docstring slightly overpromises Python types

The CLI-facing `List[...]` `typer.Option` parameters for `model`, `server_address`, `server_type`, etc., align with Typer's multi-value pattern. Internally you normalize via `normalize_models_config` / `normalize_parameter`, which supports both scalars and lists.

Just note that the function annotations still show `List[...]`, while the docstring promises Python callers can pass scalars. That's true at runtime, but slightly mismatched type-wise; if you add type checking later you may want `model: Any` plus explicit validation, or overloads.
551-586: `generation_params` dict is a good abstraction point, but contains some implicit invariants

Packing all generation-related knobs plus multi-model fields into `generation_params` and passing it to `_create_job_unified` keeps the job builder decoupled from the CLI layer, which is nice.

Two small considerations:

- `generation_params["output_dir"]` and `["script"]` are required for `GenerationClientScript` but not enforced here; if `_create_job_unified` is ever reused elsewhere, a light assertion, docstring note, or validation would help avoid KeyErrors.
- The comment about multi-model groups ("one per model + one for client") is slightly misleading: currently `_create_job_unified` returns at most one group per model, with the client in group 0; there is no dedicated "client only" group.
50-201: The main servers-list issue has been fixed; verify test coverage for mixed hosting remains the secondary concern

Good news: the handling of `None` entries in the servers list has been corrected. `GenerationClientScript.__post_init__()` now properly iterates through servers with enumerate, using the same index to access both `self.servers` and `self.server_addresses_prehosted`. This preserves the intended server-to-model alignment in mixed hosting scenarios (e.g., `[host_server_1, None, host_server_3]` with corresponding pre-hosted addresses).

Remaining observations:

Test coverage gap: `tests/test_generation.py` only tests single-model scenarios. Multi-model runs with mixed hosting (some self-hosted, some pre-hosted) lack explicit test cases. Consider adding a test for this scenario to prevent regressions.

Minor improvements:

- `zip(models, server_configs)` (line 94) could add `strict=True` for Python 3.10+ to catch length mismatches early.
- The `num_tasks` fallback to `1` for client-only groups is reasonable, but document it if `GenerationClientScript` may later override this.

nemo_skills/pipeline/utils/declarative.py (3)
210-257: `Command` now correctly encapsulates lazy Script evaluation, with one minor unused-arg nit

`Command.prepare_for_execution`:

- Evaluates `script.inline` when it is callable, allowing `(cmd, metadata)` tuples for env injection.
- Uses `set_inline` to update the Script, which matches the new BaseJobScript pattern.
- Builds a minimal `execution_config` (log_prefix, environment, mounts placeholder, container) and returns `(script, execution_config)`.

Two minor notes:

- The `cluster_config` parameter is currently unused in this method; if you don't plan to use it, you can drop it or add a brief comment indicating it's reserved for future use (to quiet linters).
- This method assumes every `script` has a `set_inline` method; that's true for the new Script types and `DummyScript`, but third-party `run.Script` subclasses must expose it as well. If your public API allows arbitrary `run.Script` instances, consider a `hasattr(self.script, "set_inline")` guard with a fallback to plain `script.inline = ...`.
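The suggested guard could look roughly like this (hypothetical stand-in classes, not the real `Command` code):

```python
class LegacyScript:
    """Stand-in for a third-party run.Script subclass without set_inline."""

    def __init__(self, inline):
        self.inline = inline


class ManagedScript(LegacyScript):
    """Stand-in for a Script type exposing the set_inline mutator."""

    def set_inline(self, value):
        self.inline = value


def apply_inline(script, new_inline):
    # Prefer the BaseJobScript-style mutator when available...
    if hasattr(script, "set_inline"):
        script.set_inline(new_inline)
    else:
        # ...but degrade gracefully to plain attribute assignment.
        script.inline = new_inline
    return script


legacy = apply_inline(LegacyScript("echo old"), "echo new")
managed = apply_inline(ManagedScript("echo old"), "echo new")
```

Both paths leave the script with the updated inline command, so arbitrary `run.Script` instances would keep working.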
495-535: `_prepare_command` and `_rewrite_local_paths` correctly adapt Scripts for local execution

Key aspects:

- `_prepare_command` uses `Command.prepare_for_execution`, and for `executor in ("none", "local")` rewrites `/nemo_run/code/...` paths to local repo paths when `get_registered_external_repo("nemo_skills")` is configured.
- `_rewrite_local_paths` supports both string and callable `inline` values, transparently rewriting commands while preserving optional `(cmd, metadata)` tuple semantics.

This is a sensible way to keep local runs working with packaged-code assumptions. Just be aware that `_rewrite_local_paths` implicitly assumes the Nemo repo is registered under the `"nemo_skills"` key; if that mapping isn't present, it silently no-ops, which seems fine but might deserve a one-line doc comment.
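An illustrative sketch of that string-or-callable handling (assumed names and paths, not the actual `_rewrite_local_paths`):

```python
# Hypothetical local repo location used for illustration only.
LOCAL_REPO = "/home/user/nemo-skills"


def rewrite(cmd: str) -> str:
    # Map packaged-code paths back to the local checkout.
    return cmd.replace("/nemo_run/code", LOCAL_REPO)


def rewrite_inline(inline):
    if callable(inline):
        def wrapped():
            result = inline()
            # Preserve optional (cmd, metadata) tuple semantics.
            if isinstance(result, tuple):
                cmd, metadata = result
                return rewrite(cmd), metadata
            return rewrite(result)

        return wrapped
    return rewrite(inline)


plain = rewrite_inline("python /nemo_run/code/run.py")
lazy = rewrite_inline(lambda: ("python /nemo_run/code/run.py", {"ENV": "1"}))
```

The callable case stays lazy: the rewrite happens only when the wrapped closure is eventually invoked.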
586-778: `_plan_and_add_job` is the core of the refactor and looks mostly solid

Strengths:

- Assigns `script.het_group_index` before any evaluation so hostname-based cross-references see the right indices.
- Prepares all commands first (collecting `script` + `exec_config`) and only then constructs executors, letting you share environment vars across heterogeneous groups (`shared_env_vars`) and share the packager across components for single-group jobs.
- Always uses `group.name` for the underlying SLURM job name, while per-component `command.name` feeds log prefixes, simplifying the mental mapping.
- Keeps code-reuse logic (via `REUSE_CODE_EXP` and `get_packaging_job_key`) constrained to non-heterogeneous, non-`none` executors.

Two behavioural subtleties to watch:

- `het_group_indices` content: you append `het_idx` per executor, so `executors[0].het_group_indices` will contain one entry per script, not per group. If nemo_run's heterogeneous job implementation expects unique group indices only once each, this might differ from previous behaviour. It's likely benign, but worth verifying against nemo_run expectations.
- Environment merging for heterogeneous jobs: you merge `shared_env_vars` back into each `exec_config["environment"]`. That's good for consistency but means later commands can overwrite earlier env keys. If multiple groups intentionally set conflicting env vars, the last-wins semantics could be surprising; if that's not a concern in current use cases, you can leave as-is.

Overall, this function is well-structured and matches the new Script abstraction.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)

- nemo_skills/pipeline/generate.py (9 hunks)
- nemo_skills/pipeline/nemo_evaluator.py (6 hunks)
- nemo_skills/pipeline/utils/__init__.py (1 hunks)
- nemo_skills/pipeline/utils/declarative.py (11 hunks)
- nemo_skills/pipeline/utils/generation.py (5 hunks)
- nemo_skills/pipeline/utils/scripts.py (1 hunks)
- tests/test_declarative_pipeline.py (25 hunks)
- tests/test_generation.py (2 hunks)
- tests/test_nemo_evaluator_pipeline.py (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)

nemo_skills/pipeline/utils/__init__.py (1)
- nemo_skills/pipeline/utils/generation.py (2): `normalize_models_config` (30-59), `normalize_parameter` (62-102)

tests/test_generation.py (2)
- nemo_skills/pipeline/generate.py (2): `generate` (206-637), `_create_job_unified` (50-201)
- nemo_skills/pipeline/utils/scripts.py (1): `ServerScript` (120-214)

tests/test_nemo_evaluator_pipeline.py (3)
- nemo_skills/pipeline/nemo_evaluator.py (2): `nemo_evaluator` (113-421), `EvaluatorClientScript` (726-775)
- nemo_skills/pipeline/utils/declarative.py (2): `Command` (211-259), `CommandGroup` (273-286)
- nemo_skills/pipeline/utils/scripts.py (1): `ServerScript` (120-214)

nemo_skills/pipeline/utils/scripts.py (4)
- nemo_skills/pipeline/utils/commands.py (1): `sandbox_command` (77-111)
- nemo_skills/pipeline/utils/exp.py (1): `install_packages_wrap` (368-408)
- nemo_skills/pipeline/utils/generation.py (1): `get_generation_cmd` (360-495)
- nemo_skills/pipeline/utils/server.py (2): `get_free_port` (43-59), `get_server_command` (114-227)

tests/test_declarative_pipeline.py (1)
- nemo_skills/pipeline/utils/declarative.py (6): `Command` (211-259), `prepare_for_execution` (223-256), `get_name` (258-259), `CommandGroup` (273-286), `Pipeline` (289-820), `run` (356-493)

nemo_skills/pipeline/utils/declarative.py (6)
- nemo_skills/pipeline/utils/cluster.py (1): `get_env_variables` (163-276)
- nemo_skills/pipeline/utils/packager.py (1): `get_registered_external_repo` (64-76)
- nemo_skills/pipeline/utils/server.py (1): `wrap_python_path` (66-67)
- nemo_skills/utils.py (1): `get_logger_name` (39-43)
- tests/test_declarative_pipeline.py (1): `set_inline` (39-40)
- nemo_skills/pipeline/utils/scripts.py (2): `set_inline` (98-100), `wrapped_inline` (87-92)
🪛 Ruff (0.14.7)
tests/test_generation.py
174-174: Probable insecure usage of temporary file or directory: "/tmp/out"
(S108)
186-186: Probable insecure usage of temporary file or directory: "/tmp/logs"
(S108)
nemo_skills/pipeline/generate.py
94-94: Loop control variable model_path not used within loop body
Rename unused model_path to _model_path
(B007)
94-94: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
232-236: Do not perform function call typer.Option in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
237-241: Do not perform function call typer.Option in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
242-246: Do not perform function call typer.Option in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
247-251: Do not perform function call typer.Option in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
252-256: Do not perform function call typer.Option in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
257-261: Do not perform function call typer.Option in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
262-266: Do not perform function call typer.Option in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
415-417: Avoid specifying long messages outside the exception class
(TRY003)
nemo_skills/pipeline/utils/generation.py
50-50: Avoid specifying long messages outside the exception class
(TRY003)
58-58: Avoid specifying long messages outside the exception class
(TRY003)
99-102: Avoid specifying long messages outside the exception class
(TRY003)
tests/test_declarative_pipeline.py
587-587: Probable insecure usage of temporary file or directory: "/tmp/logs"
(S108)
590-590: Probable insecure usage of temporary file or directory: "/tmp/logs"
(S108)
676-676: Probable insecure usage of temporary file or directory: "/tmp/logs"
(S108)
nemo_skills/pipeline/utils/declarative.py
223-223: Unused method argument: cluster_config
(ARG002)
🔇 Additional comments (28)
nemo_skills/pipeline/utils/__init__.py (1)
47-55: Re-exporting normalization helpers looks consistent with the API surface

Exposing `normalize_models_config` and `normalize_parameter` via `pipeline.utils` keeps the public API coherent with how other generation utilities are surfaced. No issues from this change alone.

nemo_skills/pipeline/utils/generation.py (1)
20-102: Normalization helpers are correct and robust for CLI + Python usage

Both `normalize_models_config` and `normalize_parameter` implement the intended broadcast semantics cleanly and should behave well for:

- CLI (Typer) lists
- Python scalars or lists

The error messaging on mismatched lengths is also clear. No functional issues spotted.

tests/test_nemo_evaluator_pipeline.py (2)
20-28: Evaluator pipeline tests align with the Script-based design

The updated tests:

- Import and assert on `EvaluatorClientScript` / `ServerScript`
- Check client command naming (`evaluator-test-client-0`...), server `num_gpus`, `log_prefix`, and non-None ports
- Verify hardware allocation is driven by hosted server GPU counts

This gives good coverage of the new Nemo-evaluator orchestration without over-specifying internal details. Looks solid.

Also applies to: 95-204
206-327: Judge-server hosted & dual-server tests correctly exercise grouping and Script wiring

The `test_judge_server_hosted` and `test_both_servers_hosted_separate_groups` cases validate:

- Judge server uses `ServerScript` with `log_prefix == "judge-server"` and correct `num_gpus`
- Client always uses `EvaluatorClientScript` whose `inline` is callable (lazy build)
- Hardware per-group matches server GPU/node counts

This aligns well with the documented grouping strategy (main+client vs judge-only groups).
nemo_skills/pipeline/nemo_evaluator.py (4)
90-188: High-level `nemo_evaluator` flow is clear and consistent with the new Script model

The command docstring and option set cleanly describe the four hosting scenarios, and the function now:

- Builds a `_TaskCreationContext` per task
- Routes through `_build_main_server_if_needed`, `_build_judge_server_if_needed`, and `_build_client_command`
- Groups commands based on hosting strategy

No obvious logic errors in the main orchestration; the separation into helper functions improves readability.
573-647: `_build_client_command` and `EvaluatorClientScript` implement clean runtime URL resolution

`_build_client_command` now creates a `Command` powered by `EvaluatorClientScript`, passing in optional main/judge `ServerScript` references. `EvaluatorClientScript.__post_init__`:

- Computes main/judge URLs from hosted servers (`hostname_ref()` + port) or external base URLs
- Adds health-check waits via `get_server_wait_cmd`
- Delegates to `_build_task_cmd` for the actual Nemo-Evaluator command, injecting URL/model overrides
- Returns `(final_cmd, {"environment": env_vars})` for `Command.prepare_for_execution`

This matches the new Script-based pipeline semantics and should work for all four hosting modes.
546-570: Hardware allocation helper for evaluator groups matches hosting semantics

`_hardware_for_group` cleanly encapsulates SLURM-related fields, including `partition`, `num_gpus`, `num_nodes`, and `sbatch_kwargs` with QoS. Using it consistently for all evaluator groups keeps hardware decisions centralized.
430-497: `_create_serving_command_obj` correctly wraps servers in `ServerScript` and `Command`

The helper:

- Normalizes `server_type` (warns on unsupported types)
- Instantiates `ServerScript` with `num_gpus`, `num_nodes`, args, entrypoint, and port/`allocate_port`
- Applies a judge-specific `log_prefix` and chooses the container from `cluster_config["containers"][stype]` when not overridden
- Returns a `Command` with clear role-specific names

tests/test_declarative_pipeline.py (8)
17-52: `DummyScript` and `make_command` are well-designed test scaffolding

The `DummyScript` stand-in (with `inline`, `set_inline`, `log_prefix`, `het_group_index`, and `hostname_ref`) plus `make_command()` cleanly simulate real `run.Script` instances for unit tests. This keeps tests decoupled from the actual Script implementations while matching the new `Command(script=...)` API.

54-107: `TestCommand` suite thoroughly covers new `Command` semantics

The tests validate:

- Basic construction (`name`, default container, `script.inline`)
- `prepare_for_execution` with inline strings and callable inlines (with and without metadata/environment)
- `hostname_ref` behaviour for default and heterogeneous cases
- `get_name()` access

This gives good confidence that `Command.prepare_for_execution` and `DummyScript.hostname_ref` behave as expected.
112-140: CommandGroup tests reflect the new Script-centric API correctly

The `TestCommandGroup` tests ensure:

- Basic grouping and default `HardwareConfig`
- Custom hardware propagation
- `log_dir` handling

All use `make_command` and thus implicitly validate compatibility with the new `Command` signature.
145-237: Pipeline construction tests line up with validation changes and job specs

These tests cover:

- Single-job and multi-job pipelines
- `run_after` semantics as string or list
- Direct `cluster_config` usage (no name-based resolution)
- Required `jobs` and `name` fields

They match the updated `Pipeline._validate` logic and job schema and look correct.
242-344: Basic pipeline execution and HF_HOME validation remain correct with the Script-based flow

The tests around `Pipeline.run`:

- Confirm experiment creation and `exp.add` calls
- Validate HF_HOME presence and mount checks occurring in `__init__` rather than `run()`
- Ensure executors are constructed for SLURM paths

The use of `make_command` + `DummyScript` keeps them aligned with the new API while retaining the prior behavioural guarantees.
374-477: Het-group index tests validate new Script-level indexing semantics

`TestHetGroupIndices` asserts:

- Non-heterogeneous jobs leave `het_group_index` as `None` and hostname resolves to localhost
- Heterogeneous jobs assign indices per group (0, 1, …), and `hostname_ref()` embeds the correct SLURM env vars
- Indices are per-job, not global across the pipeline

These tests are an excellent fit for the new `_plan_and_add_job` het-indexing strategy.
479-724: Dependency resolution tests still accurately describe internal vs external handling

The updated dependency tests cover:

- Explicit `None` dependencies
- Pipeline-level `run_after` propagating to jobs without their own deps
- Multiple internal dependencies (job objects → handles)
- Separation of external experiment deps (via `get_exp_handles`) from internal handles

They align well with the logic in `Pipeline.run` and `_plan_and_add_job`.
829-897: Sandbox environment propagation test is a good end-to-end check

`test_generate_with_sandbox_passes_env_vars_correctly` asserts that:

- `temporary_env_update` is called with updates containing `NEMO_SKILLS_SANDBOX_PORT` on the client side
- Env-patching paths in `generate()` + declarative Pipeline behave as expected with Script-based commands

This is a valuable regression test for the sandbox refactor.
nemo_skills/pipeline/generate.py (4)
32-36: New Script imports are appropriate and localized

Importing `GenerationClientScript`, `SandboxScript`, and `ServerScript` here is consistent with this module's responsibility for building generation jobs. No issues.
382-412: Model and server parameter normalization logic is sound

The sequence:

- `models_list = normalize_models_config(model)`
- Convert `server_type` enums → strings
- Broadcast `server_type`, `server_gpus`, `server_nodes`, `server_args`, `server_entrypoint`, `server_container`, and `server_address` with `normalize_parameter`
- Enforce that multi-model usage requires `generation_type` or `generation_module`

gives a coherent multi-model configuration story. The broadcasting semantics are clear and should behave well for both CLI and Python code.
502-541: Per-model `configure_client` loop correctly separates single vs multi-model server overrides

The inner loop:

- Calls `configure_client` per model to build `server_configs` and resolved addresses.
- Uses `extra_arguments_original` only for the first model, then:
  - For single-model: captures `srv_extra_args` so that server config is expressed via `extra_arguments`.
  - For multi-model: ignores `srv_extra_args` and leaves per-model server configuration to `GenerationClientScript` + `get_generation_cmd`.

This avoids double-injecting per-model overrides and nicely preserves the single-model semantics.
603-637: Job spec construction for single vs multi-group is consistent with Pipeline expectations

Using:

- `"groups": job_groups` when `len(job_groups) > 1`
- `"group": job_groups[0]` otherwise

ensures `Pipeline.run` correctly chooses between single-group and heterogeneous multi-group jobs. Naming `internal_job_name` based on `task_name` and `dep_idx` is also clear and keeps dependency wiring straightforward.

nemo_skills/pipeline/utils/declarative.py (5)
15-42: Module-level refactor to Script-based execution is well-scoped

Switching this module to:

- Import `nemo_run as run`
- Pull in pipeline utilities (`get_env_variables`, `get_executor`, etc.)
- Use `get_registered_external_repo` + `wrap_python_path` for local execution

sets the stage for Script-centric orchestration without leaking responsibilities into callers. No issues at the import/architecture level.
262-270: `HardwareConfig` extension with `num_tasks` aligns with executor needs

Adding `num_tasks` to `HardwareConfig` and defaulting it to `1` lets the executor distinguish between node count and tasks per node. This plays nicely with `ServerScript.num_tasks` feeding into `_create_job_unified`.
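A minimal sketch of the described shape (field names follow the review; the defaults are assumptions, not copied from the repo):

```python
from dataclasses import dataclass


@dataclass
class HardwareConfigSketch:
    """Illustrative stand-in for the extended HardwareConfig."""

    num_nodes: int = 1
    num_gpus: int = 0
    num_tasks: int = 1  # tasks per node, distinct from the node count


# Node count and tasks-per-node can now vary independently.
hw = HardwareConfigSketch(num_nodes=2, num_gpus=8)
```

With `num_tasks` defaulting to `1`, existing single-task groups keep their behaviour while multi-task server groups can opt in explicitly.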
356-495: Pipeline validation and the high-level run loop remain sound after the Script refactor

The `Pipeline.__init__` and `run` methods still:

- Validate job specs and cluster_config upfront
- Enforce HF_HOME presence/mounting for non-`none` executors (unless explicitly skipped)
- Cleanly separate internal (handles) vs external (experiment names → SLURM job IDs) dependencies
- Distinguish single-group vs multi-group jobs and defer to `_add_*_job`

This matches prior behaviour while delegating planning details to `_plan_and_add_job`.
543-585: Executor creation now uses `num_tasks` and supports a group-wide job name

The updated `_create_executor`:

- Maps `HardwareConfig.num_nodes` / `num_tasks` / `num_gpus` to executor args.
- Uses `job_name_override` (currently the group name) as the SLURM job name, so all components in a group share a stable name without embedding role suffixes.
- Wraps environment updates via `temporary_env_update`, keeping env-setting logic centralized.

This is a good match for the new Script-based design.
780-820: Single-group vs multi-group helper methods correctly delegate to `_plan_and_add_job`

`_add_single_group_job` and `_add_multi_group_job` simply wrap `_plan_and_add_job` with `heterogeneous=False`/`True`, which keeps the public behaviour unchanged while centralizing the complex logic. No issues here.

nemo_skills/pipeline/utils/scripts.py (3)
62-117: LGTM - solid base class design for heterogeneous job support.

The `set_inline` pattern using `object.__setattr__` is an appropriate workaround for safely mutating frozen dataclass fields. The `hostname_ref()` method correctly handles both hetjob and non-hetjob scenarios with a sensible localhost fallback.
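The pattern can be sketched as follows (field and method names are illustrative; the nodelist expression assumes SLURM's standard `SLURM_JOB_NODELIST_HET_GROUP_<N>` variable for heterogeneous jobs):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScriptSketch:
    """Illustrative stand-in for the frozen base Script class."""

    inline: str = ""
    het_group_index: Optional[int] = None

    def set_inline(self, value: str) -> None:
        # Bypass the frozen-dataclass guard for this one controlled field.
        object.__setattr__(self, "inline", value)

    def hostname_ref(self) -> str:
        if self.het_group_index is None:
            # Non-hetjob: all components share a node.
            return "localhost"
        # Hetjob: resolve the group's first node from its own nodelist.
        return (
            "$(scontrol show hostnames "
            f"$SLURM_JOB_NODELIST_HET_GROUP_{self.het_group_index} | head -n1)"
        )


s = ScriptSketch()
s.set_inline("echo hi")  # succeeds despite frozen=True
```

Plain attribute assignment (`s.inline = ...`) would raise `FrozenInstanceError`, which is exactly why the `object.__setattr__` escape hatch is needed.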
217-272: LGTM - good documentation of responsibility boundaries.

The comment on line 265 clarifying that `keep_mounts` is handled at the executor level is helpful for maintainability. The same port validation concern from `ServerScript` applies here when `allocate_port=False` and `port=None`.
361-407: Well-designed lazy command building pattern.

The closure-based lazy evaluation for `build_cmd` is appropriate since `hostname_ref()` depends on `het_group_index` being set at runtime by the pipeline. The environment variable plumbing for sandbox port communication follows the pattern documented in `sandbox_command`.

One minor observation: the docstring example on lines 318-325 shows `servers=[server1, server2, None]` with parallel `server_addresses_prehosted=["", "", "https://api.openai.com"]`. The empty strings for self-hosted servers work but could be slightly misleading; consider documenting that these values are ignored when the corresponding `servers` entry is not `None`.
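The closure pattern can be reduced to a few lines (all names here are illustrative): `build_cmd` captures its inputs and only renders the command when called, so state assigned later, as with `het_group_index`, is visible at render time.

```python
def make_build_cmd(server):
    """Return a closure that renders the client command lazily."""

    def build_cmd():
        # Resolved at call time, after the pipeline finished wiring.
        return f"client --server {server['host']}:{server['port']}"

    return build_cmd


server = {"host": "localhost", "port": 5000}
cmd = make_build_cmd(server)

# A late change (standing in for runtime hostname/index assignment)
# is still picked up, because nothing was rendered yet.
server["port"] = 6000
```

Calling `cmd()` now produces the command with the updated port, which is the property the pipeline relies on for heterogeneous jobs.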
```python
# Allocate port if not provided
if self.port is None and self.allocate_port:
    self.port = get_free_port(strategy="random")
    LOG.debug(f"Allocated port {self.port} for {self.server_type} server")
```
🧩 Analysis chain
🏁 Scripts executed during analysis:

```sh
# Find the ServerScript class definition and see port field constraints
rg -n "class ServerScript" --type py

# Search for get_server_command to understand how it handles port
rg -n "def get_server_command" --type py -A 15

# Check the ServerScript class fields and any defaults
sed -n '119,180p' nemo_skills/pipeline/utils/scripts.py | cat -n

# View the complete __post_init__ method
sed -n '180,210p' nemo_skills/pipeline/utils/scripts.py | cat -n

# Check where get_server_command is called with the port parameter
rg -n "get_server_command" nemo_skills/pipeline/utils/scripts.py -A 5 -B 5

# Check for usages of ServerScript with allocate_port=False
rg -n "ServerScript\|allocate_port" --type py -B 2 -A 2

# Check if there are any type hints or validation that we missed
rg -n "server_port.*None" nemo_skills/pipeline/utils/server.py -B 5 -A 5

# Check the full signature and docstring of get_server_command to see if it handles None
sed -n '114,160p' nemo_skills/pipeline/utils/server.py | cat -n
```
Add validation to ensure port is set when allocate_port=False.
If allocate_port=False and port=None, the port remains None and is passed to get_server_command, which expects an int for the server_port parameter. This causes a runtime error.
Add validation after the port allocation block:
```diff
 # Allocate port if not provided
 if self.port is None and self.allocate_port:
     self.port = get_free_port(strategy="random")
     LOG.debug(f"Allocated port {self.port} for {self.server_type} server")
+elif self.port is None:
+    raise ValueError("port must be provided when allocate_port=False")
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# Allocate port if not provided
if self.port is None and self.allocate_port:
    self.port = get_free_port(strategy="random")
    LOG.debug(f"Allocated port {self.port} for {self.server_type} server")
elif self.port is None:
    raise ValueError("port must be provided when allocate_port=False")
```
🤖 Prompt for AI Agents
In nemo_skills/pipeline/utils/scripts.py around lines 180 to 183, after the
port-allocation block add validation to ensure self.port is set when
allocate_port is False: if self.port is still None raise a clear exception
(e.g., ValueError) or log and exit, so get_server_command always receives an
int; implement the check immediately after the allocation block and include a
descriptive error message referencing server_type.
```python
if self.servers is not None:
    server_addresses = []
    for server_idx, server_script in enumerate(self.servers):
        if server_script is not None:
            # Self-hosted: construct address from hostname and port refs
            addr = f"{server_script.hostname_ref()}:{server_script.port}"
        else:
            # Pre-hosted: use the address from server_addresses_prehosted
            addr = self.server_addresses_prehosted[server_idx]
        server_addresses.append(addr)
```
Potential IndexError or TypeError when accessing server_addresses_prehosted.
If self.servers contains None entries (indicating pre-hosted servers), the code assumes self.server_addresses_prehosted is a list with matching indices. However, if server_addresses_prehosted is None or shorter than expected, this will raise an exception at runtime.
Consider adding validation:
```diff
 # Build server addresses if servers are provided
 server_addresses = None
 if self.servers is not None:
+    # Validate parallel lists
+    if any(s is None for s in self.servers):
+        if self.server_addresses_prehosted is None:
+            raise ValueError(
+                "server_addresses_prehosted must be provided when servers contains None entries"
+            )
+        if len(self.server_addresses_prehosted) != len(self.servers):
+            raise ValueError(
+                "server_addresses_prehosted must have same length as servers"
+            )
     server_addresses = []
     for server_idx, server_script in enumerate(self.servers):
```

Alternatively, move validation to `__post_init__` outside the `build_cmd` closure to fail fast during construction.
```python
if self.servers is not None:
    # Validate parallel lists
    if any(s is None for s in self.servers):
        if self.server_addresses_prehosted is None:
            raise ValueError(
                "server_addresses_prehosted must be provided when servers contains None entries"
            )
        if len(self.server_addresses_prehosted) != len(self.servers):
            raise ValueError(
                "server_addresses_prehosted must have same length as servers"
            )
    server_addresses = []
    for server_idx, server_script in enumerate(self.servers):
        if server_script is not None:
            # Self-hosted: construct address from hostname and port refs
            addr = f"{server_script.hostname_ref()}:{server_script.port}"
        else:
            # Pre-hosted: use the address from server_addresses_prehosted
            addr = self.server_addresses_prehosted[server_idx]
        server_addresses.append(addr)
```
This reverts commit 9c5b68c. Signed-off-by: George Armstrong <georgea@nvidia.com>
This reverts commit 1c0722a. Fix multi-node pipeline creation; remove hostname ref change; make param span_group_nodes. Signed-off-by: George Armstrong <georgea@nvidia.com>
- Add nemo_skills/inference/hilbert.py: unified generation module that orchestrates prover (vLLM) + reasoner (Gemini) in a single job using multi-model support from PR #1052
- Add hilbert_unified stage to stages.py for pipeline orchestration
- Add tokens_to_generate param to hilbert_prover (default 5K for testing)
- Add unified-local-test.yaml config for testing the unified pipeline

Pipeline flow: hilbert_d0 → split_d0 → hilbert_d1 → split_d1 → assemble

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Stephen Ge <stepheng@nvidia.com>
Refactor Generate Declarative Pipeline to Scripts
Summary
- Introduce typed script objects (`ServerScript`, `SandboxScript`, `GenerationClientScript`) with explicit fields, lazy command generation, and cross-component references.
- Centralize hardware settings (`num_gpus`, `num_nodes`, `num_tasks`) inside `HardwareConfig` and script objects; remove redundant command-level overrides.

Key Changes

Declarative Core (nemo_skills/pipeline/utils/declarative.py)

- `Command` now wraps a `run.Script` instance; redundant `gpus`, `nodes`, and `installation_command` fields were removed.
- `num_tasks`, `num_gpus`, and `num_nodes` come from `HardwareConfig` when creating executors.

Generation Pipeline (nemo_skills/pipeline/generate.py)

- `_create_job_unified()` now instantiates script objects, wires their cross-references (client ↔ server/sandbox), and adds them to `CommandGroup`s.
- Hardware settings come from the script objects rather than `max()` scans of individual commands.

Summary by CodeRabbit

Release Notes

New Features

- Multi-model generation support in a single pipeline invocation.

API Changes

- Generation parameters (`model`, `server_address`, `server_type`, `server_gpus`, `server_nodes`, `server_args`, `server_entrypoint`, `server_container`) now support list inputs for per-model configuration.

✏️ Tip: You can customize this high-level summary in your review settings.