
generation.py to respect separate server type for the client #1135

Merged
Jorjeous merged 5 commits into main from vmendelev/2512_support_separate_server_type_for_client
Jan 9, 2026

Conversation

@vmendelev
Collaborator

@vmendelev vmendelev commented Dec 19, 2025

This is needed when we want to decouple the model used by the client from the server type.

Summary by CodeRabbit

  • Bug Fixes
    • Server configuration now respects user-specified settings. The system detects when users provide custom server type configurations and preserves those overrides instead of unconditionally applying defaults, for both hosted and remote deployment scenarios.


@coderabbitai
Contributor

coderabbitai bot commented Dec 19, 2025

📝 Walkthrough


The change adds detection of whether the user specified ++server.server_type in extra_arguments, then conditionally appends the server_type argument only when the user did not provide one. This applies to both hosted and remote model hosting scenarios.
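The conditional append described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the code from generation.py: the helper name `append_default_server_type` is hypothetical, and only the `++server.server_type=` substring check and the `extra_arguments` / `server_type` names come from the PR discussion.

```python
def append_default_server_type(extra_arguments: str, server_type: str) -> str:
    """Hypothetical helper: append the default only if the user has not set it."""
    # Plain substring check, as described in the walkthrough.
    user_specified = "++server.server_type=" in extra_arguments
    # Empty string when the user already chose a server type, so nothing is appended.
    server_type_arg = "" if user_specified else f" ++server.server_type={server_type}"
    return f"{extra_arguments}{server_type_arg}"


# The default is added when the user said nothing about the server type...
print(append_default_server_type("++inference.temperature=0.0", "vllm"))
# ...and the user's value is left untouched when they did.
print(append_default_server_type("++server.server_type=openai", "vllm"))
```

The argument `++inference.temperature=0.0` is only a placeholder for some unrelated user override.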

Changes

| Cohort / File(s) | Summary |
|---|---|
| Server Type Argument Handling: nemo_skills/pipeline/utils/generation.py | Adds conditional inclusion of ++server.server_type argument based on detection of user-provided values to prevent unintended argument overrides in both hosting and non-hosting scenarios. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

  • Verify the user input detection correctly identifies when ++server.server_type is already present in extra_arguments
  • Confirm the conditional logic functions correctly in both the hosting (server_gpus truthy) and non-hosting code paths
  • Ensure the change preserves user-specified overrides while providing default values when needed

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: making generation.py respect a separate server type for the client, which aligns with the PR's objective to allow the model used by a client to be untied from the server type. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7205c43 and 6d2e2ba.

📒 Files selected for processing (1)
  • nemo_skills/pipeline/utils/generation.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: pre-commit
  • GitHub Check: unit-tests
🔇 Additional comments (3)
nemo_skills/pipeline/utils/generation.py (3)

468-473: LGTM! Conditional server_type injection works correctly.

The implementation properly respects user-specified server_type while maintaining backward compatibility. The string construction handles spacing correctly in both cases (when server_type_arg is empty or populated).


476-480: LGTM! Consistent conditional logic for remote model hosting.

The implementation mirrors the hosted model path, ensuring consistent behavior across both scenarios. User-specified server_type is properly respected in the remote hosting case as well.


449-450: Good addition to support user overrides.

The detection logic correctly identifies when users specify their own server_type using standard Hydra syntax (++server.server_type=value). The simple string matching approach is appropriate and properly applied in both the hosted (lines 468-473) and remote (lines 476-480) code paths, allowing users to override the default server_type without conflicts.
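To make the scope of the "simple string matching approach" concrete, here is a small sketch of which spellings a plain substring test would and would not detect. The function name `user_set_server_type` is hypothetical; the matched string `++server.server_type=` is the one quoted in the review.

```python
def user_set_server_type(extra_arguments: str) -> bool:
    """Hypothetical helper mirroring the substring check described in the review."""
    return "++server.server_type=" in extra_arguments


# Detected: the exact ++key=value spelling, anywhere in the string.
print(user_set_server_type("++prompt_config=foo ++server.server_type=openai"))
# Not detected: the key is absent.
print(user_set_server_type("++prompt_config=foo"))
# Not detected: same key written with a single "+" or with no prefix at all.
print(user_set_server_type("+server.server_type=openai"))
print(user_set_server_type("server.server_type=openai"))
```

Under this sketch, only the double-plus form counts as a user override; other spellings of the same key would still receive the default in addition to the user's value.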



Signed-off-by: Valentin Mendelev <vmendelev@nvidia.com>
@vmendelev vmendelev force-pushed the vmendelev/2512_support_separate_server_type_for_client branch from 6d2e2ba to e81e06a Compare December 19, 2025 17:17
Collaborator

@gwarmstrong gwarmstrong left a comment


Do you think it could make more sense to swap the order of extra args and the configured overrides here? It might simplify this code and result in an overall more predictable experience.
e.g., change

f"{extra_arguments} ++server.server_type={server_type} ++server.host=127.0.0.1 "
f"++server.port={server_port} ++server.model={model} "

to

f"++server.server_type={server_type} ++server.host=127.0.0.1 "
f"++server.port={server_port} ++server.model={model} {extra_arguments} "

@vmendelev
Collaborator Author

Can do this. Do you know how arguments are prioritized when they have the same name? Does it fail, or does the last/first one take priority?

Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
@greptile-apps
Contributor

greptile-apps bot commented Dec 30, 2025

Greptile Summary

The configure_client() function now checks if users already specified ++server.server_type= in extra_arguments before applying defaults. This allows users to override the server type configuration independently from the model path parameter.

Key Changes:

  • Added check at function start: only sets ++server.server_type={server_type} if not already present in extra_arguments
  • Reordered argument concatenation to prepend server_type_arg before user's extra_arguments (ensures user's settings take precedence if duplicated)
  • For both hosted and remote server paths, removed the unconditional ++server.server_type={server_type} appending that was overriding user configurations
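The three changes listed above can be pictured with a rough sketch like the one below. It is an illustration under stated assumptions, not the actual configure_client(): the function name `build_client_arguments`, the `hosted` flag (standing in for the server_gpus check), and the exact parameter list are hypothetical, while the `++server.*` keys and the 127.0.0.1 host come from the discussion in this PR.

```python
from typing import Optional


def build_client_arguments(
    extra_arguments: str,
    server_type: str,
    model: str,
    hosted: bool,
    server_port: Optional[int] = None,
    server_address: Optional[str] = None,
) -> str:
    """Hypothetical helper: defaults first, user-supplied overrides last."""
    # Only inject the default when the user has not set the key themselves.
    if "++server.server_type=" in extra_arguments:
        server_type_arg = ""
    else:
        server_type_arg = f"++server.server_type={server_type} "
    # Prepend the (possibly empty) default so the user's arguments stay at the end.
    extra_arguments = f"{server_type_arg}{extra_arguments}"

    if hosted:
        # Hosted model: the client talks to a locally started server.
        return (
            f"++server.host=127.0.0.1 ++server.port={server_port} "
            f"++server.model={model} {extra_arguments}"
        )
    # Remote model: the client talks to an already running endpoint.
    return f"++server.base_url={server_address} ++server.model={model} {extra_arguments}"


print(build_client_arguments("++server.server_type=openai", "vllm", "my-model", hosted=True, server_port=5000))
```

In this sketch the default server type never appears when the user has supplied their own, so no duplicate key is emitted for that parameter.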

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk - it only modifies argument ordering logic without changing core functionality
  • The change correctly implements the stated goal of respecting user-specified server types. The logic is straightforward: check for existing ++server.server_type= in extra_arguments before setting defaults. The reordering of argument concatenation (prepending server_type_arg instead of appending) ensures user overrides take precedence. Minor confidence reduction due to lack of test coverage for this specific scenario.
  • No files require special attention

Important Files Changed

| Filename | Overview |
|---|---|
| nemo_skills/pipeline/utils/generation.py | Respects user-specified server.server_type in extra_arguments by checking before setting default, avoiding override of custom configurations |

Sequence Diagram

sequenceDiagram
    participant Caller as Caller (e.g., generate.py)
    participant ConfigClient as configure_client()
    participant ExtraArgs as extra_arguments

    Caller->>ConfigClient: Call with server_type, extra_arguments, etc.
    
    Note over ConfigClient: Check if ++server.server_type= exists in extra_arguments
    
    alt User specified server.server_type
        ConfigClient->>ExtraArgs: Prepend empty string (preserve user's override)
    else No user specification
        ConfigClient->>ExtraArgs: Prepend ++server.server_type={server_type}
    end
    
    alt server_gpus > 0 (hosted model)
        ConfigClient->>ExtraArgs: Append ++server.host, ++server.port, ++server.model
    else server_gpus == 0 (remote model)
        ConfigClient->>ExtraArgs: Append ++server.base_url, ++server.model
    end
    
    ConfigClient-->>Caller: Return (server_config, server_address, extra_arguments)

@greptile-apps
Contributor

greptile-apps bot commented Dec 30, 2025

Greptile found no issues!


Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
@Jorjeous Jorjeous enabled auto-merge (squash) January 9, 2026 13:21
Contributor

@greptile-apps greptile-apps bot left a comment


Greptile Overview

Greptile Summary

The change allows users to specify ++server.server_type= in extra_arguments to override the default server type, addressing the use case of decoupling the client's server type from the model path parameter.

Key changes:

  • Added a check to detect if user already specified ++server.server_type= in extra_arguments
  • Reordered argument concatenation: user's extra_arguments now appear last (enabling override via Hydra's "last wins" semantics)
  • Removed duplicate ++server.server_type={server_type} appends from both hosted and remote server branches

Behavioral change: The argument reordering also allows users to override ++server.host, ++server.port, and ++server.model parameters, which was not possible before. This may be unintentional but is unlikely to cause issues in practice since users rarely specify these in extra_arguments.
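The effect of the reordering can be illustrated with a toy resolver that keeps the last value seen for each key. This is only a simulation of the "last wins" behavior that the summary attributes to Hydra; it does not call Hydra, it does not handle quoted values or spaces, and `resolve_overrides` is a hypothetical name.

```python
from typing import Dict


def resolve_overrides(arguments: str) -> Dict[str, str]:
    """Toy resolver: later occurrences of a key replace earlier ones."""
    resolved: Dict[str, str] = {}
    for token in arguments.split():
        key, sep, value = token.partition("=")
        if not sep:
            continue  # skip anything that is not key=value
        resolved[key.lstrip("+")] = value
    return resolved


user_args = "++server.model=custom-model"
defaults = "++server.host=127.0.0.1 ++server.port=5000 ++server.model=default-model"

# Old order: user arguments first, so the default model comes last and wins.
print(resolve_overrides(f"{user_args} {defaults}")["server.model"])
# New order: user arguments last, so the user's model wins.
print(resolve_overrides(f"{defaults} {user_args}")["server.model"])
```

Under the assumed semantics, the same mechanism that lets a user override `server.server_type` also applies to `server.host`, `server.port`, and `server.model`.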

Confidence Score: 4/5

  • This PR is safe to merge with low risk - it only modifies argument ordering logic
  • The implementation correctly achieves the stated goal of allowing server_type overrides. The string matching approach is pragmatic and unlikely to cause false positives. The argument reordering is a behavioral change but unlikely to cause issues since users rarely override host/port/model via extra_arguments
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| nemo_skills/pipeline/utils/generation.py | 4/5 | Modified configure_client() to allow user override of server_type via extra_arguments; reordered argument concatenation to enable user overrides |

Sequence Diagram

sequenceDiagram
    participant User
    participant Pipeline
    participant configure_client
    participant Hydra

    User->>Pipeline: Call with --server_type=vllm --model=path/to/model
    Pipeline->>Pipeline: Parse arguments into extra_arguments
    
    alt User provides ++server.server_type in extra_arguments
        Pipeline->>configure_client: extra_arguments contains "++server.server_type="
        configure_client->>configure_client: Skip adding default server_type_arg
        configure_client->>configure_client: extra_arguments = "" + extra_arguments
    else User does not provide server_type override
        Pipeline->>configure_client: extra_arguments without server_type
        configure_client->>configure_client: Add server_type_arg with default value
        configure_client->>configure_client: extra_arguments = "++server.server_type={server_type} " + extra_arguments
    end
    
    alt server_gpus is set (hosted model)
        configure_client->>configure_client: Build: "++server.host=... ++server.port=... ++server.model=... {extra_arguments}"
        Note over configure_client: User's args at END (can override defaults)
    else Remote server
        configure_client->>configure_client: Build: "++server.base_url=... ++server.model=... {extra_arguments}"
        Note over configure_client: User's args at END (can override defaults)
    end
    
    configure_client->>Pipeline: Return (server_config, server_address, extra_arguments)
    Pipeline->>Hydra: Execute with final extra_arguments
    Note over Hydra: Last occurrence of each param wins

@Jorjeous Jorjeous merged commit bf943b3 into main Jan 9, 2026
5 checks passed
@Jorjeous Jorjeous deleted the vmendelev/2512_support_separate_server_type_for_client branch January 9, 2026 13:37
hsiehjackson pushed a commit that referenced this pull request Jan 13, 2026
Signed-off-by: Valentin Mendelev <vmendelev@nvidia.com>
Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com>
Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com>
arnavkomaragiri pushed a commit that referenced this pull request Jan 13, 2026
Signed-off-by: Valentin Mendelev <vmendelev@nvidia.com>
Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com>
Signed-off-by: Arnav Komaragiri <akomaragiri@nvidia.com>
dgtm777 pushed a commit that referenced this pull request Mar 18, 2026
Signed-off-by: Valentin Mendelev <vmendelev@nvidia.com>
Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com>
dgtm777 pushed a commit that referenced this pull request Mar 18, 2026
Signed-off-by: Valentin Mendelev <vmendelev@nvidia.com>
Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com>
Signed-off-by: dgitman <dgitman@nvidia.com>