generation.py to respect separate server type for the client#1135
generation.py to respect separate server type for the client#1135
Conversation
📝 WalkthroughWalkthroughThe change adds detection of whether the user specified Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
🔇 Additional comments (3)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Signed-off-by: Valentin Mendelev <vmendelev@nvidia.com>
6d2e2ba to
e81e06a
Compare
gwarmstrong
left a comment
There was a problem hiding this comment.
Do you think it could make more sense to swap the order of extra args and the configured overrides here? It might simplify this code and result in an overall more predictable experience.
e.g., change
f"{extra_arguments} ++server.server_type={server_type} ++server.host=127.0.0.1 "
f"++server.port={server_port} ++server.model={model} "
to
f"++server.server_type={server_type} ++server.host=127.0.0.1 "
f"++server.port={server_port} ++server.model={model} {extra_arguments} "
|
Can do this. Do you know how the arguments are prioritized when they are having same name? Is the thing fails or last/first takes priority? |
Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
Greptile SummaryThe Key Changes:
Confidence Score: 4/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Caller as Caller (e.g., generate.py)
participant ConfigClient as configure_client()
participant ExtraArgs as extra_arguments
Caller->>ConfigClient: Call with server_type, extra_arguments, etc.
Note over ConfigClient: Check if ++server.server_type= exists in extra_arguments
alt User specified server.server_type
ConfigClient->>ExtraArgs: Prepend empty string (preserve user's override)
else No user specification
ConfigClient->>ExtraArgs: Prepend ++server.server_type={server_type}
end
alt server_gpus > 0 (hosted model)
ConfigClient->>ExtraArgs: Append ++server.host, ++server.port, ++server.model
else server_gpus == 0 (remote model)
ConfigClient->>ExtraArgs: Append ++server.base_url, ++server.model
end
ConfigClient-->>Caller: Return (server_config, server_address, extra_arguments)
|
Greptile found no issues!From now on, if a review finishes and we haven't found any issues, we will not post anything, but you can confirm that we reviewed your changes in the status check section. This feature can be toggled off in your Code Review Settings by deselecting "Create a status check for each PR". |
Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com>
There was a problem hiding this comment.
Greptile Overview
Greptile Summary
The change allows users to specify ++server.server_type= in extra_arguments to override the default server type, addressing the use case of decoupling the client's server type from the model path parameter.
Key changes:
- Added a check to detect if user already specified
++server.server_type=inextra_arguments - Reordered argument concatenation: user's
extra_argumentsnow appear last (enabling override via Hydra's "last wins" semantics) - Removed duplicate
++server.server_type={server_type}appends from both hosted and remote server branches
Behavioral change: The argument reordering also allows users to override ++server.host, ++server.port, and ++server.model parameters, which was not possible before. This may be unintentional but is unlikely to cause issues in practice since users rarely specify these in extra_arguments.
Confidence Score: 4/5
- This PR is safe to merge with low risk - it only modifies argument ordering logic
- The implementation correctly achieves the stated goal of allowing server_type overrides. The string matching approach is pragmatic and unlikely to cause false positives. The argument reordering is a behavioral change but unlikely to cause issues since users rarely override host/port/model via extra_arguments
- No files require special attention
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| nemo_skills/pipeline/utils/generation.py | 4/5 | Modified configure_client() to allow user override of server_type via extra_arguments; reordered argument concatenation to enable user overrides |
Sequence Diagram
sequenceDiagram
participant User
participant Pipeline
participant configure_client
participant Hydra
User->>Pipeline: Call with --server_type=vllm --model=path/to/model
Pipeline->>Pipeline: Parse arguments into extra_arguments
alt User provides ++server.server_type in extra_arguments
Pipeline->>configure_client: extra_arguments contains "++server.server_type="
configure_client->>configure_client: Skip adding default server_type_arg
configure_client->>configure_client: extra_arguments = "" + extra_arguments
else User does not provide server_type override
Pipeline->>configure_client: extra_arguments without server_type
configure_client->>configure_client: Add server_type_arg with default value
configure_client->>configure_client: extra_arguments = "++server.server_type={server_type} " + extra_arguments
end
alt server_gpus is set (hosted model)
configure_client->>configure_client: Build: "++server.host=... ++server.port=... ++server.model=... {extra_arguments}"
Note over configure_client: User's args at END (can override defaults)
else Remote server
configure_client->>configure_client: Build: "++server.base_url=... ++server.model=... {extra_arguments}"
Note over configure_client: User's args at END (can override defaults)
end
configure_client->>Pipeline: Return (server_config, server_address, extra_arguments)
Pipeline->>Hydra: Execute with final extra_arguments
Note over Hydra: Last occurrence of each param wins
Signed-off-by: Valentin Mendelev <vmendelev@nvidia.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Cheng-Ping Hsieh <chsieh@nvidia.com>
Signed-off-by: Valentin Mendelev <vmendelev@nvidia.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Arnav Komaragiri <akomaragiri@nvidia.com>
Signed-off-by: Valentin Mendelev <vmendelev@nvidia.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com>
Signed-off-by: Valentin Mendelev <vmendelev@nvidia.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: dgitman <dgitman@nvidia.com>
This is needed when we want to untie model used by a client from the server type.
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.