Conversation

@atchernych atchernych commented Oct 10, 2025

Overview:

DEP-504

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features

    • Added Dynamo-powered plugins: a pre-request worker ID injector and a KV-aware scorer to enhance routing and scoring capabilities.
  • Chores

    • Introduced a multi-stage Docker build with a minimal, non-root runtime including required libraries and certificates.
    • Updated build scripts and messaging to align with GAIE terminology, with validation for required Docker artifacts.
    • Bumped version to v0.5.1.

Signed-off-by: Anna Tchernych <[email protected]>
@atchernych atchernych requested review from a team as code owners October 10, 2025 19:05
@github-actions github-actions bot added the feat label Oct 10, 2025
Signed-off-by: Anna Tchernych <[email protected]>

coderabbitai bot commented Oct 10, 2025

Walkthrough

Introduces a multi-stage Dockerfile for building/running the GAIE EPP binary, updates the GAIE build script to use GAIE paths and copy a Dockerfile artifact, and applies a patch adding Dynamo-based pre-request and KV scorer plugins plus a version bump to v0.5.1.

Changes

  • Container build artifact — container/Dockerfile.epp: Adds multi-stage Dockerfile: Go-based builder compiles EPP with ldflags; runtime on minimal Ubuntu installs deps, sets nonroot user, copies /epp, sets entrypoint.
  • GAIE build script adjustments — deploy/inference-gateway/build-epp-dynamo.sh: Replaces EPP_DIR with GAIE_DIR; updates paths/messages; prepares dirs; copies headers, libs, and Dockerfile.epp; validates copied Dockerfile; applies patches in GAIE_DIR; updates build messages.
  • Dynamo plugins + version bump (patch) — deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch: Adds public Dynamo pre-request plugin (inject worker ID) and KV-aware scorer plugin with factories and a TypedName method; updates BundleVersion to v0.5.1.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CLI as Build Script (GAIE)
  participant Git as Patch Set
  participant Docker as Docker Builder
  participant Img as GAIE EPP Image

  CLI->>CLI: Prepare GAIE dirs (lib/include)
  CLI->>CLI: Copy headers, libs, Dockerfile.epp
  CLI->>Git: Apply epp-v0.5.1-dyn2.patch
  alt Patch applied
    CLI-->>CLI: Proceed
  else Patch already applied / skipped
    CLI-->>CLI: Continue without error
  end
  CLI->>Docker: docker build -f container/Dockerfile.epp
  Docker->>Docker: Build stage (golang 1.24) compile epp
  Docker->>Docker: Runtime stage (ubuntu) add deps, user
  Docker-->>Img: Produce GAIE EPP image
sequenceDiagram
  autonumber
  participant Client
  participant Gateway as Inference Gateway
  participant PreReq as Pre-Request Plugin (dynamo_inject_workerid)
  participant Router as Router
  participant Scorer as KV Scorer (dynamo_kv_scorer)
  participant Backend

  Client->>Gateway: Request
  Gateway->>PreReq: Execute pre-request
  note right of PreReq: Inject worker ID from Dynamo
  PreReq-->>Gateway: Mutated request
  Gateway->>Router: Route candidates
  Router->>Scorer: Score candidates (KV-aware)
  note right of Scorer: Uses KV signals for scoring
  Scorer-->>Router: Ranked candidates
  Router->>Backend: Dispatch
  Backend-->>Client: Response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I thump my paws—new plugs in tow,
A Docker burrow built to go.
KV whispers guide the way,
Worker IDs join the fray.
Version hops to five-point-one,
GAIE’s gears now neatly run—
Carrots compiled, ship it, done! 🥕🚀

Pre-merge checks

❌ Failed checks (2 warnings)
  • Title Check — ⚠️ Warning: The provided title “feat: move the EPP build docker file” is too narrow and doesn’t capture the primary changes in this pull request, which include adding a multi-stage build Dockerfile, updating GAIE integration scripts, and introducing new Dynamo plugins. The wording suggests only a file move rather than a significant feature addition and refactoring; as a result it misrepresents the scope and intent of the changeset. Resolution: update the title to concisely reflect the main feature and scope of the changes, for example “feat: introduce multi-stage GAIE Dockerfile build and Dynamo-based EPP plugins”, so that it clearly summarizes the core updates.
  • Description Check — ⚠️ Warning: The pull request description contains only a placeholder overview and empty template sections, without any actual details about the changes made or guidance on where reviewers should start. It fails to describe the modifications to the Dockerfile, build scripts, patch files, and plugin code, leaving the reader without context. The related issues section also references a placeholder rather than a valid issue number. Resolution: complete the template sections by providing a meaningful overview of the work, summarizing the specific changes in the “Details” section, indicating key files or components in “Where should the reviewer start?”, and verifying that the “Related Issues” entry references the actual issue number.
✅ Passed checks (1 passed)
  • Docstring Coverage — ✅ Passed: No functions found in the changes; docstring coverage check skipped.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch (2)

27-38: Dockerfile name mismatch: Makefile expects Dockerfile.dynamo but repo/script provide Dockerfile.epp

Target uses “-f Dockerfile.dynamo” while the artifact is Dockerfile.epp. Builds will fail.

Options:

  • Rename/copy the file as Dockerfile.dynamo, or
  • Change the Makefile target to use container/Dockerfile.epp, e.g.:
- $(IMAGE_BUILD_CMD) -f Dockerfile.dynamo -t $(IMAGE_TAG) \
+ $(IMAGE_BUILD_CMD) -f container/Dockerfile.epp -t $(IMAGE_TAG) \

Keep build script and Makefile consistent.


803-806: Per-request state stored on Server struct causes data races and cross-request leakage

workerIDHint/tokenDataHint live on Server and are mutated per request. Process can handle multiple streams concurrently; this invites races and wrong association across requests.

Refactor to store per-request state:

  • Keep hints in a per-stream/request context struct and pass to handler phases, or
  • Use gRPC stream context to carry values between header/body phases, or
  • Echo via ext_proc dynamic metadata between phases.
    Avoid shared mutable fields on Server for request-local data.

Also applies to: 765-792

deploy/inference-gateway/build-epp-dynamo.sh (1)

86-96: Patch path likely incorrect

Repo path shows deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch but script uses epp-patches/v0.5.1-2/...

-PATCH_FILE="${DYNAMO_DIR}/deploy/inference-gateway/epp-patches/v0.5.1-2/epp-v0.5.1-dyn2.patch"
+PATCH_FILE="${DYNAMO_DIR}/deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch"
🧹 Nitpick comments (3)
deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch (2)

615-645: Base64 token_data decode silently ignores malformed inputs

Non-blocking, but consider emitting a debug log when decode/unmarshal fails to aid diagnosis.


172-174: Nit: flag name in error message

Message uses “model-server-metrics-scheme”; actual flag is “modelServerMetricsScheme”. Align for clarity.

- return fmt.Errorf("unexpected %q value for %q flag, it can only be set to 'http' or 'https'", *modelServerMetricsScheme, "model-server-metrics-scheme")
+ return fmt.Errorf("unexpected %q value for %q flag, it can only be set to 'http' or 'https'", *modelServerMetricsScheme, "modelServerMetricsScheme")
deploy/inference-gateway/build-epp-dynamo.sh (1)

17-17: Harden script error handling

Add -u and pipefail to catch unset vars and pipeline failures.

-set -e  # Exit on any error
+set -euo pipefail  # Exit on error, unset vars, and pipeline errors
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a90ada1 and 865c8ef.

📒 Files selected for processing (3)
  • container/Dockerfile.epp (1 hunks)
  • deploy/inference-gateway/build-epp-dynamo.sh (3 hunks)
  • deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: trtllm (amd64)
  • GitHub Check: sglang
  • GitHub Check: vllm (amd64)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (4)
container/Dockerfile.epp (1)

51-52: Verify ldflags package path for CommitSHA/BuildRef

The path points to pkg/epp/metrics. The PR changes show metrics code under pkg/epp/backend/metrics. Ensure the target variables actually exist to avoid “symbol not found” or unused ldflags.

If the variables live elsewhere, update the path accordingly.

deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch (2)

1306-1324: KV router pipeline global: confirm thread-safety of C API

The pipeline is a shared global used under RLock; multiple concurrent Score calls can hit dynamo_query_worker_selection_and_annotate. Verify the C pipeline is thread-safe; otherwise, serialize calls or create per-goroutine instances.

If not thread-safe, protect with a mutex around the C call or implement a worker pool.

Also applies to: 1476-1527


949-970: Prefix indexer Get: LGTM

Deep-copying the pod set under RLock avoids races. Good fix.

deploy/inference-gateway/build-epp-dynamo.sh (1)

37-48: Validate the produced static lib name

Script copies libdynamo_llm_capi.a after building “-p libdynamo_llm”. Ensure the crate produces libdynamo_llm_capi.a (staticlib). If not, build the correct crate/package.

If needed:

-cargo build --release -p libdynamo_llm
+cargo build --release -p libdynamo_llm_capi

Or adjust the copied filename to match Cargo output.

Also applies to: 58-59

Comment on lines +15 to +17
ENV GOOS=linux
ENV GOARCH=amd64
# be explicit; helps cgo when linking libstdc++

⚠️ Potential issue | 🔴 Critical

Hard-coded GOARCH breaks multi-arch builds with CGO

GOARCH=amd64 forces amd64 even when building for arm64 via buildx. With CGO, this will fail or produce wrong-arch binaries.

Apply one of these fixes:

  • Prefer: derive from BuildKit args.
+ARG TARGETOS
+ARG TARGETARCH
-ENV GOOS=linux
-ENV GOARCH=amd64
+ENV GOOS=${TARGETOS}
+ENV GOARCH=${TARGETARCH}
  • Or simplest: remove both GOOS/GOARCH lines and let buildx select per-platform images (builder is linux already).
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
ENV GOOS=linux
ENV GOARCH=amd64
# be explicit; helps cgo when linking libstdc++
ARG TARGETOS
ARG TARGETARCH
ENV GOOS=${TARGETOS}
ENV GOARCH=${TARGETARCH}
# be explicit; helps cgo when linking libstdc++
🤖 Prompt for AI Agents
In container/Dockerfile.epp around lines 15 to 17 the Dockerfile hard-sets ENV
GOOS=linux and ENV GOARCH=amd64 which forces amd64 output and breaks
multi-arch/CGO builds; either remove these ENV lines so buildx/BuildKit can pick
the platform automatically, or change them to derive from build-time args (use
ARG TARGETOS and ARG TARGETARCH, then set ENV GOOS=${TARGETOS} and ENV
GOARCH=${TARGETARCH}) so cross-builds produce correct-arch binaries.

Comment on lines +72 to +75
if [[ ! -f "${GAIE_DIR}/Dockerfile.epp" ]]; then
echo "Docker.epp file copy failed!"
exit 1
fi

⚠️ Potential issue | 🔴 Critical

Makefile/Dockerfile mismatch will break image build

  • Script copies Dockerfile.epp; Makefile target uses Dockerfile.dynamo.

Option A (script-side fix):

-cp "${DYNAMO_DIR}/container/Dockerfile.epp" "${GAIE_DIR}"
+cp "${DYNAMO_DIR}/container/Dockerfile.epp" "${GAIE_DIR}/Dockerfile.dynamo"
@@
-echo "Building the custom EPP image for GAIE..."
+echo "Building the custom EPP image for GAIE..."
 make dynamo-image-local-load

Option B: adjust Makefile to use container/Dockerfile.epp (see prior comment).
Ensure the existence check matches the chosen name.

Also applies to: 99-101

🤖 Prompt for AI Agents
In deploy/inference-gateway/build-epp-dynamo.sh around lines 72-75 and 99-101
the script checks for Dockerfile.epp while the Makefile target expects
Dockerfile.dynamo; make the names consistent by either (A) changing the script
to use Dockerfile.dynamo everywhere (update copy commands, existence checks and
error messages to Dockerfile.dynamo) or (B) update the Makefile to reference
container/Dockerfile.epp; ensure the chosen filename is used in all copy/exists
checks and error outputs so the image build cannot fail due to a naming
mismatch.

Comment on lines +100 to +103
+ modelServerMetricsPath = flag.String("modelServerMetricsPath", "/metrics", "Path to scrape metrics from pods")
+ modelServerMetricsScheme = flag.String("modelServerMetricsScheme", "http", "Scheme to scrape metrics from pods")
+ modelServerMetricsHttpsInsecureSkipVerify = flag.Bool("modelServerMetricsHttpsInsecureSkipVerify", true, "When using 'https' scheme for 'modelServerMetricsScheme', configure 'InsecureSkipVerify' (default to true)")


⚠️ Potential issue | 🟠 Major

Defaulting HTTPS metrics scraping to InsecureSkipVerify=true weakens security

Insecure by default is risky, even in-cluster. Default should be false; allow opting-in via flag/env.

- modelServerMetricsHttpsInsecureSkipVerify = flag.Bool("modelServerMetricsHttpsInsecureSkipVerify", true, "When using 'https' scheme for 'modelServerMetricsScheme', configure 'InsecureSkipVerify' (default to true)")
+ modelServerMetricsHttpsInsecureSkipVerify = flag.Bool("modelServerMetricsHttpsInsecureSkipVerify", false, "When using 'https' scheme for 'modelServerMetricsScheme', configure 'InsecureSkipVerify' (default false)")

Suggested change
+ modelServerMetricsPath = flag.String("modelServerMetricsPath", "/metrics", "Path to scrape metrics from pods")
+ modelServerMetricsScheme = flag.String("modelServerMetricsScheme", "http", "Scheme to scrape metrics from pods")
+ modelServerMetricsHttpsInsecureSkipVerify = flag.Bool("modelServerMetricsHttpsInsecureSkipVerify", true, "When using 'https' scheme for 'modelServerMetricsScheme', configure 'InsecureSkipVerify' (default to true)")
modelServerMetricsPath = flag.String("modelServerMetricsPath", "/metrics", "Path to scrape metrics from pods")
modelServerMetricsScheme = flag.String("modelServerMetricsScheme", "http", "Scheme to scrape metrics from pods")
modelServerMetricsHttpsInsecureSkipVerify = flag.Bool("modelServerMetricsHttpsInsecureSkipVerify", false, "When using 'https' scheme for 'modelServerMetricsScheme', configure 'InsecureSkipVerify' (default false)")
🤖 Prompt for AI Agents
In deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch
around lines 100 to 103, the flag modelServerMetricsHttpsInsecureSkipVerify is
defaulting to true which weakens security; change its default to false and keep
the flag so operators can opt-in explicitly, and update the flag description to
state that it defaults to false and should only be used when necessary, and
ensure any documentation or env-var mapping reflects the new default.

Comment on lines +134 to +146
+ var metricsHttpClient *http.Client
+ if *modelServerMetricsScheme == "https" {
+ metricsHttpClient = &http.Client{
+ Transport: &http.Transport{
+ TLSClientConfig: &tls.Config{
+ InsecureSkipVerify: *modelServerMetricsHttpsInsecureSkipVerify,
+ },
+ },
+ }
+ } else {
+ metricsHttpClient = http.DefaultClient
+ }
+

⚠️ Potential issue | 🟠 Major

No HTTP client timeouts for metrics scraping

Using a client without timeouts risks hangs under network issues.

Add a sane Timeout (and optionally dial/read timeouts):

+    "time"
@@
-        metricsHttpClient = &http.Client{
+        metricsHttpClient = &http.Client{
+            Timeout: 5 * time.Second,
             Transport: &http.Transport{
                 TLSClientConfig: &tls.Config{
                     InsecureSkipVerify: *modelServerMetricsHttpsInsecureSkipVerify,
                 },
             },
         }

Suggested change
+ var metricsHttpClient *http.Client
+ if *modelServerMetricsScheme == "https" {
+ metricsHttpClient = &http.Client{
+ Transport: &http.Transport{
+ TLSClientConfig: &tls.Config{
+ InsecureSkipVerify: *modelServerMetricsHttpsInsecureSkipVerify,
+ },
+ },
+ }
+ } else {
+ metricsHttpClient = http.DefaultClient
+ }
+
// … elsewhere in the file, in your import block:
import (
// …
"time"
)
// …
var metricsHttpClient *http.Client
if *modelServerMetricsScheme == "https" {
metricsHttpClient = &http.Client{
Timeout: 5 * time.Second,
Transport: &http.Transport{
TLSClientConfig: &tls.Config{
InsecureSkipVerify: *modelServerMetricsHttpsInsecureSkipVerify,
},
},
}
} else {
metricsHttpClient = http.DefaultClient
}
🤖 Prompt for AI Agents
In deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch
around lines 134 to 146, the metricsHttpClient is created without any timeouts
which can cause the process to hang on network issues; update the client
creation to set a sensible Timeout on the http.Client (e.g., a few seconds) and
configure Transport timeouts by supplying a net.Dialer with a Dial/Context
timeout and setting TLSHandshakeTimeout, ResponseHeaderTimeout, and
IdleConn/ExpectContinue timeouts on http.Transport so scraping requests fail
fast and resources are reclaimed promptly.

Comment on lines +367 to +383
+ - --v
+ - "{{ .Values.inferenceExtension.logVerbosity | default "3" }}"
+ - --grpcPort
- "9002"
- -grpcHealthPort
- "9003"
- -metricsPort
- "9090"
+ - -configFile
+ - "config/{{ .Values.inferenceExtension.pluginsConfigFile }}"
# https://pkg.go.dev/flag#hdr-Command_line_flag_syntax; space is only for non-bool flags
- - "-enablePprof={{ .Values.inferenceExtension.enablePprof }}"
+ - "--enablePprof={{ .Values.inferenceExtension.enablePprof }}"
+ - "--modelServerMetricsPath={{ .Values.inferenceExtension.modelServerMetricsPath }}"
+ - "--modelServerMetricsScheme={{ .Values.inferenceExtension.modelServerMetricsScheme }}"
+ - "--modelServerMetricsHttpsInsecureSkipVerify={{ .Values.inferenceExtension.modelServerMetricsHttpsInsecureSkipVerify }}"
{{- if eq (.Values.inferencePool.modelServerType | default "vllm") "triton-tensorrt-llm" }}

⚠️ Potential issue | 🔴 Critical

Go stdlib flag package doesn’t accept “--long” style; use single dash

Args use “--v”, “--enablePprof”, etc. The code uses the standard flag package, which expects single dash (“-v”, “-enablePprof=...”).

Update chart args back to single dash:

-        - --v
+        - -v
-        - "--enablePprof={{ .Values.inferenceExtension.enablePprof }}"
+        - "-enablePprof={{ .Values.inferenceExtension.enablePprof }}"
-        - "--modelServerMetricsPath={{ .Values.inferenceExtension.modelServerMetricsPath }}"
+        - "-modelServerMetricsPath={{ .Values.inferenceExtension.modelServerMetricsPath }}"
-        - "--modelServerMetricsScheme={{ .Values.inferenceExtension.modelServerMetricsScheme }}"
+        - "-modelServerMetricsScheme={{ .Values.inferenceExtension.modelServerMetricsScheme }}"
-        - "--modelServerMetricsHttpsInsecureSkipVerify={{ .Values.inferenceExtension.modelServerMetricsHttpsInsecureSkipVerify }}"
+        - "-modelServerMetricsHttpsInsecureSkipVerify={{ .Values.inferenceExtension.modelServerMetricsHttpsInsecureSkipVerify }}"
🤖 Prompt for AI Agents
deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch around
lines 367–383: the Helm chart args use GNU-style double-dash flags which the Go
stdlib flag package does not accept; change all double-dash flags back to
single-dash form (e.g. "--v" -> "-v", "--enablePprof=..." -> "-enablePprof=...",
"--configFile" -> "-configFile", and the three modelServerMetrics flags ->
"-modelServerMetricsPath=...", "-modelServerMetricsScheme=...",
"-modelServerMetricsHttpsInsecureSkipVerify=...") and ensure non-bool flags
supply their values using the "=" form (no space) so the Go flag parser reads
them correctly.

Comment on lines +821 to +826
+ ModelServerMetricsPort int32
+ ModelServerMetricsPath string
+ ModelServerMetricsScheme string
+
+ Client *http.Client
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Nil client risk in PodMetricsClientImpl and missing default scheme

If Client is nil, p.Client.Do will panic. Also, empty scheme yields malformed URLs.

 type PodMetricsClientImpl struct {
@@
-    Client *http.Client
+    Client *http.Client
 }
@@
- resp, err := p.Client.Do(req)
+ client := p.Client
+ if client == nil {
+     client = http.DefaultClient
+ }
+ resp, err := client.Do(req)
@@
- return fmt.Sprintf("%s://%s:%d%s", p.ModelServerMetricsScheme, pod.Address, p.ModelServerMetricsPort, p.ModelServerMetricsPath)
+ scheme := p.ModelServerMetricsScheme
+ if scheme == "" {
+     scheme = "http"
+ }
+ return fmt.Sprintf("%s://%s:%d%s", scheme, pod.Address, p.ModelServerMetricsPort, p.ModelServerMetricsPath)

Search other initializations to ensure Client is always set.

Also applies to: 834-835


🏁 Script executed:

#!/bin/bash
rg -nP 'PodMetricsClientImpl\s*{[^}]*}' -C2

Length of output: 634


Enforce non-nil HTTP client and default scheme in PodMetricsClientImpl

  • Use http.DefaultClient when p.Client is nil to avoid panics in PodMetricsClientImpl.Do
  • Default empty ModelServerMetricsScheme to "http" to prevent malformed URLs
@@ PodMetricsClientImpl.Do(req)
-   resp, err := p.Client.Do(req)
+   client := p.Client
+   if client == nil {
+       client = http.DefaultClient
+   }
+   resp, err := client.Do(req)
@@ URL formatting in PodMetricsClientImpl
-   return fmt.Sprintf("%s://%s:%d%s", p.ModelServerMetricsScheme, pod.Address, p.ModelServerMetricsPort, p.ModelServerMetricsPath)
+   scheme := p.ModelServerMetricsScheme
+   if scheme == "" {
+       scheme = "http"
+   }
+   return fmt.Sprintf("%s://%s:%d%s", scheme, pod.Address, p.ModelServerMetricsPort, p.ModelServerMetricsPath)

Suggested change
+ ModelServerMetricsPort int32
+ ModelServerMetricsPath string
+ ModelServerMetricsScheme string
+
+ Client *http.Client
}
// In PodMetricsClientImpl.Do:
func (p *PodMetricsClientImpl) Do(req *http.Request) (*http.Response, error) {
// enforce non-nil HTTP client
client := p.Client
if client == nil {
client = http.DefaultClient
}
resp, err := client.Do(req)
return resp, err
}
// Wherever you format the metrics URL (e.g. in PodMetricsClientImpl):
func (p *PodMetricsClientImpl) buildMetricsURL(pod Pod) string {
// default empty scheme to "http"
scheme := p.ModelServerMetricsScheme
if scheme == "" {
scheme = "http"
}
return fmt.Sprintf(
"%s://%s:%d%s",
scheme,
pod.Address,
p.ModelServerMetricsPort,
p.ModelServerMetricsPath,
)
}

Comment on lines +1000 to +1013
return &MaxScorePicker{
typedName: plugins.TypedName{Type: MaxScorePickerType, Name: MaxScorePickerType},
maxNumOfEndpoints: maxNumOfEndpoints,
+ randomGenerator: rand.New(rand.NewSource(time.Now().UnixNano())),
}
}

// MaxScorePicker picks pod(s) with the maximum score from the list of candidates.
type MaxScorePicker struct {
typedName plugins.TypedName
- maxNumOfEndpoints int // maximum number of endpoints to pick
+ maxNumOfEndpoints int // maximum number of endpoints to pick
+ randomGenerator *rand.Rand // randomGenerator for randomly pick endpoint on tie-break
}

⚠️ Potential issue | 🟠 Major

rand.Rand is not concurrency-safe; Pickers may be called in parallel

Replacing rand.Shuffle (package-level, locked) with p.randomGenerator.Shuffle introduces data races if Pick is invoked concurrently.

Use the package-level RNG or guard with a mutex:

- p.randomGenerator.Shuffle(len(scoredPods), func(i, j int) {
+ rand.Shuffle(len(scoredPods), func(i, j int) {
     scoredPods[i], scoredPods[j] = scoredPods[j], scoredPods[i]
 })

Do similarly in RandomPicker.

Also applies to: 1124-1129

