@JeffLuoo
Contributor

@JeffLuoo JeffLuoo commented Oct 29, 2025

The motivation is llm-d/llm-d-inference-scheduler#386: we would like to expose more metrics from the llm-d inference scheduler plugin.

The current metrics implementation is bundled into the EPP runner, so this PR extends the runner to allow extensions to register Prometheus metrics collectors.

See llm-d/llm-d-inference-scheduler#405 for a draft PR of how metrics are going to be implemented in the plugin.

I have tested locally that both EPP and inference scheduler metrics can be exported through the pod /metrics endpoint.

What type of PR is this?

What this PR does / why we need it:

llm-d/llm-d-inference-scheduler#386

Which issue(s) this PR fixes:

Fixes #

Does this PR introduce a user-facing change?:

Inference Gateway now allows extensions to register Prometheus metrics collectors.

@netlify

netlify bot commented Oct 29, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 35ac767
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/6902749e8b61f60008129ff3
😎 Deploy Preview https://deploy-preview-1787--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 29, 2025
@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Oct 29, 2025
@JeffLuoo
Contributor Author

cc: @nirrozenbaum @elevran

@kfswain
Collaborator

kfswain commented Oct 29, 2025

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 29, 2025
@nirrozenbaum
Contributor

/lgtm
/approve

Thanks Jeff!

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JeffLuoo, kfswain, nirrozenbaum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [kfswain,nirrozenbaum]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit c12dd26 into kubernetes-sigs:main Oct 29, 2025
11 checks passed
@JeffLuoo
Contributor Author

I would like this to make it into the llm-d 0.4 release. @kfswain Hi Kellen, are you going to cut a branch for llm-d 0.4, or will it use 1.1.0? If 1.1.0 will be used, I would like this PR to be cherry-picked. Thanks!

@JeffLuoo JeffLuoo changed the title from "[metrics]: Allow EPP to register metrics from extention" to "[metrics]: Allow EPP to register metrics from extension" Oct 30, 2025
@kfswain kfswain mentioned this pull request Nov 5, 2025
1 task
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.

4 participants