From e67749afdc3249d9bf023301dd3e1ce5ab277597 Mon Sep 17 00:00:00 2001
From: Florian Woerner <florian.woerner@onmyown.io>
Date: Tue, 12 May 2026 10:41:32 +0200
Subject: [PATCH 1/2] Fix typo in llm-d documentation link

Signed-off-by: Florian Woerner <florian.woerner@onmyown.io>
---
 docs/deployment/integrations/llm-d.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/deployment/integrations/llm-d.md b/docs/deployment/integrations/llm-d.md
index cccf1773c6be..276fd2435cb8 100644
--- a/docs/deployment/integrations/llm-d.md
+++ b/docs/deployment/integrations/llm-d.md
@@ -2,4 +2,4 @@
 
 vLLM can be deployed with [llm-d](https://github.com/llm-d/llm-d), a Kubernetes-native distributed inference serving stack providing well-lit paths for anyone to serve large generative AI models at scale. It helps achieve the fastest "time to state-of-the-art (SOTA) performance" for key OSS models across most hardware accelerators and infrastructure providers.
 
-You can use vLLM with llm-d directly by following [this guide](https://llm-d.ai/docs/guide) or via [KServe's LLMInferenceService](https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-overview).
+You can use vLLM with llm-d directly by following [this guides](https://llm-d.ai/docs/guides) or via [KServe's LLMInferenceService](https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-overview).

From d56695bdfdc1444a72b85c4887bc98ec2b197306 Mon Sep 17 00:00:00 2001
From: Florian Woerner <florian.woerner@onmyown.io>
Date: Tue, 12 May 2026 11:31:47 +0200
Subject: [PATCH 2/2] Fix grammar in llm-d guides link text

Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Signed-off-by: Florian Woerner <florian.woerner@onmyown.io>
---
 docs/deployment/integrations/llm-d.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/deployment/integrations/llm-d.md b/docs/deployment/integrations/llm-d.md
index 276fd2435cb8..6060b98f6421 100644
--- a/docs/deployment/integrations/llm-d.md
+++ b/docs/deployment/integrations/llm-d.md
@@ -2,4 +2,4 @@
 
 vLLM can be deployed with [llm-d](https://github.com/llm-d/llm-d), a Kubernetes-native distributed inference serving stack providing well-lit paths for anyone to serve large generative AI models at scale. It helps achieve the fastest "time to state-of-the-art (SOTA) performance" for key OSS models across most hardware accelerators and infrastructure providers.
 
-You can use vLLM with llm-d directly by following [this guides](https://llm-d.ai/docs/guides) or via [KServe's LLMInferenceService](https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-overview).
+You can use vLLM with llm-d directly by following [the official guides](https://llm-d.ai/docs/guides) or via [KServe's LLMInferenceService](https://kserve.github.io/website/docs/model-serving/generative-inference/llmisvc/llmisvc-overview).