
Mirror: feat(docs): add dynamic sitemap.xml generation (#5728) #8

Merged
jeremylongshore merged 8 commits into main from review/PR-5728 on Feb 15, 2026

Conversation

@jeremylongshore
Owner

Mirror of Kilo-Org#5728

| Field | Value |
| --- | --- |
| Upstream | PR #5728 |
| Author | @kiloconnect (bot) / @olearycrew |
| Category | docs (feature) |
| Tier | 1 |
| Size | 279+/0- lines, 3 files |

This PR mirrors the upstream change for multi-AI review analysis.

Changes

Adds dynamic sitemap.xml generation for the docs site via a Next.js API endpoint. Scans markdown files, generates XML with lastmod/priority/changefreq. Includes 162-line test suite and Next.js rewrite config.

Bot Review Checklist

  • CodeRabbit reviewed
  • Gemini Code Assist reviewed
  • Greptile reviewed
  • CodeQL scanned
  • Qodo PR-Agent reviewed

jeremylongshore and others added 8 commits February 14, 2026 10:51
- Add API endpoint at /api/sitemap.xml that dynamically generates sitemap
- Add rewrite rule to serve sitemap at /sitemap.xml
- Include all markdown pages with lastmod dates and priority
- Add comprehensive tests for sitemap generation
Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com>
Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com>
@coderabbitai

coderabbitai Bot commented Feb 14, 2026

Warning

Rate limit exceeded

@jeremylongshore has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 19 minutes and 11 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.


@gemini-code-assist

Summary of Changes

Hello @jeremylongshore, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to the documentation site by enabling dynamic sitemap generation. This change automates the process of keeping the sitemap current with the site's content, which is crucial for search engine optimization and discoverability. The new functionality is robust, includes thorough testing, and integrates seamlessly with the existing Next.js framework.

Highlights

  • Dynamic Sitemap Generation: Implemented a Next.js API endpoint to dynamically generate sitemap.xml for the documentation site, improving SEO by providing up-to-date site structure to search engines.
  • Markdown File Scanning: The sitemap generator scans markdown files within the pages directory, automatically including them in the sitemap with calculated lastmod dates, changefreq, and priority based on their path depth.
  • Next.js Configuration Update: A rewrite rule was added to next.config.js to route requests for /sitemap.xml directly to the new API endpoint.
  • Comprehensive Test Suite: A new test file (sitemap.spec.ts) was added, featuring 162 lines of tests to ensure the sitemap API correctly generates valid XML, handles different HTTP methods, includes all relevant pages, and manages errors gracefully.

Changelog

  • apps/kilocode-docs/__tests__/sitemap.spec.ts
    • Added a comprehensive test suite for the sitemap generation API endpoint.
  • apps/kilocode-docs/next.config.js
    • Added a rewrite rule to direct /sitemap.xml requests to the new API endpoint.
  • apps/kilocode-docs/pages/api/sitemap.xml.ts
    • Implemented the Next.js API endpoint responsible for dynamically generating the sitemap.xml file.

Activity

  • This pull request is a mirror of an upstream change from Kilo-Org/kilocode#5728.
  • The PR description indicates a bot review checklist is in progress, including reviews by CodeRabbit, Gemini Code Assist, Greptile, CodeQL, and Qodo PR-Agent.

@github-actions

Failed to generate code suggestions for PR


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces dynamic sitemap generation, which is a great addition for SEO. The implementation is solid, with a new API endpoint to generate the sitemap from markdown files and a comprehensive test suite to ensure its correctness.

My review includes a few suggestions to improve maintainability and completeness:

  • Using an environment variable for the site URL.
  • Ensuring the homepage entry in the sitemap includes its last modification date.
  • A minor code simplification in the file scanning logic.
  • Enhancing the tests to cover the homepage's modification date.

Overall, this is a well-executed feature. Addressing these points will make it even more robust.

Comment on lines +117 to +122
it("should include homepage with priority 1.0", async () => {
    await handler(mockReq as NextApiRequest, mockRes as NextApiResponse)

    expect(responseData).toContain("<loc>https://kilo.ai/docs</loc>")
    expect(responseData).toContain("<priority>1.0</priority>")
})

medium

This test correctly verifies the homepage's loc and priority, but it doesn't check for the presence of the <lastmod> tag. To ensure the sitemap is complete and effective for SEO, the test should also assert that the homepage entry includes a lastmod date.

Suggested change
- it("should include homepage with priority 1.0", async () => {
-     await handler(mockReq as NextApiRequest, mockRes as NextApiResponse)
-     expect(responseData).toContain("<loc>https://kilo.ai/docs</loc>")
-     expect(responseData).toContain("<priority>1.0</priority>")
- })
+ it("should include homepage with priority 1.0 and lastmod date", async () => {
+     await handler(mockReq as NextApiRequest, mockRes as NextApiResponse)
+     expect(responseData).toContain("<loc>https://kilo.ai/docs</loc>")
+     expect(responseData).toContain("<priority>1.0</priority>")
+     expect(responseData).toContain("<lastmod>2025-01-15</lastmod>")
+ })

import fs from "fs"
import path from "path"

const SITE_URL = "https://kilo.ai/docs"

medium

The SITE_URL is hardcoded. This can make it difficult to run the sitemap generator in different environments (e.g., staging, development) with different base URLs. It's a best practice to source this value from an environment variable.

Suggested change
- const SITE_URL = "https://kilo.ai/docs"
+ const SITE_URL = process.env.NEXT_PUBLIC_SITE_URL || "https://kilo.ai/docs"

/**
* Recursively finds all markdown files in a directory
*/
function findMarkdownFiles(dir: string, baseDir: string = dir): string[] {

medium

The baseDir parameter in the findMarkdownFiles function is initialized and passed in recursive calls, but its value is never actually used. It can be removed to simplify the function signature. You'll also need to update the recursive call on line 20 to files.push(...findMarkdownFiles(fullPath)).

Suggested change
- function findMarkdownFiles(dir: string, baseDir: string = dir): string[] {
+ function findMarkdownFiles(dir: string): string[] {

Comment on lines +77 to +81
urls.push(` <url>
<loc>${SITE_URL}</loc>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>`)

medium

The homepage entry in the sitemap is missing the <lastmod> tag, which is valuable for search engines. Also, for consistency, escapeXml should be used on the <loc> value, just as it is for other URLs.

You can get the modification date from the pages/index.tsx file. The suggestion below does this directly. For more robustness, you might consider adding a check with fs.existsSync before calling getLastModified to avoid potential errors if the file doesn't exist.

Suggested change
- urls.push(`  <url>
-     <loc>${SITE_URL}</loc>
-     <changefreq>weekly</changefreq>
-     <priority>1.0</priority>
-   </url>`)
+ urls.push(`  <url>
+     <loc>${escapeXml(SITE_URL)}</loc>
+     <lastmod>${getLastModified(path.join(pagesDir, "index.tsx"))}</lastmod>
+     <changefreq>weekly</changefreq>
+     <priority>1.0</priority>
+   </url>`)

@jeremylongshore
Owner Author

Review: kilocode Kilo-Org#5728

feat(docs): add dynamic sitemap.xml generation by @kiloconnect / @olearycrew
Multi-AI analysis: Fork PR #8 — reviewed by CodeRabbit, Gemini, CodeQL, Qodo

Checklist

| Check | Result | Notes |
| --- | --- | --- |
| Correctness | ISSUE | Duplicate homepage entry; homepage missing lastmod |
| Conventions | PASS | Follows existing pattern (llms.txt rewrite in same file) |
| Changeset | SKIP | Docs-site infrastructure, no extension version bump needed |
| Tests | PASS | 162 lines, good coverage with mocked fs |
| i18n | N/A | No UI strings |
| Types | PASS | TypeScript, proper Next.js types used |
| Security | PASS | XML escaping implemented; no user input in file paths |
| Scope | PASS | 3 files, single concern (sitemap generation) |

Findings

🟡 Duplicate homepage entry

File: apps/kilocode-docs/pages/api/sitemap.xml.ts (lines 78-84, 86-99)

The homepage is added as a hardcoded entry (lines 78-84), but the root index.md file will also be discovered by findMarkdownFiles() and mapped to the root URL path, which resolves to https://kilo.ai/docs as well. This creates two <url> entries for the same location.

Fix: Either skip root index.md in the file scanner, or remove the hardcoded homepage entry and handle index.md with priority 1.0 as a special case.
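
A minimal sketch of the second option, assuming file paths are relative to the pages directory and that index.md maps to its parent directory. The helper names toUrlPath and buildLocs are hypothetical and do not appear in the PR; only the SITE_URL value is taken from the code quoted in this review.

```typescript
// Base URL copied from the SITE_URL constant quoted in this review.
const SITE_URL = "https://kilo.ai/docs"

/**
 * Hypothetical helper: converts a markdown path (relative to the pages
 * directory) into a URL path. A trailing "index" segment maps to its directory.
 */
function toUrlPath(relativeFile: string): string {
    const segments = relativeFile.replace(/\.md$/, "").split(/[\\/]/)
    if (segments[segments.length - 1] === "index") segments.pop()
    return segments.length === 0 ? "" : "/" + segments.join("/")
}

/** Hypothetical helper: builds the <loc> values, emitting each URL exactly once. */
function buildLocs(relativeFiles: string[]): string[] {
    // The homepage is registered once up front; a Set drops any later duplicate.
    const locs = new Set<string>([SITE_URL])
    for (const file of relativeFiles) {
        // The root index.md yields SITE_URL + "" and collapses into the homepage.
        locs.add(SITE_URL + toUrlPath(file))
    }
    return Array.from(locs)
}
```

Deduplicating on the final URL rather than on the file name also covers the case where two different files happen to map to the same location.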

🟡 Homepage missing <lastmod>

File: apps/kilocode-docs/pages/api/sitemap.xml.ts (lines 78-84)

The hardcoded homepage entry omits <lastmod> while all other entries include it. Search engines use lastmod for crawl scheduling — the most important page should have it.

// Current (missing lastmod):
urls.push(`  <url>
    <loc>${SITE_URL}</loc>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>`)
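
One possible shape for the fix, sketched under assumptions: the UrlEntry type and the renderUrl and formatDate helpers are hypothetical names, and the loc value is assumed to be XML-escaped already. The sketch emits lastmod in the YYYY-MM-DD form that the sitemap protocol accepts.

```typescript
// Hypothetical entry shape; the PR builds its XML inline with template strings.
interface UrlEntry {
    loc: string // assumed to be XML-escaped by the caller
    lastmod?: Date
    changefreq: string
    priority: string
}

/** Formats a date as YYYY-MM-DD (UTC), a W3C date form valid for <lastmod>. */
function formatDate(date: Date): string {
    return date.toISOString().slice(0, 10)
}

/** Renders one <url> element, including <lastmod> whenever a date is known. */
function renderUrl(entry: UrlEntry): string {
    const lines = ["  <url>", `    <loc>${entry.loc}</loc>`]
    if (entry.lastmod) {
        lines.push(`    <lastmod>${formatDate(entry.lastmod)}</lastmod>`)
    }
    lines.push(
        `    <changefreq>${entry.changefreq}</changefreq>`,
        `    <priority>${entry.priority}</priority>`,
        "  </url>",
    )
    return lines.join("\n")
}
```

Routing the homepage through the same renderer as every other page would make it hard to omit a field for one entry only.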

⚪ Unused baseDir parameter

File: apps/kilocode-docs/pages/api/sitemap.xml.ts (line 11)

findMarkdownFiles(dir: string, baseDir: string = dir) — the baseDir parameter is passed through recursive calls but never read. Can be removed.

⚪ Hardcoded SITE_URL

File: apps/kilocode-docs/pages/api/sitemap.xml.ts (line 5)

const SITE_URL = "https://kilo.ai/docs" — consider using process.env.NEXT_PUBLIC_SITE_URL with this as fallback, for staging/preview environments.
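
A small sketch of that fallback, assuming NEXT_PUBLIC_SITE_URL is the variable name (it comes from the review suggestion, not from the merged code). The resolveSiteUrl helper is hypothetical; it also trims whitespace and trailing slashes so that concatenated paths do not produce a double slash.

```typescript
const DEFAULT_SITE_URL = "https://kilo.ai/docs"

/**
 * Hypothetical helper: picks the site URL from the environment, falling back
 * to the production value when the variable is unset or blank.
 */
function resolveSiteUrl(env: Record<string, string | undefined>): string {
    const raw = env.NEXT_PUBLIC_SITE_URL?.trim()
    const url = raw ? raw : DEFAULT_SITE_URL
    // Strip trailing slashes so SITE_URL + "/path" never yields "//path".
    return url.replace(/\/+$/, "")
}

// Resolved once at module load, mirroring the original top-level constant.
const SITE_URL = resolveSiteUrl(process.env)
```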

⚪ Synchronous I/O in API route

readdirSync and statSync block the event loop during generation. Mitigated by the 1-hour cache, but readdir/stat (async) would be more idiomatic for a Next.js API route. Low priority given the cache strategy.
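
For reference, an async variant of the scanner could look roughly like the sketch below. The function name findMarkdownFilesAsync is hypothetical; the skip rule for directories named "api" is carried over from the snippet quoted later in this review.

```typescript
import { promises as fs } from "node:fs"
import * as path from "node:path"

/**
 * Recursively collects markdown files without blocking the event loop.
 * Subdirectories are scanned concurrently via Promise.all.
 */
async function findMarkdownFilesAsync(dir: string): Promise<string[]> {
    const entries = await fs.readdir(dir, { withFileTypes: true })
    const nested = await Promise.all(
        entries.map(async (entry): Promise<string[]> => {
            const fullPath = path.join(dir, entry.name)
            if (entry.isDirectory()) {
                if (entry.name === "api") return [] // skip API routes
                return findMarkdownFilesAsync(fullPath)
            }
            return entry.name.endsWith(".md") ? [fullPath] : []
        }),
    )
    // Flatten the per-entry arrays into a single list of paths.
    return ([] as string[]).concat(...nested)
}
```

The handler would then await this function, and the existing tests would need to mock the promise-based fs API instead of readdirSync.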

CI Status

| Check | Result |
| --- | --- |
| Build Markdoc Site | PASS |
| compile | PASS |
| check-translations | PASS |
| unit-test | PASS |
| test-extension (ubuntu) | PASS |
| test-extension (windows) | PASS |
| test-webview (ubuntu) | PASS |
| test-webview (windows) | PASS |
| build-cli | PASS |
| test-cli | PASS |
| test-jetbrains | PASS |
| Vercel | PASS |

Code Snippets

// apps/kilocode-docs/pages/api/sitemap.xml.ts: core logic (excerpt)
import fs from "fs"
import path from "path"

// Recursively collects the paths of all markdown files under dir.
function findMarkdownFiles(dir: string, baseDir: string = dir): string[] {
    const files: string[] = []
    const entries = fs.readdirSync(dir, { withFileTypes: true })
    for (const entry of entries) {
        // Elided in the original excerpt; assumed to join dir and entry name.
        const fullPath = path.join(dir, entry.name)
        if (entry.isDirectory()) {
            if (entry.name === "api") continue  // Skip api directory
            files.push(...findMarkdownFiles(fullPath, baseDir))
        } else if (entry.name.endsWith(".md")) {
            files.push(fullPath)
        }
    }
    return files
}

// apps/kilocode-docs/next.config.js: rewrite rule (matches existing llms.txt pattern)
{
    source: "/sitemap.xml",
    destination: "/api/sitemap.xml",
},
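
For context, a rule like this normally sits inside the array returned by the rewrites() function of the Next.js config. The sketch below is an assumption about the surrounding structure, written with a local Rewrite type so it stands alone; the real next.config.js is plain JavaScript and contains other settings (including the llms.txt rule) that are not reproduced here.

```typescript
// Local stand-in for the rewrite rule shape; the real config needs no type.
interface Rewrite {
    source: string
    destination: string
}

const nextConfig = {
    // Next.js calls rewrites() at build/start time and expects an array of rules.
    async rewrites(): Promise<Rewrite[]> {
        return [
            // Serve the API-generated sitemap at the conventional public path.
            { source: "/sitemap.xml", destination: "/api/sitemap.xml" },
        ]
    },
}
```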

Verdict

COMMENT - Solid implementation with good test coverage (162 lines). The architecture follows the existing llms.txt pattern for API routes with rewrites. Two functional issues should be addressed: duplicate homepage entry (hardcoded + scanned index.md) and missing lastmod on the homepage. The unused baseDir parameter, hardcoded URL, and sync I/O are minor polish items. Security is clean — XML escaping is properly implemented.

@jeremylongshore
Owner Author

Review Journal: kilocode Kilo-Org#5728

PR: #5728 | Title: feat(docs): add dynamic sitemap.xml generation | Author: @kiloconnect (bot) / @olearycrew
Category: docs (feature) | Tier: 1 | Size: 279 lines, 3 files | Confidence: 4/5

Multi-AI analysis: Fork PR #8 — CodeRabbit, Gemini, CodeQL, Qodo


Summary

A dynamic sitemap.xml generator for the docs site, built as a Next.js API route with file scanning, XML generation, and a 162-line test suite. Clean architecture that follows the existing llms.txt pattern. Two functional issues: duplicate homepage entry and missing lastmod on the homepage. Also the first code-bearing PR in our review pipeline — tests, types, and security all pass.

First Impressions

feat(docs) is an unusual prefix — it signals a feature that lives in the docs infrastructure but involves actual TypeScript code. At 279 lines across 3 files (endpoint, tests, config), this is the most complex PR we've reviewed so far. Generated by kiloconnect from a Slack request by Brendan O'Leary.

What I Looked At

  1. API endpoint (pages/api/sitemap.xml.ts, 112 lines): recursive file scanner, XML builder, cache headers
  2. Test suite (__tests__/sitemap.spec.ts, 162 lines): mocked fs, HTTP method handling, XML structure validation
  3. Config (next.config.js, 5 lines): rewrite rule routing /sitemap.xml → API
  4. Existing pattern: llms.txt uses the same rewrite pattern in next.config.js
  5. kiloconnect's own review: flagged duplicate homepage (2 warnings, 1 suggestion)
  6. Fork PR #8 (Mirror: feat(docs): add dynamic sitemap.xml generation (#5728)): Gemini review (CodeRabbit rate-limited)

Analysis

Architecture is sound

The design mirrors the existing llms.txt endpoint pattern:

/docs/sitemap.xml → rewrite → /api/sitemap.xml → handler() → XML response

This is the right approach — dynamic generation means new pages automatically appear in the sitemap without manual updates. The 1-hour cache (max-age=3600, s-maxage=3600) prevents regeneration on every request.
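
The caching behaviour can be sketched as follows. Only the max-age and s-maxage values come from this review; the "public" directive, the Content-Type value, and the applySitemapHeaders helper are assumptions, and MinimalResponse is a stand-in for the Next.js response type so the sketch runs on its own.

```typescript
// Stand-in for the part of NextApiResponse this sketch needs.
interface MinimalResponse {
    setHeader(name: string, value: string): void
}

const ONE_HOUR_SECONDS = 3600

/** Hypothetical helper: sets the response headers for a cached XML sitemap. */
function applySitemapHeaders(res: MinimalResponse): void {
    res.setHeader("Content-Type", "application/xml")
    // max-age covers browsers; s-maxage covers shared caches such as a CDN.
    res.setHeader(
        "Cache-Control",
        `public, max-age=${ONE_HOUR_SECONDS}, s-maxage=${ONE_HOUR_SECONDS}`,
    )
}
```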

Duplicate homepage is a real bug

The handler hardcodes a homepage entry:

urls.push(`<url><loc>${SITE_URL}</loc>...`)

Then findMarkdownFiles() discovers pages/index.md, which maps to the root URL path and therefore to https://kilo.ai/docs as well. Result: two identical URLs in the sitemap. Google Search Console will flag this as a warning.

Test quality is good

162 lines covering:

  • HTTP 405 for non-GET methods
  • XML structure (declaration, urlset, closing tags)
  • Homepage presence and priority
  • URL generation from directory structure
  • lastmod dates
  • changefreq inclusion
  • API directory exclusion
  • Error handling (500 on fs errors)

Mocking approach is clean — vi.mock("fs") with readdirSync returning fake directory structures.
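
To illustrate two of the behaviours listed above (405 for non-GET methods, 500 on generation errors) without depending on the test framework, here is a dependency-free sketch. It is not the PR's test code, which uses vi.mock("fs"); the handleSitemapRequest and createFakeResponse names, the minimal request/response types, and the response bodies are all hypothetical.

```typescript
// Minimal stand-ins for the Next.js request/response types.
interface MinimalRequest {
    method?: string
}
interface MinimalResponse {
    status(code: number): MinimalResponse
    setHeader(name: string, value: string): void
    send(body: string): void
}

/** Hypothetical handler core: method guard plus generic error handling. */
function handleSitemapRequest(req: MinimalRequest, res: MinimalResponse, generate: () => string): void {
    if (req.method !== "GET") {
        res.setHeader("Allow", "GET")
        res.status(405).send("Method Not Allowed")
        return
    }
    try {
        res.status(200).send(generate())
    } catch {
        // Generic message only, so no stack trace leaks to the client.
        res.status(500).send("Error generating sitemap")
    }
}

/** Hand-rolled fake response that records what the handler did. */
function createFakeResponse() {
    const record = { statusCode: 0, body: "", headers: {} as Record<string, string> }
    const res: MinimalResponse = {
        status(code) {
            record.statusCode = code
            return res
        },
        setHeader(name, value) {
            record.headers[name] = value
        },
        send(body) {
            record.body = body
        },
    }
    return { res, record }
}
```

Passing the generator in as a parameter is a design choice made for this sketch; it lets a test force the error path without mocking the file system.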

Security review

  • XML injection: escapeXml() handles &, <, >, ", ' — covers the XML special characters
  • Path traversal: findMarkdownFiles() starts from process.cwd()/pages — no user input in paths
  • Error handling: catches exceptions, returns 500 with generic message (no stack leak)

No security concerns.
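
For readers unfamiliar with the escaping step, a function covering those five characters typically looks like the sketch below. The body is an assumption; the PR's actual escapeXml implementation is not quoted in this review.

```typescript
/**
 * Escapes the five characters with special meaning in XML.
 * The ampersand must be replaced first, otherwise the entities produced by
 * the later replacements would themselves be escaped again.
 */
function escapeXml(value: string): string {
    return value
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&apos;")
}
```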

Verification

All CI checks pass:

Build Markdoc Site     PASS    (directly relevant)
compile                PASS    (TypeScript compilation)
check-translations     PASS
unit-test              PASS    (includes new sitemap tests)
test-extension         PASS    (ubuntu + windows)
test-webview           PASS    (ubuntu + windows)
build-cli              PASS
test-cli               PASS
test-jetbrains         PASS
Vercel                 PASS    (preview deployed)

Bot Review Synthesis

| Bot | Verdict | Key Finding | Useful? |
| --- | --- | --- | --- |
| Gemini | Comment | 4 findings: missing homepage lastmod, hardcoded URL, unused baseDir param, missing escapeXml on homepage | Yes: all valid, 3 overlap manual findings |
| kiloconnect | Comment | Duplicate homepage entry, root index.md handling | Yes: caught the main bug |
| CodeRabbit | Rate-limited | Did not review | Rate limits becoming a pattern |
| Greptile | No response | 0/5 PRs reviewed | Non-functional |
| CodeQL | N/A | No findings | Expected |
| Qodo | Failed | Could not generate suggestions | Config issue |

Notable: kiloconnect reviewed its own PR and found real issues. The bot that generated the code also reviewed it and identified the duplicate homepage bug. This is both impressive (self-critique) and concerning (why wasn't the bug fixed before submission?).

Lessons Learned

1. Bot-generated code needs the same scrutiny as human code. kiloconnect generated clean, well-structured TypeScript with tests — but still had a functional bug (duplicate homepage). The quality bar is the same regardless of author.

2. Existing patterns reduce review burden. The llms.txt rewrite pattern in next.config.js gave us a reference implementation. When a PR follows an established pattern, we can focus review on the novel parts (file scanning logic, XML generation) rather than architecture.

3. CodeRabbit rate limits are a bottleneck. Third time rate-limited in this session. For a pipeline processing multiple PRs per session, we're hitting CodeRabbit's hourly commit review limit. Consider spacing fork PR creation or upgrading the plan.

4. Test-first verification works. The 162-line test suite gave us confidence in the implementation's correctness. For code-bearing PRs, comprehensive tests reduce the need for manual testing — we can verify the tests cover the right scenarios instead of running the code ourselves.


Review #5 of 75 | Review methodology: AI PR Review Case Studies | Reviewed with GWI + Claude Code
