Mirror: feat(docs): add dynamic sitemap.xml generation (#5728)#8
Conversation
- Add API endpoint at /api/sitemap.xml that dynamically generates sitemap - Add rewrite rule to serve sitemap at /sitemap.xml - Include all markdown pages with lastmod dates and priority - Add comprehensive tests for sitemap generation
Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com>
Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com>
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @jeremylongshore, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant enhancement to the documentation site by enabling dynamic sitemap generation. This change automates the process of keeping the sitemap current with the site's content, which is crucial for search engine optimization and discoverability. The new functionality is robust, includes thorough testing, and integrates seamlessly with the existing Next.js framework. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Failed to generate code suggestions for PR |
There was a problem hiding this comment.
Code Review
This pull request introduces dynamic sitemap generation, which is a great addition for SEO. The implementation is solid, with a new API endpoint to generate the sitemap from markdown files and a comprehensive test suite to ensure its correctness.
My review includes a few suggestions to improve maintainability and completeness:
- Using an environment variable for the site URL.
- Ensuring the homepage entry in the sitemap includes its last modification date.
- A minor code simplification in the file scanning logic.
- Enhancing the tests to cover the homepage's modification date.
Overall, this is a well-executed feature. Addressing these points will make it even more robust.
| it("should include homepage with priority 1.0", async () => { | ||
| await handler(mockReq as NextApiRequest, mockRes as NextApiResponse) | ||
|
|
||
| expect(responseData).toContain("<loc>https://kilo.ai/docs</loc>") | ||
| expect(responseData).toContain("<priority>1.0</priority>") | ||
| }) |
There was a problem hiding this comment.
This test correctly verifies the homepage's loc and priority, but it doesn't check for the presence of the <lastmod> tag. To ensure the sitemap is complete and effective for SEO, the test should also assert that the homepage entry includes a lastmod date.
| it("should include homepage with priority 1.0", async () => { | |
| await handler(mockReq as NextApiRequest, mockRes as NextApiResponse) | |
| expect(responseData).toContain("<loc>https://kilo.ai/docs</loc>") | |
| expect(responseData).toContain("<priority>1.0</priority>") | |
| }) | |
| it("should include homepage with priority 1.0 and lastmod date", async () => { | |
| await handler(mockReq as NextApiRequest, mockRes as NextApiResponse) | |
| expect(responseData).toContain("<loc>https://kilo.ai/docs</loc>") | |
| expect(responseData).toContain("<priority>1.0</priority>") | |
| expect(responseData).toContain("<lastmod>2025-01-15</lastmod>") | |
| }) |
| import fs from "fs" | ||
| import path from "path" | ||
|
|
||
| const SITE_URL = "https://kilo.ai/docs" |
There was a problem hiding this comment.
The SITE_URL is hardcoded. This can make it difficult to run the sitemap generator in different environments (e.g., staging, development) with different base URLs. It's a best practice to source this value from an environment variable.
| const SITE_URL = "https://kilo.ai/docs" | |
| const SITE_URL = process.env.NEXT_PUBLIC_SITE_URL || "https://kilo.ai/docs" |
| /** | ||
| * Recursively finds all markdown files in a directory | ||
| */ | ||
| function findMarkdownFiles(dir: string, baseDir: string = dir): string[] { |
There was a problem hiding this comment.
The baseDir parameter in the findMarkdownFiles function is initialized and passed in recursive calls, but its value is never actually used. It can be removed to simplify the function signature. You'll also need to update the recursive call on line 20 to files.push(...findMarkdownFiles(fullPath)).
| function findMarkdownFiles(dir: string, baseDir: string = dir): string[] { | |
| function findMarkdownFiles(dir: string): string[] { |
| urls.push(` <url> | ||
| <loc>${SITE_URL}</loc> | ||
| <changefreq>weekly</changefreq> | ||
| <priority>1.0</priority> | ||
| </url>`) |
There was a problem hiding this comment.
The homepage entry in the sitemap is missing the <lastmod> tag, which is valuable for search engines. Also, for consistency, escapeXml should be used on the <loc> value, just as it is for other URLs.
You can get the modification date from the pages/index.tsx file. The suggestion below does this directly. For more robustness, you might consider adding a check with fs.existsSync before calling getLastModified to avoid potential errors if the file doesn't exist.
| urls.push(` <url> | |
| <loc>${SITE_URL}</loc> | |
| <changefreq>weekly</changefreq> | |
| <priority>1.0</priority> | |
| </url>`) | |
| urls.push(` <url> | |
| <loc>${escapeXml(SITE_URL)}</loc> | |
| <lastmod>${getLastModified(path.join(pagesDir, "index.tsx"))}</lastmod> | |
| <changefreq>weekly</changefreq> | |
| <priority>1.0</priority> | |
| </url>`) |
4231140 to
fa13626
Compare
Review: kilocode Kilo-Org#5728
Checklist
Findings🟡 Duplicate homepage entryFile: The homepage is added as a hardcoded entry (line 78-84), but the root Fix: Either skip root 🟡 Homepage missing
|
| Check | Result |
|---|---|
| Build Markdoc Site | PASS |
| compile | PASS |
| check-translations | PASS |
| unit-test | PASS |
| test-extension (ubuntu) | PASS |
| test-extension (windows) | PASS |
| test-webview (ubuntu) | PASS |
| test-webview (windows) | PASS |
| build-cli | PASS |
| test-cli | PASS |
| test-jetbrains | PASS |
| Vercel | PASS |
Code Snippets
// apps/kilocode-docs/pages/api/sitemap.xml.ts — core logic
function findMarkdownFiles(dir: string, baseDir: string = dir): string[] {
const files: string[] = []
const entries = fs.readdirSync(dir, { withFileTypes: true })
for (const entry of entries) {
if (entry.isDirectory()) {
if (entry.name === "api") continue // Skip api directory
files.push(...findMarkdownFiles(fullPath, baseDir))
} else if (entry.name.endsWith(".md")) {
files.push(fullPath)
}
}
return files
}// apps/kilocode-docs/next.config.js — rewrite rule (matches existing llms.txt pattern)
{
source: "/sitemap.xml",
destination: "/api/sitemap.xml",
},Verdict
COMMENT - Solid implementation with good test coverage (162 lines). The architecture follows the existing llms.txt pattern for API routes with rewrites. Two functional issues should be addressed: duplicate homepage entry (hardcoded + scanned index.md) and missing lastmod on the homepage. The unused baseDir parameter, hardcoded URL, and sync I/O are minor polish items. Security is clean — XML escaping is properly implemented.
Review Journal: kilocode Kilo-Org#5728
SummaryA dynamic sitemap.xml generator for the docs site, built as a Next.js API route with file scanning, XML generation, and a 162-line test suite. Clean architecture that follows the existing First Impressions
What I Looked At
AnalysisArchitecture is soundThe design mirrors the existing This is the right approach — dynamic generation means new pages automatically appear in the sitemap without manual updates. The 1-hour cache ( Duplicate homepage is a real bugThe handler hardcodes a homepage entry: urls.push(`<url><loc>${SITE_URL}</loc>...`)Then Test quality is good162 lines covering:
Mocking approach is clean — Security review
No security concerns. VerificationAll CI checks pass: Bot Review Synthesis
Notable: kiloconnect reviewed its own PR and found real issues. The bot that generated the code also reviewed it and identified the duplicate homepage bug. This is both impressive (self-critique) and concerning (why wasn't the bug fixed before submission?). Lessons Learned1. Bot-generated code needs the same scrutiny as human code. kiloconnect generated clean, well-structured TypeScript with tests — but still had a functional bug (duplicate homepage). The quality bar is the same regardless of author. 2. Existing patterns reduce review burden. The 3. CodeRabbit rate limits are a bottleneck. Third time rate-limited in this session. For a pipeline processing multiple PRs per session, we're hitting CodeRabbit's hourly commit review limit. Consider spacing fork PR creation or upgrading the plan. 4. Test-first verification works. The 162-line test suite gave us confidence in the implementation's correctness. For code-bearing PRs, comprehensive tests reduce the need for manual testing — we can verify the tests cover the right scenarios instead of running the code ourselves. Review #5 of 75 | Review methodology: AI PR Review Case Studies | Reviewed with GWI + Claude Code |
Mirror of Kilo-Org#5728
This PR mirrors the upstream change for multi-AI review analysis.
Changes
Adds dynamic sitemap.xml generation for the docs site via a Next.js API endpoint. Scans markdown files, generates XML with lastmod/priority/changefreq. Includes 162-line test suite and Next.js rewrite config.
Bot Review Checklist