
feat(website): Add LLMO optimization with JSON-LD and llms.txt#1236

Merged
yamadashy merged 5 commits into main from worktree-jazzy-baking-sloth
Mar 18, 2026

@yamadashy
Owner

@yamadashy yamadashy commented Mar 17, 2026

Add LLM Optimization (LLMO) to the Repomix website to improve visibility in AI-generated search results and LLM citations.

Changes

  1. JSON-LD Structured Data — Added WebSite and SoftwareApplication schema.org markup to all pages, including project metadata, features, author info, and related links
  2. llms.txt / llms-full.txt generation — Added vitepress-plugin-llms to automatically generate LLM-friendly documentation during build (workDir: 'en', domain: 'https://repomix.com')
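
A minimal sketch of how the JSON-LD injection in item 1 might look as a VitePress-style head entry. The `HeadEntry` type and the `toJsonLdHeadEntry` helper are hypothetical names introduced for illustration, and the data is trimmed to a few fields; the actual config in this PR may be organized differently.

```typescript
// Local stand-in for a head tuple: [tag, attributes, inner content].
type HeadEntry = [string, Record<string, string>, string];

// Trimmed-down structured data; the real object in the PR carries more fields.
const jsonLd = {
  '@context': 'https://schema.org',
  '@graph': [
    { '@type': 'WebSite', name: 'Repomix', url: 'https://repomix.com' },
    { '@type': 'SoftwareApplication', name: 'Repomix', url: 'https://repomix.com' },
  ],
};

// Hypothetical helper: serialize structured data into a
// <script type="application/ld+json"> head entry.
function toJsonLdHeadEntry(data: unknown): HeadEntry {
  return ['script', { type: 'application/ld+json' }, JSON.stringify(data)];
}

const jsonLdHead: HeadEntry = toJsonLdHeadEntry(jsonLd);
```

An entry like `jsonLdHead` would then sit alongside the other entries in the config's `head` array.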

Checklist

  • Run npm run test
  • Run npm run lint

yamadashy and others added 2 commits March 15, 2026 00:52
Add WebSite and SoftwareApplication schema.org structured data to
improve visibility in AI-generated search results and LLM citations.
This includes project metadata, features, author info, and related links.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add vitepress-plugin-llms to automatically generate llms.txt and
llms-full.txt during build. Configured with workDir: 'en' to generate
from English docs only, and domain set to https://repomix.com for
fully qualified URLs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai bot commented Mar 17, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1495b744-8b07-49fe-8513-ebb55f45a7c2

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This PR integrates the vitepress-plugin-llms (llmstxt) plugin into the VitePress configuration with English language support and domain specification, along with JSON-LD structured data injection into the document head.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Plugin Integration & Dependencies: website/client/.vitepress/config/configShard.ts, website/client/package.json | Added vitepress-plugin-llms dependency and integrated the llmstxt plugin into VitePress config with workDir and domain options; added JSON-LD structured data injection into head. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly and concisely summarizes the main change: adding LLMO (LLM Optimization) with JSON-LD structured data and llms.txt generation to the website. |
| Description check | ✅ Passed | The PR description includes comprehensive details about the changes (JSON-LD structured data and llms.txt generation) and incorporates the required checklist template items. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds LLM Optimization (LLMO) to the Repomix website. It embeds JSON-LD structured data on all pages, describing both the website and the software application. It also adds a build step that automatically generates LLM-friendly documentation files (llms.txt and llms-full.txt), making the project's information easier for AI systems to access and cite.

Highlights

  • JSON-LD Structured Data: Added WebSite and SoftwareApplication schema.org markup to all pages, including project metadata, features, author info, and related links, to improve visibility in AI-generated search results.
  • LLM-friendly documentation generation: Integrated vitepress-plugin-llms to automatically generate llms.txt and llms-full.txt during the build process, enhancing LLM citations and discoverability.

Changelog
  • website/client/.vitepress/config/configShard.ts
    • Imported the llmstxt plugin from vitepress-plugin-llms.
    • Defined a jsonLd constant containing WebSite and SoftwareApplication schema.org structured data.
    • Added a script tag to the head section to inject the jsonLd structured data into the page.
    • Integrated the llmstxt plugin into the VitePress build configuration.
  • website/client/package-lock.json
    • Added vitepress-plugin-llms and its associated transitive dependencies.
    • Updated various existing dependency versions.
  • website/client/package.json
    • Added vitepress-plugin-llms as a new development dependency.
Activity
  • No specific activity has been recorded for this pull request yet.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@cloudflare-workers-and-pages

cloudflare-workers-and-pages bot commented Mar 17, 2026

Deploying repomix with Cloudflare Pages

Latest commit: 5402898
Status: ✅  Deploy successful!
Preview URL: https://41e6fe5d.repomix.pages.dev
Branch Preview URL: https://worktree-jazzy-baking-sloth.repomix.pages.dev

Contributor

@devin-ai-integration devin-ai-integration bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

@codecov

codecov bot commented Mar 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.14%. Comparing base (008e28c) to head (5402898).
⚠️ Report is 10 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1236   +/-   ##
=======================================
  Coverage   87.14%   87.14%           
=======================================
  Files         115      115           
  Lines        4310     4310           
  Branches      998      998           
=======================================
  Hits         3756     3756           
  Misses        554      554           

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces LLM Optimization (LLMO) by adding JSON-LD structured data for better search engine visibility and integrating vitepress-plugin-llms to generate llms.txt. The changes are well-implemented. I've provided one suggestion to improve maintainability by extracting duplicated hardcoded values into constants in the VitePress configuration.

Comment on lines +21 to +69
// JSON-LD Structured Data
const jsonLd = {
  '@context': 'https://schema.org',
  '@graph': [
    {
      '@type': 'WebSite',
      name: 'Repomix',
      url: 'https://repomix.com',
      description: 'Pack your codebase into AI-friendly formats',
    },
    {
      '@type': 'SoftwareApplication',
      name: 'Repomix',
      description:
        'A tool that packs your entire repository into a single, AI-friendly file for use with Large Language Models (LLMs) like ChatGPT, Claude, Gemini, and more.',
      url: 'https://repomix.com',
      applicationCategory: 'DeveloperApplication',
      operatingSystem: 'Windows, macOS, Linux',
      offers: {
        '@type': 'Offer',
        price: '0',
        priceCurrency: 'USD',
      },
      license: 'https://opensource.org/licenses/MIT',
      isAccessibleForFree: true,
      installUrl: 'https://www.npmjs.com/package/repomix',
      downloadUrl: 'https://www.npmjs.com/package/repomix',
      softwareRequirements: 'Node.js 20.0.0 or higher',
      image: 'https://repomix.com/images/repomix-logo.svg',
      screenshot: 'https://repomix.com/images/og-image-large.png',
      author: {
        '@type': 'Person',
        name: 'Kazuki Yamada',
        url: 'https://github.com/yamadashy',
      },
      sameAs: ['https://github.com/yamadashy/repomix', 'https://www.npmjs.com/package/repomix'],
      featureList: [
        'AI-optimized output formats (XML, Markdown, JSON, Plain Text)',
        'Token counting for LLM context limits',
        'Git-aware file processing',
        'Security-focused with Secretlint integration',
        'Remote repository processing',
        'MCP Server integration',
        'Code compression with Tree-sitter',
        'Custom instructions support',
      ],
    },
  ],
};
Contributor

Priority: medium

There are several hardcoded and duplicated values in the jsonLd object and throughout this file. To improve maintainability and ensure consistency, it's a good practice to define these values as constants and reuse them. This will make future updates easier and less error-prone.

For example, values like the site name, URL, GitHub repository, and author information are used in multiple places (jsonLd, manifest, themeConfig, llmstxt plugin).

I suggest extracting these into constants. You can then use these constants in other parts of the file as well to reduce duplication.

// Site metadata
const siteName = 'Repomix';
const siteUrl = 'https://repomix.com';
const siteDescription = 'Pack your codebase into AI-friendly formats';
const longDescription = 'A tool that packs your entire repository into a single, AI-friendly file for use with Large Language Models (LLMs) like ChatGPT, Claude, Gemini, and more.';
const githubUrl = 'https://github.com/yamadashy/repomix';
const npmUrl = 'https://www.npmjs.com/package/repomix';
const authorName = 'Kazuki Yamada';
const authorUrl = 'https://github.com/yamadashy';

// JSON-LD Structured Data
const jsonLd = {
  '@context': 'https://schema.org',
  '@graph': [
    {
      '@type': 'WebSite',
      name: siteName,
      url: siteUrl,
      description: siteDescription,
    },
    {
      '@type': 'SoftwareApplication',
      name: siteName,
      description: longDescription,
      url: siteUrl,
      applicationCategory: 'DeveloperApplication',
      operatingSystem: 'Windows, macOS, Linux',
      offers: {
        '@type': 'Offer',
        price: '0',
        priceCurrency: 'USD',
      },
      license: 'https://opensource.org/licenses/MIT',
      isAccessibleForFree: true,
      installUrl: npmUrl,
      downloadUrl: npmUrl,
      softwareRequirements: 'Node.js 20.0.0 or higher',
      image: `${siteUrl}/images/repomix-logo.svg`,
      screenshot: `${siteUrl}/images/og-image-large.png`,
      author: {
        '@type': 'Person',
        name: authorName,
        url: authorUrl,
      },
      sameAs: [githubUrl, npmUrl],
      featureList: [
        'AI-optimized output formats (XML, Markdown, JSON, Plain Text)',
        'Token counting for LLM context limits',
        'Git-aware file processing',
        'Security-focused with Secretlint integration',
        'Remote repository processing',
        'MCP Server integration',
        'Code compression with Tree-sitter',
        'Custom instructions support',
      ],
    },
  ],
};

Contributor

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
website/client/.vitepress/config/configShard.ts (1)

21-69: Centralize shared site metadata to reduce SEO/config drift.

name, url, and short description are duplicated across JSON-LD, manifest, and head tags; extracting constants here will make future updates safer.

♻️ Proposed refactor
+const SITE = {
+  name: 'Repomix',
+  url: 'https://repomix.com',
+  shortDescription: 'Pack your codebase into AI-friendly formats',
+} as const;
+
 // JSON-LD Structured Data
 const jsonLd = {
   '@context': 'https://schema.org',
   '@graph': [
     {
       '@type': 'WebSite',
-      name: 'Repomix',
-      url: 'https://repomix.com',
-      description: 'Pack your codebase into AI-friendly formats',
+      name: SITE.name,
+      url: SITE.url,
+      description: SITE.shortDescription,
     },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@website/client/.vitepress/config/configShard.ts` around lines 21 - 69,
Extract duplicated site metadata into shared constants and replace inline
literals in the jsonLd object: create constants (e.g., SITE_NAME, SITE_URL,
SITE_DESCRIPTION) near the top of configShard.ts and use those constants instead
of hard-coded strings for name, url, and description inside the jsonLd object
(and update any other places in this module that replicate these values such as
manifest/head generation). Ensure jsonLd remains unchanged structurally—only
replace repeated string literals with the new constants so future updates use
the centralized values.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8f0bc6c0-4b09-40cb-a70e-9d0e4178cea9

📥 Commits

Reviewing files that changed from the base of the PR and between 008e28c and 206e271.

⛔ Files ignored due to path filters (1)
  • website/client/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (2)
  • website/client/.vitepress/config/configShard.ts
  • website/client/package.json

Extract duplicated values (siteName, siteUrl, siteDescription,
ogImageUrl, githubUrl, npmUrl) into constants and reuse them across
JSON-LD, OGP, manifest, sitemap, and plugin configurations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Contributor

claude bot commented Mar 18, 2026

Code Review

Overall this is a clean, well-structured PR. The refactoring to extract shared metadata constants is a nice improvement that was also suggested by other reviewers — glad to see it already addressed in the latest commit. A few observations:

Bug: twitter:domain value

The twitter:domain meta property should contain just the domain name (e.g., repomix.com), not a full URL with protocol. Currently it's set to siteUrl which is https://repomix.com.

// Line 177 — current
['meta', { property: 'twitter:domain', content: siteUrl }],

// Suggested fix
['meta', { property: 'twitter:domain', content: 'repomix.com' }],

Note: this bug existed before this PR (the original had 'https://repomix.com'), but since this PR is touching these meta tags and centralizing values, it's a good opportunity to fix it. Could add a siteDomain constant alongside the others.
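
One way to add the `siteDomain` constant mentioned above is to derive it from `siteUrl` rather than hardcode the host a second time. This is only a sketch: `siteDomain` and `twitterDomainMeta` are illustrative names, and hardcoding `'repomix.com'` as in the suggested fix works equally well.

```typescript
// Assumed shared constant, mirroring the siteUrl value quoted in this review.
const siteUrl = 'https://repomix.com';

// Derive the bare host name so the two values cannot drift apart.
const siteDomain = new URL(siteUrl).hostname;

// Head entry in the same tuple style as the snippet above.
const twitterDomainMeta: [string, Record<string, string>] = [
  'meta',
  { property: 'twitter:domain', content: siteDomain },
];
```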

Other observations (minor)

JSON-LD structured data looks good. The @graph approach with both WebSite and SoftwareApplication types is well-structured and schema.org-compliant. The feature list is accurate and comprehensive.

vitepress-plugin-llms integration is straightforward. Using workDir: 'en' to scope to English docs and spreading the plugin result is correct per the plugin API.
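
To illustrate the spread: when a plugin factory returns an array of plugins rather than a single plugin, spreading the result keeps the surrounding `plugins` list flat. The sketch below uses a hypothetical stand-in factory (`fakeLlmsPlugin`) instead of the real vitepress-plugin-llms export, so the plugin names and returned objects are assumptions, not the plugin's actual internals.

```typescript
// Minimal plugin shape for illustration; real Vite plugins have more fields.
interface PluginLike {
  name: string;
}

interface LlmsOptions {
  workDir: string; // subdirectory to read docs from, e.g. 'en'
  domain: string; // base URL used for fully qualified links
}

// Hypothetical stand-in for a factory that returns several plugins at once.
function fakeLlmsPlugin(options: LlmsOptions): PluginLike[] {
  return [
    { name: `llms:collect:${options.workDir}` },
    { name: `llms:emit:${options.domain}` },
  ];
}

const existingPlugin: PluginLike = { name: 'some-other-plugin' };

// Spreading flattens the factory's array into the surrounding list.
const plugins: PluginLike[] = [
  existingPlugin,
  ...fakeLlmsPlugin({ workDir: 'en', domain: 'https://repomix.com' }),
];
```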

Constants extraction (latest commit) addresses the duplication concern raised by both CodeRabbit and Gemini. The shared siteName, siteUrl, siteDescription, ogImageUrl, githubUrl, and npmUrl constants are used consistently throughout the config.

No test coverage impact — this is purely website configuration, so no unit test changes are expected. The Codecov report confirms coverage is unchanged.

Premortem / Risk Assessment

  • Low risk: Changes are limited to VitePress config and a new dev dependency — no impact on the core Repomix tool
  • Build verification: The llmstxt plugin runs at build time. Worth confirming the website build succeeds and that llms.txt / llms-full.txt are generated correctly in the output
  • No i18n content impact: workDir: 'en' correctly scopes LLM doc generation to English only
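
The build verification point could be automated with a small check like the one below. It is a sketch under assumptions: `findMissingLlmsFiles` is a hypothetical helper, and the demo runs against a temporary directory rather than the real build output.

```typescript
import { existsSync, mkdtempSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Files the review expects the build to emit.
const expectedFiles = ['llms.txt', 'llms-full.txt'];

// Return the expected files that are missing from the given output directory.
function findMissingLlmsFiles(outDir: string): string[] {
  return expectedFiles.filter((file) => !existsSync(join(outDir, file)));
}

// Demo against a throwaway directory that contains only one of the two files,
// so only llms-full.txt is reported as missing.
const demoDir = mkdtempSync(join(tmpdir(), 'llms-check-'));
writeFileSync(join(demoDir, 'llms.txt'), '# Repomix\n');
const missing = findMissingLlmsFiles(demoDir);
console.log(missing);
```

In CI the helper would be pointed at the real build output directory (`.vitepress/dist` under the site root for a default VitePress setup; the exact path for this repository is not stated in the thread) and the job failed if the returned list is non-empty.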

LGTM with the minor twitter:domain fix.


Reviewed by Claude

devin-ai-integration[bot]

This comment was marked as resolved.

Change twitter:domain from full URL (https://repomix.com) to bare
domain (repomix.com) as required by the Twitter Card specification.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Contributor

claude bot commented Mar 18, 2026

Code Review (Follow-up)

Previous feedback has been addressed in the latest commits — nice work:

  • 4cc5bdc — Extracted shared site metadata into constants (siteName, siteUrl, siteDescription, ogImageUrl, githubUrl, npmUrl), addressing duplication concerns raised by multiple reviewers
  • 5402898 — Fixed twitter:domain to use bare domain repomix.com instead of the full URL

Current State

No new issues found. The PR is clean and ready to merge.

Review details

JSON-LD structured data: Well-structured @graph with WebSite and SoftwareApplication types. Schema.org-compliant, feature list is accurate.

vitepress-plugin-llms integration: Correctly configured with workDir: 'en' for English-only doc generation and proper domain setting. The spread syntax for plugin result matches the plugin API.

Constants refactoring: All hardcoded values are now centralized and used consistently across JSON-LD, OGP meta tags, PWA manifest, sitemap hostname, social links, and plugin config.
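
A condensed sketch of what this centralization can look like, using three of the constant names listed above. The `MetaEntry` type, the specific Open Graph properties, and the variable names `ogpMeta`, `webSiteNode`, and `sitemap` are assumptions for illustration; the merged config may group things differently.

```typescript
// Shared metadata, using constant names mentioned in the review.
const siteName = 'Repomix';
const siteUrl = 'https://repomix.com';
const siteDescription = 'Pack your codebase into AI-friendly formats';

type MetaEntry = [string, Record<string, string>];

// Open Graph tags built from the shared constants.
const ogpMeta: MetaEntry[] = [
  ['meta', { property: 'og:site_name', content: siteName }],
  ['meta', { property: 'og:url', content: siteUrl }],
  ['meta', { property: 'og:description', content: siteDescription }],
];

// JSON-LD WebSite node built from the same constants.
const webSiteNode = {
  '@type': 'WebSite',
  name: siteName,
  url: siteUrl,
  description: siteDescription,
};

// Sitemap option reusing the same URL as its hostname.
const sitemap = { hostname: siteUrl };
```

With this layout, changing `siteUrl` in one place updates the OGP tags, the structured data, and the sitemap hostname together.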

Risk assessment: Low — changes are confined to VitePress config and a dev dependency. No impact on core Repomix functionality. Build-time plugin should be verified in CI.

LGTM ✅


Reviewed by Claude

@yamadashy yamadashy merged commit 76d4b17 into main Mar 18, 2026
58 checks passed
@yamadashy yamadashy deleted the worktree-jazzy-baking-sloth branch March 18, 2026 14:22