fix(seo): resolve translation status by content, not namespace prefix #18143

Merged
pettinarip merged 1 commit into dev from fix/sitemap-noindex-translation-registry
May 8, 2026

pettinarip (Member) commented May 7, 2026

Summary

  • Sitemap emitted /<locale>/videos/<slug>/ for every locale, while getMetadata correctly returned noindex,follow whenever page-videos.json was missing — silently noindexing ~1,391 video pages that already had translated markdown content (e.g. https://ethereum.org/ar/videos/decentralized-social-media/).
  • Conversely, ~24 (page,locale) pairs for developers/tutorials/gasless-token (English-only) were marked translated for all 25 locales because page-developers-index.json exists in every locale — Google was getting localized canonicals over English fallback content (duplicate-content signal).
  • Root cause: getPageType classified every slug under a namespace-mapped prefix as "intl" and gated translation on namespace JSON presence, even when the actual translation source for that slug is per-page markdown. Replaced with a content-first resolver: if English markdown exists for the slug, translation status follows the localized markdown; otherwise fall back to UI namespace presence. Folded video slugs into the unified registry so the sitemap loop no longer needs a special case.
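The content-first resolution described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `TranslationSources`, `normalizeSlug`, `isTranslated`, and the `page-wallets` namespace name are hypothetical, and the sketch assumes English is always treated as translated.

```typescript
// Hypothetical registry inputs; the real data structures in the PR may differ.
interface TranslationSources {
  // Normalized slugs that have English markdown
  englishMarkdown: Set<string>
  // locale -> normalized slugs that have localized markdown
  localizedMarkdown: Map<string, Set<string>>
  // locale -> UI namespace JSON files present for that locale
  namespacesByLocale: Map<string, Set<string>>
  // slug prefix -> UI namespace that backs pages under that prefix
  namespaceByPrefix: Array<{ prefix: string; namespace: string }>
}

// Strip leading/trailing slashes so "/videos/foo/" and "videos/foo" match.
function normalizeSlug(slug: string): string {
  return slug.replace(/^\/+|\/+$/g, "")
}

function isTranslated(
  slug: string,
  locale: string,
  sources: TranslationSources
): boolean {
  if (locale === "en") return true // assumption: English is the source locale
  const key = normalizeSlug(slug)

  // Content first: if English markdown exists, only localized markdown counts.
  if (sources.englishMarkdown.has(key)) {
    const localized = sources.localizedMarkdown.get(locale)
    return localized !== undefined && localized.has(key)
  }

  // Otherwise fall back to UI namespace presence for the matching prefix.
  const match = sources.namespaceByPrefix.find(
    ({ prefix }) => key === prefix || key.startsWith(prefix + "/")
  )
  if (match === undefined) return false
  const namespaces = sources.namespacesByLocale.get(locale)
  return namespaces !== undefined && namespaces.has(match.namespace)
}

// Example data mirroring the two bugs in the summary (values are illustrative).
const sources: TranslationSources = {
  englishMarkdown: new Set([
    "videos/decentralized-social-media",
    "developers/tutorials/gasless-token",
  ]),
  localizedMarkdown: new Map([
    ["ar", new Set(["videos/decentralized-social-media"])],
  ]),
  namespacesByLocale: new Map([
    ["ar", new Set(["page-developers-index", "page-wallets"])],
  ]),
  namespaceByPrefix: [
    { prefix: "videos", namespace: "page-videos" },
    { prefix: "developers", namespace: "page-developers-index" },
    { prefix: "wallets", namespace: "page-wallets" },
  ],
}
```

With this data, the Arabic video page resolves as translated even though `page-videos` is absent for `ar`, and the gasless-token tutorial resolves as untranslated even though `page-developers-index` is present.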

Test plan

  • pnpm test:unit — 881/881 pass; added invariant tests under tests/unit/i18n/translation-registry.spec.ts covering content-first resolution, namespace fallback, and slug normalization
  • npx tsc --noEmit — clean
  • After deploy, verify https://ethereum.org/ar/videos/decentralized-social-media/ no longer emits <meta name="robots" content="noindex,follow"> and its canonical points to itself
  • Next Ahrefs crawl: expect "Noindex page in sitemap", "Non-canonical page in sitemap", and "Hreflang to non-canonical" to each drop by ~1,391
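The post-deploy check in the third bullet could be automated along these lines. This is only a sketch: `inspectHead` and `isSelfCanonicalAndIndexable` are hypothetical helpers, and the regex-based parsing assumes simple, quoted attributes rather than handling arbitrary HTML.

```typescript
// Read one quoted attribute value out of a single tag string.
function getAttr(tag: string, name: string): string | undefined {
  const m = tag.match(
    new RegExp("(?:^|\\s)" + name + "\\s*=\\s*[\"']([^\"']*)[\"']", "i")
  )
  return m ? m[1] : undefined
}

// Extract the robots directive and canonical URL from an HTML document string.
function inspectHead(html: string): { robots?: string; canonical?: string } {
  const result: { robots?: string; canonical?: string } = {}
  const tags = html.match(/<(?:meta|link)\b[^>]*>/gi) || []
  for (const tag of tags) {
    const lower = tag.toLowerCase()
    if (lower.indexOf("<meta") === 0) {
      const name = getAttr(tag, "name")
      if (name !== undefined && name.toLowerCase() === "robots") {
        result.robots = getAttr(tag, "content")
      }
    } else {
      const rel = getAttr(tag, "rel")
      if (rel !== undefined && rel.toLowerCase() === "canonical") {
        result.canonical = getAttr(tag, "href")
      }
    }
  }
  return result
}

// True when the page is not noindexed and its canonical points to itself.
function isSelfCanonicalAndIndexable(html: string, url: string): boolean {
  const { robots, canonical } = inspectHead(html)
  const noindex = robots !== undefined && robots.toLowerCase().indexOf("noindex") !== -1
  return !noindex && canonical === url
}
```

The HTML for a URL such as https://ethereum.org/ar/videos/decentralized-social-media/ would be fetched separately and passed in as a string, which keeps the check itself free of network access.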

netlify Bot commented May 7, 2026

Deploy Preview for ethereumorg ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | d5eb1e6 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/ethereumorg/deploys/69fc9148b5985600086baa82 |
| 😎 Deploy Preview | https://deploy-preview-18143.ethereum.it |

Lighthouse (7 paths audited)
Performance: 65 (no change from production)
Accessibility: 96 (no change from production)
Best Practices: 100 (no change from production)
SEO: 98 (🔴 down 1 from production)
PWA: 59 (no change from production)

github-actions bot added the "tooling 🔧" label (Changes related to tooling of the project) May 7, 2026

pettinarip commented May 8, 2026

Edge-case verification (preview vs prod)

| Case | Preview (this PR) | Prod (current) | Result |
| --- | --- | --- | --- |
| AR /videos/decentralized-social-media/ (translated md exists) | canonical → self, hreflang lists 25 valid locales | noindex,follow, canonical → /en/ | ✅ Bug 1 fixed |
| AR /developers/tutorials/gasless-token/ (no AR md) | canonical → /en/, no hreflang | canonical → AR self, 26 hreflang tags | ✅ Bug 2 fixed |
| DE /videos/privacy-is-existential/ (DE has md, FR doesn't) | hreflang has 25 entries; fr correctly absent | n/a | ✅ partial-translation gating works |
| FR /videos/privacy-is-existential/ (FR md missing) | canonical → /en/ | n/a | ✅ correctly falls through to EN |
| AR /wallets/ (pure-intl, no md) | canonical → self, full hreflang | same | ✅ namespace fallback preserved |

Notes:

  • All preview pages show noindex,nofollow because IS_PRODUCTION_DEPLOY=false for deploy previews — the signal we're reading is canonical and hreflang, which still reflect translation status correctly.
  • pnpm test:unit tests/unit/i18n/translation-registry.spec.ts → 6/6 pass.
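The behaviour observed in the table and notes above (canonical, hreflang, and robots all following translation status) can be modelled with a small sketch. `buildSeoTags` and its parameters are hypothetical, the `index,follow` value for translated production pages is an assumption, and any `x-default` entry is left out.

```typescript
interface SeoTags {
  robots: string
  canonical: string
  hreflang: string[] // locales that get an alternate link
}

const SITE_URL = "https://ethereum.org"

function buildSeoTags(
  slug: string, // normalized, e.g. "videos/privacy-is-existential"
  locale: string,
  allLocales: string[],
  isTranslatedIn: (locale: string) => boolean,
  isProductionDeploy: boolean
): SeoTags {
  const translated = isTranslatedIn(locale)

  // Untranslated pages point at the English page and advertise no alternates.
  const canonicalLocale = translated ? locale : "en"
  const canonical = SITE_URL + "/" + canonicalLocale + "/" + slug + "/"
  const hreflang = translated ? allLocales.filter(isTranslatedIn) : []

  // Deploy previews are never indexable, regardless of translation status.
  let robots: string
  if (!isProductionDeploy) {
    robots = "noindex,nofollow"
  } else {
    robots = translated ? "index,follow" : "noindex,follow"
  }
  return { robots, canonical, hreflang }
}

// Illustrative data: DE has markdown for this video, FR does not.
const exampleLocales = ["en", "ar", "de", "fr"]
const translatedLocales = new Set(["en", "ar", "de"])
const hasTranslation = (l: string): boolean => translatedLocales.has(l)

const deTags = buildSeoTags("videos/privacy-is-existential", "de", exampleLocales, hasTranslation, true)
const frTags = buildSeoTags("videos/privacy-is-existential", "fr", exampleLocales, hasTranslation, true)
const previewTags = buildSeoTags("videos/privacy-is-existential", "de", exampleLocales, hasTranslation, false)
```

In this model `deTags` is self-canonical with `fr` absent from its hreflang list, `frTags` falls through to the English canonical with no hreflang, and `previewTags` keeps the same canonical and hreflang as `deTags` while reporting `noindex,nofollow`.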

Sitemap diff (preview vs prod)

Preview: 9,535 <url> entries — Prod: 13,838.

Removed (4,375 paths)

| Group | Count | Attribution |
| --- | --- | --- |
| /&lt;locale&gt;/apps/&lt;slug&gt;/ across 24 non-EN locales | 4,176 | ⚠️ Not the PR: build-time data difference. The preview's getAppsData() returned 7 app slugs (aave, ens, fluidkey, opensea, particle-network, remix, snapshot, zora); prod has ~180. The PR doesn't touch app routing: old and new code both call getAppsData() the same way. Worth a fresh preview build before merge to rule out a stale data-layer snapshot. |
| EN /apps/&lt;slug&gt;/ | 174 | ⚠️ Same data-layer reason as above |
| /&lt;locale&gt;/developers/tutorials/gasless-token/ | 24 | ✅ Bug 2 fix: English-only tutorial no longer claimed in non-EN locales |
| /fr/videos/privacy-is-existential/ | 1 | ✅ Partial-translation gating: FR md missing |

Added (72 paths)

3 routes × 24 non-EN locales — same class as Bug 1: pages with translated markdown but no per-locale namespace JSON. Old resolver dropped them; new content-first resolver picks them up.

  • /<locale>/contributing/translation-program/translatathon/
  • /<locale>/contributing/translation-program/translatathon/details/
  • /<locale>/eth/supply/

TL;DR

PR-attributable diff:

  • −25 paths correctly removed (gasless-token + 1 video)
  • +72 paths correctly added (translatathon, eth/supply)
  • −4,350 /apps/* paths are environmental, not from this PR
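A diff like the one above can be reproduced with a short script. The sketch below assumes the `<loc>` paths of each sitemap have already been extracted into arrays; `diffPaths` and `groupByRoute` are hypothetical helpers, not part of the PR.

```typescript
// Paths present in only one of the two sitemaps.
function diffPaths(
  prod: string[],
  preview: string[]
): { removed: string[]; added: string[] } {
  const prodSet = new Set(prod)
  const previewSet = new Set(preview)
  return {
    removed: prod.filter((p) => !previewSet.has(p)),
    added: preview.filter((p) => !prodSet.has(p)),
  }
}

// Count paths per route after dropping a leading locale segment,
// so "/ar/eth/supply/" and "/fr/eth/supply/" both count toward "/eth/supply/".
function groupByRoute(paths: string[], locales: string[]): Map<string, number> {
  const counts = new Map<string, number>()
  for (const path of paths) {
    const segments = path.split("/").filter((s) => s.length > 0)
    if (segments.length > 0 && locales.indexOf(segments[0]) !== -1) {
      segments.shift()
    }
    const route = segments.length > 0 ? "/" + segments.join("/") + "/" : "/"
    counts.set(route, (counts.get(route) || 0) + 1)
  }
  return counts
}

// Sanity check on the totals reported in this comment.
const prodCount = 13838
const removedCount = 4176 + 174 + 24 + 1 // 4,375 removed paths
const addedCount = 3 * 24 // 72 added paths
const expectedPreviewCount = prodCount - removedCount + addedCount // 9,535
```

The arithmetic is consistent with the reported figures: 13,838 − 4,375 + 72 = 9,535, and the 4,350 environmental paths are the two /apps/ groups (4,176 + 174).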

LGTM 👍

pettinarip merged commit 7c13caa into dev May 8, 2026 (10 checks passed)
pettinarip deleted the fix/sitemap-noindex-translation-registry branch May 8, 2026 14:24
pettinarip mentioned this pull request May 8, 2026