fix: sitemap issues and deduplicate #17639
✅ Deploy Preview for ethereumorg ready!
myelinated-wackerow
left a comment
Well-researched PR that addresses all 5 issues from #17260. A few notes:
Build is failing — deploy preview didn't make it. Worth investigating before further review. Could be the new import of DEV_TOOL_CATEGORY_SLUG_LIST from the data-layer into translationRegistry.ts introducing a build-order or circular dependency issue.
hreflang locale filtering may be too restrictive: isLocaleValidISO639_1 filters the alternates list, but hreflang supports BCP 47 tags (e.g., zh-TW, pt-BR), not just ISO 639-1. Regional locales like zh-tw and pt-br would appear in sitemap URLs but be absent from the alternates.languages block. Worth confirming this is intentional.
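To make the concern concrete, here is a minimal sketch of how a two-letter-only check drops regional locales. The names `isTwoLetterCode` and `filterAlternates` and the sample locale list are hypothetical stand-ins, not the repo's actual `isLocaleValidISO639_1` implementation.

```typescript
// Hypothetical stand-in for an ISO 639-1 style check: exactly two ASCII letters.
const isTwoLetterCode = (locale: string): boolean => /^[a-z]{2}$/i.test(locale);

// Keep only the locales that pass the given predicate.
const filterAlternates = (
  locales: string[],
  isValid: (locale: string) => boolean
): string[] => locales.filter(isValid);

// Sample locales for illustration only.
const siteLocales = ["en", "de", "pt-br", "zh-tw"];

// Regional tags fail the two-letter test, so they drop out of the alternates.
const filtered = filterAlternates(siteLocales, isTwoLetterCode);
console.log(filtered);
```

With this kind of filter, `pt-br` and `zh-tw` would still get sitemap URLs but would never be listed as alternates of any page.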
The dedup approach is cleaner than the old hack — replacing the homepage-specific skip with a generic seenUrls Set is the right call. Removing the unreliable lastModified/priority/changeFrequency fields is also correct per Google's own docs.
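For reference, a generic dedup of this kind can be sketched as below. Only the `seenUrls` name comes from the PR; the `SitemapEntry` shape and the `dedupeEntries` helper are assumptions for illustration and may not match the actual code.

```typescript
// Hypothetical entry shape; the real sitemap entries may carry more fields.
type SitemapEntry = { url: string };

// Keep the first entry for each URL and drop any later repeats.
function dedupeEntries(entries: SitemapEntry[]): SitemapEntry[] {
  const seenUrls = new Set<string>();
  const unique: SitemapEntry[] = [];
  for (const entry of entries) {
    if (seenUrls.has(entry.url)) continue;
    seenUrls.add(entry.url);
    unique.push(entry);
  }
  return unique;
}

const deduped = dedupeEntries([
  { url: "https://ethereum.org/" },
  { url: "https://ethereum.org/de/" },
  { url: "https://ethereum.org/" }, // duplicate homepage entry
]);
console.log(deduped.map((entry) => entry.url));
```

Because the check is keyed on the full URL, it covers any page that ends up emitted twice, not only the homepage.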
Overlap note: PR #17563 addresses a subset of this (homepage only) with a simpler hardcoded approach. This PR supersedes it entirely if the build gets fixed.
Please fix the build and we can take another look.
🤖 Reviewed by Claude / claude-opus-4-6
should be fixed now! thank you for the quick review! :)
myelinated-wackerow
left a comment
Thanks @flatsponge! Appreciate the fixes.
Approving this PR. We've merged the latest dev into this branch -- just waiting on the Netlify build to confirm everything is still green before merging.
Verification against the deploy preview sitemap:
- Homepage (/) is present with full hreflang alternates including x-default
- All 7 /developers/tools/[category]/ pages are present with locale variants
- hreflang alternates use proper BCP 47 tags (pt-br, zh-tw correctly included)
- lastModified, changeFrequency, and priority fields are fully removed (Google ignores these anyway)
- No duplicate /en/ root entry -- the seenUrls dedup works correctly
- URL count is ~6,614 vs production's ~6,857 -- the decrease is expected from deduplication of default-locale entries
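The hreflang shape checked above could be produced along these lines. This is a sketch under stated assumptions: `en` is treated as the default locale served without a path prefix, `x-default` points at the default-locale URL, and `buildAlternates` is a hypothetical helper rather than the function used in metadata.ts.

```typescript
const BASE_URL = "https://ethereum.org"; // host taken from the sitemap URL in the PR description
const DEFAULT_LOCALE = "en"; // assumption: default locale has no /en/ prefix

// Build a hreflang map for one path, including the x-default entry.
function buildAlternates(path: string, locales: string[]): Record<string, string> {
  const languages: Record<string, string> = {};
  for (const locale of locales) {
    const prefix = locale === DEFAULT_LOCALE ? "" : `/${locale}`;
    languages[locale] = `${BASE_URL}${prefix}${path}`;
  }
  // Assumption: x-default falls back to the unprefixed default-locale URL.
  languages["x-default"] = `${BASE_URL}${path}`;
  return languages;
}

const alternates = buildAlternates("/", ["en", "pt-br", "zh-tw"]);
console.log(alternates);
```

Regional tags such as `pt-br` pass straight through as keys, since no language-code filter is applied.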
Architecture notes:
- The shared src/data/developerTools.ts as single source of truth for category slugs is a clean refactor
- The /developers/tools/ namespace mapping addition in translations.ts fixes locale detection for those routes
- Removing the isLocaleValidISO639_1 filter from metadata.ts is correct -- hreflang supports BCP 47, not just ISO 639-1
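As an illustration of the single-source-of-truth idea, one exported slug list can drive every consumer, including sitemap path generation. The slug values below are placeholders (the real seven categories live in src/data/developerTools.ts) and `toolCategoryPaths` is a hypothetical helper.

```typescript
// Placeholder slugs for illustration; not the real category names.
const DEV_TOOL_CATEGORY_SLUG_LIST = ["category-a", "category-b"] as const;

// Derive the sitemap paths from the shared list instead of hardcoding them.
const toolCategoryPaths = (slugs: readonly string[]): string[] =>
  slugs.map((slug) => `/developers/tools/${slug}/`);

const paths = toolCategoryPaths(DEV_TOOL_CATEGORY_SLUG_LIST);
console.log(paths);
```

Adding or renaming a category then only requires touching the shared list.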
This PR supersedes #17563 and #17567 -- both can be closed once this merges.
Reviewed by Claude / claude-opus-4-6
@all-contributors add @flatsponge for code
@flatsponge already contributed before to code

Summary
Fix sitemap so search engines and AI agents can reliably consume https://ethereum.org/sitemap.xml again.
Closes #17260
What changed
- … / for default locale).
- … /developers/tools/[category]/ routes to sitemap generation (all 7 category pages).
- … lastModified (previously deploy-time for every URL), changeFrequency, priority
- … alternates.languages, including x-default.
- … /developers/tools/* to use the correct namespace (page-developers-tools).
Why
- … lastModified made freshness metadata untrustworthy.