fix: prevent dev/staging subdomains from being indexed by search engines #17741

Merged: pettinarip merged 3 commits into dev from fix/seo-noindex-subdomains on Mar 10, 2026

Member

@pettinarip pettinarip commented Mar 9, 2026

Summary

  • Adds X-Robots-Tag: noindex, nofollow HTTP headers in netlify.toml for branch deploys, deploy previews, and the dev/staging branches
  • Sets NEXT_PUBLIC_SITE_URL per branch (dev → https://dev.ethereum.org, staging → https://staging.ethereum.org) so canonical URLs and noindex meta tags resolve correctly at build time
  • Extracts IS_PRODUCTION_DEPLOY constant from SITE_URL hostname check, used by both robots.ts and metadata.ts to gate indexing

Problem

SITE_URL was resolving to https://ethereum.org on dev/staging deploys at SSR runtime because DEPLOY_PRIME_URL (a build-time-only Netlify env var) was unavailable, falling through to URL which always points to the production domain. This caused:

  1. "Duplicate without user-selected canonical": dev/staging pages had canonical URLs pointing to ethereum.org, creating cross-domain duplicate signals
  2. "Indexed, though blocked by robots.txt": robots.txt correctly blocked crawling but pages lacked <meta name="robots" content="noindex">, so Google indexed URLs it couldn't crawl
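
The fall-through described above can be sketched as a small fallback chain evaluated against two environments. This is a minimal illustration, not the repository's code: the function name `resolveSiteUrl`, the `Env` type, and the sample `DEPLOY_PRIME_URL` value are assumptions, and only the variable names come from this PR.

```typescript
// Hypothetical sketch of the SITE_URL fallback chain; not the repository's actual code.
type Env = Record<string, string | undefined>

const PRODUCTION_URL = "https://ethereum.org"

// Returns the first defined, non-empty value in the chain.
function resolveSiteUrl(env: Env): string {
  return (
    env.NEXT_PUBLIC_SITE_URL ||
    env.DEPLOY_PRIME_URL ||
    env.DEPLOY_URL ||
    env.URL ||
    PRODUCTION_URL
  )
}

// Build time on a branch deploy: DEPLOY_PRIME_URL is present (illustrative value).
const buildTimeEnv: Env = {
  DEPLOY_PRIME_URL: "https://dev--example.netlify.app",
  URL: "https://ethereum.org",
}

// SSR runtime as the PR describes it: DEPLOY_PRIME_URL is gone, URL remains.
const runtimeEnv: Env = { URL: "https://ethereum.org" }

const atBuild = resolveSiteUrl(buildTimeEnv)
const atRuntime = resolveSiteUrl(runtimeEnv)
```

Under these assumptions the same code yields a deploy-specific URL at build time and the production URL at runtime, which is why canonical tags on dev/staging pointed at ethereum.org. Setting NEXT_PUBLIC_SITE_URL explicitly short-circuits the chain in both environments.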

Fix layers

| Layer | Mechanism | Protects against |
| --- | --- | --- |
| X-Robots-Tag header | HTTP header via netlify.toml | Indexing of any resource (HTML, PDF, images) |
| NEXT_PUBLIC_SITE_URL per branch | Build-time env var | Wrong canonical URLs, missing noindex meta tags |
| IS_PRODUCTION_DEPLOY check | Code-level hostname comparison | Meta robots tag omission on non-production |
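
As a rough illustration of how the second and third layers combine, the sketch below derives a page's canonical URL and robots meta value from a site URL and a production flag. The helper `buildPageSeo` and the `PageSeo` shape are hypothetical; the real logic in robots.ts and metadata.ts is not shown on this page and may differ.

```typescript
// Hypothetical helper; the repository's metadata.ts is not shown in this PR page.
interface PageSeo {
  canonical: string
  robots: string | null // content of <meta name="robots">, or null to omit the tag
}

function buildPageSeo(
  siteUrl: string,
  path: string,
  isProductionDeploy: boolean
): PageSeo {
  // Strip a trailing slash from the site URL so the join never yields "//".
  const base = siteUrl.replace(/\/$/, "")
  return {
    canonical: base + path,
    // Only production pages are left indexable.
    robots: isProductionDeploy ? null : "noindex, nofollow",
  }
}

const devSeo = buildPageSeo("https://dev.ethereum.org", "/en/", false)
const prodSeo = buildPageSeo("https://ethereum.org/", "/en/", true)
```

The point of the sketch is that both outputs depend on inputs fixed per deploy: a wrong site URL produces a wrong canonical, and a wrong production flag drops the noindex tag.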

Test plan

  • Verify dev.ethereum.org returns X-Robots-Tag: noindex, nofollow header
  • Verify dev.ethereum.org pages have <meta name="robots" content="noindex, nofollow">
  • Verify dev.ethereum.org canonical URLs point to dev.ethereum.org, not ethereum.org
  • Verify staging.ethereum.org has the same protections
  • Verify ethereum.org (production) is unaffected — no noindex, correct canonicals
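
The checks in the test plan could be scripted along these lines. This is a sketch under stated assumptions: `inspectPage`, `PageSnapshot`, and `SeoCheck` are hypothetical names, and the regular expressions assume the attribute order `name` then `content` and `rel` then `href`, which real markup may not follow.

```typescript
// Hypothetical checker for the test plan; names and regexes are illustrative.
interface PageSnapshot {
  xRobotsTag: string | null // value of the X-Robots-Tag response header, if any
  html: string
}

interface SeoCheck {
  headerNoindex: boolean
  metaNoindex: boolean
  canonicalHost: string | null
}

function inspectPage(page: PageSnapshot): SeoCheck {
  // Assumes name="robots" appears before content="..." in the tag.
  const metaRobots = page.html.match(
    /<meta\s+name="robots"\s+content="([^"]*)"/i
  )
  // Assumes rel="canonical" appears before href="..." in the tag.
  const canonical = page.html.match(
    /<link\s+rel="canonical"\s+href="https?:\/\/([^"\/]+)/i
  )
  return {
    headerNoindex: (page.xRobotsTag ?? "").toLowerCase().includes("noindex"),
    metaNoindex: (metaRobots?.[1] ?? "").toLowerCase().includes("noindex"),
    canonicalHost: canonical?.[1] ?? null,
  }
}

const sample = inspectPage({
  xRobotsTag: "noindex, nofollow",
  html:
    '<head><meta name="robots" content="noindex, nofollow">' +
    '<link rel="canonical" href="https://dev.ethereum.org/en/"></head>',
})
```

To run it against a live deploy, one would fetch the page and pass `response.headers.get("x-robots-tag")` and `await response.text()` as the snapshot, then compare `canonicalHost` with the host that was requested.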

netlify Bot commented Mar 9, 2026

Deploy Preview for ethereumorg ready!

Name Link
🔨 Latest commit e0ed629
🔍 Latest deploy log https://app.netlify.com/projects/ethereumorg/deploys/69af0c017ad4f00008bf1e23
😎 Deploy Preview https://deploy-preview-17741.ethereum.it
Lighthouse
7 paths audited
Performance: 55 (🟢 up 2 from production)
Accessibility: 94 (🟢 up 1 from production)
Best Practices: 100 (no change from production)
SEO: 99 (no change from production)
PWA: 59 (no change from production)

github-actions Bot added the config ⚙️ (Changes to configuration files) and tooling 🔧 (Changes related to tooling of the project) labels on Mar 9, 2026
Collaborator

@myelinated-wackerow myelinated-wackerow left a comment

Nice PR -- the layered approach (HTTP header + build-time env + code-level check) is the right strategy here. Two suggestions:

1. Prefer const with IIFE over export let

Minor, but export let allows external reassignment. An IIFE keeps it immutable:

export const IS_PRODUCTION_DEPLOY = (() => {
  try {
    return new URL(SITE_URL).hostname === "ethereum.org"
  } catch {
    return false
  }
})()

2. Code-level protection may leak on arbitrary branch deploys and deploy previews

I'm fairly confident (though not 100%) that for branches other than dev and staging, the code-level layer doesn't fire correctly. Since NEXT_PUBLIC_SITE_URL is only set for those two branches, arbitrary branch deploys and deploy previews would fall through the SITE_URL chain. DEPLOY_PRIME_URL / DEPLOY_URL / URL are build-time-only Netlify vars -- I believe they're unavailable at SSR runtime in Netlify Functions -- so SITE_URL would resolve to "https://ethereum.org", making IS_PRODUCTION_DEPLOY true. That means no noindex meta tag, robots.txt allows crawling, and canonical URLs point to ethereum.org.

The X-Robots-Tag HTTP header from context.branch-deploy / context.deploy-preview does block indexing at the HTTP level, so these deploys aren't unprotected. But the canonical URL leak could still send confusing duplicate signals to crawlers.

One possible fix: rather than deriving IS_PRODUCTION_DEPLOY from SITE_URL, use a dedicated boolean that only production sets:

In netlify.toml:

[context.production.environment]
  NEXT_PUBLIC_IS_PRODUCTION = "true"

In the TypeScript source:

export const IS_PRODUCTION_DEPLOY = process.env.NEXT_PUBLIC_IS_PRODUCTION === "true"

Since it's NEXT_PUBLIC_*, it gets inlined at build time. It defaults to false everywhere except production, and doesn't touch SITE_URL -- so no risk of breaking AB testing, OG images, JSON-LD, or other SITE_URL-dependent functionality.

That said, this is a suggestion -- the HTTP header layer already covers the indexing concern, so this is about closing the canonical URL gap for non-dev/staging deploys. Your call on whether that's worth the extra env var.


Reviewed by Claude Opus 4.6

Use NEXT_PUBLIC_CONTEXT (already inlined at build time via next.config.js)
instead of parsing SITE_URL hostname. This ensures IS_PRODUCTION_DEPLOY is
false on all non-production deploys (branch deploys, deploy previews),
closing the canonical URL leak for arbitrary branch deploys where SITE_URL
would fall through to "https://ethereum.org".
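
A minimal sketch of the context-based check this commit describes, assuming NEXT_PUBLIC_CONTEXT mirrors Netlify's CONTEXT build variable (values such as "production", "deploy-preview", and "branch-deploy"). The function name `isProductionDeploy` is hypothetical; the repository's actual constant is not shown on this page.

```typescript
// Hypothetical sketch; assumes NEXT_PUBLIC_CONTEXT carries Netlify's CONTEXT value.
function isProductionDeploy(context: string | undefined): boolean {
  // Anything other than an explicit "production", including undefined,
  // is treated as non-production, so the flag fails closed.
  return context === "production"
}

const onProduction = isProductionDeploy("production")
const onBranchDeploy = isProductionDeploy("branch-deploy")
const onDeployPreview = isProductionDeploy("deploy-preview")
const whenUnset = isProductionDeploy(undefined)
```

Unlike the hostname comparison, this check does not depend on how SITE_URL resolves, so a fall-through to "https://ethereum.org" can no longer flip the flag to true.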
Member Author

@wackerow thanks. I refactored the code to stop using SITE_URL to compute the production flag. It now uses the flag we already have available. Safer and simpler.

Move the SITE_URL fallback chain (NEXT_PUBLIC_SITE_URL → DEPLOY_PRIME_URL
→ DEPLOY_URL → URL) into next.config.js env config so it gets resolved at
build time and inlined by webpack. This ensures SSR pages on deploy
previews and branch deploys use the correct deploy-specific URL instead of
falling through to "https://ethereum.org".
@pettinarip pettinarip merged commit 48e5115 into dev Mar 10, 2026
6 checks passed
@pettinarip pettinarip deleted the fix/seo-noindex-subdomains branch March 10, 2026 16:02
@wackerow wackerow mentioned this pull request Mar 13, 2026