
i18n: translation pipeline (all languages)#18273

Merged
wackerow merged 25 commits into dev from intl/pending-dev on May 25, 2026

@wackerow
Member

Automated Translations

This PR contains translations managed by the intl pipeline.
Each run appends a summary below.


Run: 2026-05-25 10:50:36 UTC

  • Languages: ar, bn, cs, de, es, fr, hi, id, it, ja, ko, mr, pl, pt-br, ru, sw, ta, te, tr, uk, ur, vi, zh-tw, zh
  • Files: 144 (0 MD, 144 JSON)
  • Mode: auto
  • View workflow run

wackerow and others added 25 commits May 25, 2026 12:50
Co-Authored-By: Gemini <gemini@google.com> (×24)
@wackerow wackerow requested a review from pettinarip as a code owner May 25, 2026 10:50

netlify Bot commented May 25, 2026

Deploy Preview for ethereumorg ready!

Name Link
🔨 Latest commit bca78df
🔍 Latest deploy log https://app.netlify.com/projects/ethereumorg/deploys/6a142980e163d4000877baaa
😎 Deploy Preview https://deploy-preview-18273.ethereum.it
Lighthouse
7 paths audited
Performance: 67 (🟢 up 2 from production)
Accessibility: 96 (no change from production)
Best Practices: 100 (no change from production)
SEO: 98 (🔴 down 1 from production)
PWA: 59 (no change from production)
View the detailed breakdown and full score reports


@github-actions github-actions Bot added the translation 🌍 This is related to our Translation Program label May 25, 2026
@github-actions
Contributor

🌐 Translation review started. View progress

Contributor

@claude claude Bot left a comment


Translation Quality Review

PR: #18273
Branch HEAD: bca78df5c5
Languages: 24 (ar, bn, cs, de, es, fr, hi, id, it, ja, ko, mr, pl, pt-br, ru, sw, ta, te, tr, uk, ur, vi, zh-tw, zh)
Files reviewed: 143 JSON files (6 per language across 24 langs, minus 1 for ko which lacks app-subcategories.json)
Scope: Full PR diff (no prior LLM review)
Date: 2026-05-25
Fixes: No fixes applied (review-only)

Summary

This PR updates six JSON files in each of 24 languages (src/intl/{lang}/{app-subcategories,component-whitepaper,page-bug-bounty,page-start,table,template-usecase}.json; ko has no app-subcategories.json). Each translation was compared against the English source by one of 24 parallel per-language sub-agents.

Overall average score: ~9.3/10. Most languages are ship-ready. 11 critical issues total, concentrated in 6 languages.

Summary by Language

| Language | Files | Quality Score | Issues |
| --- | --- | --- | --- |
| ar | 6 | 9.2/10 | 0 critical, 5 warnings |
| bn | 6 | 9.2/10 | 1 critical, 3 warnings |
| cs | 6 | 9.8/10 | 0 critical, 4 warnings |
| de | 6 | 9.0/10 | 0 critical, 4 warnings |
| es | 6 | 9.2/10 | 0 critical, 3 warnings |
| fr | 6 | 9.6/10 | 0 critical, 5 warnings |
| hi | 6 | 9.4/10 | 0 critical, 3 warnings |
| id | 6 | 9.8/10 | 0 critical, 5 warnings |
| it | 6 | 9.4/10 | 0 critical, 7 warnings |
| ja | 6 | 10.0/10 | 0 critical, 0 warnings |
| ko | 5 | 9.4/10 | 0 critical, 1 warning |
| mr | 6 | 8.6/10 | 3 critical, 3 warnings |
| pl | 6 | 9.8/10 | 0 critical, 3 warnings |
| pt-br | 6 | 9.7/10 | 0 critical, 4 warnings |
| ru | 6 | 9.4/10 | 0 critical, 3 warnings |
| sw | 6 | 8.8/10 | 2 critical, 4 warnings |
| ta | 6 | 7.4/10 | 3 critical, 5 warnings |
| te | 6 | 8.6/10 | 1 critical, 5 warnings |
| tr | 6 | 9.8/10 | 0 critical, 2 warnings |
| uk | 6 | 9.0/10 | 1 critical, 4 warnings |
| ur | 6 | 9.6/10 | 0 critical, 4 warnings |
| vi | 6 | 9.4/10 | 0 critical, 3 warnings |
| zh-tw | 6 | 9.2/10 | 0 critical, 3 warnings |
| zh | 6 | 9.4/10 | 0 critical, 2 warnings |

Critical Issues (Must Fix)

| Language | File | Key | Issue |
| --- | --- | --- | --- |
| bn | src/intl/bn/page-bug-bounty.json | page-upgrades-bug-bounty-not-included-li-4 | Acronym JSON-RPC transliterated to Bengali (জেসন-আরপিসি); must stay Latin |
| mr | src/intl/mr/page-bug-bounty.json | page-upgrades-bug-bounty-not-included-li-4 | Acronym JSON-RPC transliterated to Devanagari (जेसॉन-आरपीसी); must stay Latin |
| mr | src/intl/mr/template-usecase.json | template-usecase-dropdown-desci | Acronym DeSci transliterated (डीसाय); must stay Latin |
| mr | src/intl/mr/template-usecase.json | template-usecase-dropdown-refi | Acronym ReFi transliterated (रेफाय); must stay Latin |
| sw | src/intl/sw/page-bug-bounty.json | bug-bounty-faq-q1-content-4 | "client" mistranslated as mteja (customer) instead of computing klijenti |
| sw | src/intl/sw/page-bug-bounty.json | page-upgrades-bug-bounty-severity-critical-li-5 | "clients" mistranslated as wateja (customers) instead of klijenti |
| ta | src/intl/ta/page-bug-bounty.json | page-upgrades-bug-bounty-not-included-li-4 | Acronym JSON-RPC transliterated to Tamil (ஜேசன்-ஆர்பிசி); must stay Latin |
| ta | src/intl/ta/template-usecase.json | template-usecase-dropdown-desci | Acronym DeSci transliterated (டெஸ்சி); must stay Latin |
| ta | src/intl/ta/template-usecase.json | template-usecase-dropdown-refi | Acronym ReFi transliterated (ரெஃபை); must stay Latin |
| te | src/intl/te/page-bug-bounty.json | page-upgrades-bug-bounty-not-included-li-4 | Acronym JSON-RPC transliterated to Telugu (జేసన్-ఆర్‌పీసీ); must stay Latin |
| uk | src/intl/uk/page-bug-bounty.json | page-upgrades-bug-bounty-client-bugs-desc-2 | Inconsistent brand handling. Client names transliterated to Cyrillic but Geth left Latin in same sentence; align by reverting all client names to Latin |

Pattern: The dominant critical class (8 of 11) is acronym transliteration into non-Latin scripts in four Indic languages (bn, mr, ta, te), specifically JSON-RPC in page-bug-bounty.json and DeSci/ReFi in template-usecase.json. This looks like a Crowdin TM artifact that the sanitizer's acronym-protection step likely missed for these specific tokens. Worth investigating as a sanitizer fix.
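As a rough illustration of the kind of check an acronym-protection step could run, the sketch below flags keys where a protected token appears in the English string but not in the translation. It is not the pipeline's actual sanitizer: the function name, the token list, and the sample strings are assumptions made for this example, and it presumes a project policy that these tokens stay in Latin script.

```python
import json

# Tokens this review flagged as "must stay Latin" (assumed list for the example).
PROTECTED_TOKENS = ("JSON-RPC", "DeSci", "ReFi")


def find_dropped_tokens(source: dict, translated: dict, tokens=PROTECTED_TOKENS):
    """Return (key, token) pairs where a token present in the English
    string is missing from the translated string for the same key."""
    dropped = []
    for key, english in source.items():
        target = translated.get(key)
        if target is None:
            continue  # untranslated keys are out of scope for this check
        for token in tokens:
            if token in english and token not in target:
                dropped.append((key, token))
    return dropped


# Hypothetical file contents; only the key name and the mr parenthetical
# come from the findings above.
en = json.loads('{"template-usecase-dropdown-desci": "Decentralized science (DeSci)"}')
mr = {"template-usecase-dropdown-desci": "(डीसाय)"}

print(find_dropped_tokens(en, mr))
```

A check like this only reports candidates; whether a dropped token is an error still depends on the terminology policy for that language.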

Recurring Warnings (Cross-Language)

  1. Acronym AI localized to IA in fr, it, pt-br and to KI in de. Standard local convention in those languages but technically deviates from the "must stay Latin" rule. Worth a project policy decision.
  2. Client brand names transliterated for execution/consensus clients (Besu, Erigon, Nethermind, Reth, Lighthouse, Lodestar, Nimbus, Teku, Prysm) in non-Latin-script languages. Permitted by spec, but several languages mix Latin Geth with transliterated siblings inside the same sentence (page-upgrades-bug-bounty-client-bugs-desc-2). Consistent treatment recommended.
  3. Tone/register inconsistency in page-bug-bounty.json for de (Sie vs du) and es (usted vs tú). Mixed within the same file.
  4. Plural acronym markers dropped (NFTs to NFT, DAOs to DAO, RWAs to RWA) in cs. Idiomatic but worth noting.
  5. Twitter rendered phonetically (推特, ट्विटर, ٹوئٹر, ツイッター). Acceptable per transliteration policy; flagged only because the platform rebranded to X.

Notable per-language observations

  • ja: 10.0/10. Zero critical, zero warnings. All Katakana transliterations correct, no semantic inversions, polite register consistent.
  • ta: 7.4/10. Lowest score. Three different spellings of "Ethereum" inside page-bug-bounty.json (எத்திரியம் 8x, எத்தீரியம் 7x, எத்தேரியம் 1x). Plus the JSON-RPC/DeSci/ReFi transliterations.
  • sw. The two mteja/wateja ("customer") slip-ups are particularly damaging because the rest of the file correctly uses klijenti for the computing sense; an easy mechanical fix.
  • bn, mr, ta, te. Share an identical pattern of JSON-RPC being transliterated into the local script (Bengali, Devanagari, Tamil, Telugu) at the same key (page-upgrades-bug-bounty-not-included-li-4). Suggests the Crowdin TM source for that single string leaked across these locales.
  • No semantic inversions (PoW vs PoS, validator vs miner, mainnet vs testnet) found in any language. No domain transliterations (ethereum.org is intact everywhere). No placeholder corruption.

Auto-fix status

This review ran in GitHub Actions and is not configured to commit fixes back to the PR branch (the workflow does not check out the PR worktree). The above critical issues were not auto-fixed.


To apply fixes, run locally:

/review-translations --pr=18273 --fix

That will reopen the worktree, apply the critical fixes, and stage them for a follow-up commit.


Reviewed by Claude Code. 24 parallel per-language agents, full PR diff scope.

Collaborator

@myelinated-wackerow myelinated-wackerow left a comment


Translation Quality Review (re-evaluated against ETHGlossary)

PR: #18273
Branch HEAD: bca78df5c5
Languages: 24 (ar, bn, cs, de, es, fr, hi, id, it, ja, ko, mr, pl, pt-br, ru, sw, ta, te, tr, uk, ur, vi, zh-tw, zh)
Files reviewed: 143 JSON files (6 per language across 24 langs, minus 1 for ko)
Scope: Full PR diff, ETHGlossary-driven re-evaluation
Date: 2026-05-25
Fixes: No fixes applied (review-only)

Summary

This review re-evaluates the prior automated review against ETHGlossary as the authoritative source for terminology. The prior review's findings were heuristic and did not query ETHGlossary, leading to several false-positive criticals.

Net result: 5 of the prior review's 11 "criticals" are false positives (JSON-RPC in bn, mr, ta, te and client names in uk). The other six are downgraded to warnings: two semantic concerns (Swahili) and four consistency warnings (DeSci/ReFi transliteration in mr and ta).

Overall: approving for merge. No glossary-violation criticals remain. The Swahili semantic issue is documented as a warning for native-speaker review post-merge.

Re-evaluation of prior critical findings

ETHGlossary endpoints used: GET /api/v1/style-guide/{term}, GET /api/v1/translations/{lang}/{term}, POST /api/v1/filter.
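A minimal sketch of how requests to two of these endpoints might be built, shown only to make the path shapes concrete. The host name is a placeholder, the JSON body for POST /api/v1/filter is an assumption apart from the `content` field name (which this review reports as what the API expects), and nothing is sent over the network.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Placeholder host: the review lists only the paths, not where ETHGlossary is served.
BASE_URL = "https://ethglossary.example"


def translation_request(lang: str, term: str) -> Request:
    """Build GET /api/v1/translations/{lang}/{term} without sending it."""
    path = f"/api/v1/translations/{quote(lang, safe='')}/{quote(term, safe='')}"
    return Request(BASE_URL + path, method="GET")


def filter_request(text: str) -> Request:
    """Build POST /api/v1/filter with an assumed JSON body {"content": text}."""
    body = json.dumps({"content": text}).encode("utf-8")
    return Request(
        BASE_URL + "/api/v1/filter",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = translation_request("ta", "JSON-RPC")
print(req.get_method(), req.full_url)
```

Percent-encoding the path segments keeps terms that contain spaces or slashes from changing the URL structure.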

| # | Prior critical | Re-evaluation | Verdict |
| --- | --- | --- | --- |
| 1 | bn JSON-RPC transliterated to জেসন-আরপিসি | ETHGlossary bn entry: "term": "জেসন-আরপিসি" (prose context); exact match | False positive |
| 2 | mr JSON-RPC transliterated to जेसॉन-आरपीसी | ETHGlossary mr entry: "term": "जेसॉन-आरपीसी" (prose context); exact match | False positive |
| 3 | mr DeSci transliterated to डीसाय | ETHGlossary has English entry with scriptRule: "translate" but no per-language mr entry; 6/24 languages transliterate, 18/24 keep Latin | Warning (consistency drift, not glossary violation) |
| 4 | mr ReFi transliterated to रेफाय | Same as DeSci | Warning |
| 5 | sw "client" → mteja (customer) | ETHGlossary has no standalone "client" entry; mteja is the everyday word for "customer" in Swahili. Computing-client sense uncertain without native review | Valid concern; kept as warning per author's call (no confident Swahili replacement available) |
| 6 | sw "clients" → wateja | Same as #5 | Valid concern (warning) |
| 7 | ta JSON-RPC transliterated to ஜேசன்-ஆர்பிசி | ETHGlossary ta entry: "term": "ஜேசன்-ஆர்பிசி"; exact match | False positive |
| 8 | ta DeSci transliterated to டெஸ்சி | Same as #3 | Warning |
| 9 | ta ReFi transliterated to ரெஃபை | Same as #3 | Warning |
| 10 | te JSON-RPC transliterated to జేసన్-ఆర్‌పీసీ | ETHGlossary te entry: "term": "జేసన్-ఆర్‌పీసీ" (confidence: medium, note: "Phonetic transliteration used as per rules."); exact match | False positive |
| 11 | uk inconsistent client-name handling (Geth Latin while siblings Cyrillic) | ETHGlossary uk entries: Besu→Бесу, Erigon→Ерігон, Nethermind→Незермайнд, Reth→Рет, Lighthouse→Лайтхаус, Lodestar→Лодстар, Nimbus→Німбус, Teku→Теку, Prysm→Призм (all Cyrillic), but Go Ethereum (Geth)→Go Ethereum (Geth) (explicitly Latin). The file matches the glossary 10/10. | False positive |

Additional finding (missed by prior review)

| Language | File | Key | Issue |
| --- | --- | --- | --- |
| sw | src/intl/sw/page-bug-bounty.json | page-upgrades-bug-bounty-severity-critical-li-2 | Third "client→mteja" instance; same semantic concern as the two found previously. Kept as warning. |

Bonus: Tamil "Ethereum" three-spelling pattern (called out as observation in prior review)

The prior review noted three spellings of Ethereum in ta/page-bug-bounty.json. After re-checking ETHGlossary, this matches glossary policy:

  • Standalone Ethereum: எத்திரியம் (8 occurrences in file)
  • Ethereum Foundation: எத்தீரியம் அறக்கட்டளை (8 occurrences, all paired with Foundation)
  • Ethereum Mainnet: எத்தேரியம் முதன்மை வலைப்பின்னல் (1 occurrence, paired with Mainnet)

The translation correctly uses each contextual spelling. (Worth a follow-up to verify ETHGlossary's intentional choice of three Tamil spellings; if unintentional on glossary side, that's a glossary fix, not a translation fix.)
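For the follow-up check, a tally like the one below would make the spelling distribution easy to reproduce. This is a sketch under stated assumptions: the helper name and sample keys are hypothetical, and the real input would be the parsed ta/page-bug-bounty.json rather than the three sample strings.

```python
from collections import Counter

# The three Tamil spellings of "Ethereum" listed above.
VARIANTS = ("எத்திரியம்", "எத்தீரியம்", "எத்தேரியம்")


def count_variants(strings: dict, variants=VARIANTS) -> Counter:
    """Count how often each spelling occurs across all translated values."""
    counts = Counter()
    for value in strings.values():
        for variant in variants:
            counts[variant] += value.count(variant)
    return counts


# Hypothetical sample built from the contextual phrases quoted above.
sample = {
    "key-a": VARIANTS[0],
    "key-b": f"{VARIANTS[1]} அறக்கட்டளை",
    "key-c": f"{VARIANTS[2]} முதன்மை வலைப்பின்னல்",
}
counts = count_variants(sample)
for variant in VARIANTS:
    print(variant, counts[variant])
```

Plain substring counting works here because none of the three spellings contains another.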

Consistency warnings (DeSci / ReFi)

template-usecase-dropdown-desci and template-usecase-dropdown-refi across 24 languages:

  • 18 languages keep the parenthetical acronym in Latin: (DeSci) / (ReFi)
  • 6 languages transliterate: bn, hi (ReFi only), mr, ta, te, ur

ETHGlossary has English entries (decentralized-science-desci, regenerative-finance-refi) with scriptRule: "translate" but no per-language entries. Both forms are permissible; the consistent majority pattern would be to keep the parenthetical in Latin for brand recognition. Not a blocking issue — worth a future ETHGlossary entry to make policy unambiguous.
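The Latin versus transliterated split can be reproduced with a small script along these lines. It is a sketch, not part of the review tooling: the function name is made up, the sample holds only the parentheticals (not full translated strings), and "Latin" is approximated as ASCII letters.

```python
import re

# Matches a parenthetical made only of ASCII letters, e.g. "(DeSci)".
LATIN_PAREN = re.compile(r"\(([A-Za-z]+)\)")


def split_by_script(values_by_lang: dict) -> tuple:
    """Return (latin, other): sorted language codes whose string does,
    or does not, contain a Latin-script parenthetical."""
    latin = sorted(
        lang for lang, value in values_by_lang.items() if LATIN_PAREN.search(value)
    )
    other = sorted(lang for lang in values_by_lang if lang not in latin)
    return latin, other


# Parentheticals only; the mr and ta forms are the ones quoted in this review.
sample = {
    "de": "(DeSci)",
    "mr": "(डीसाय)",
    "ta": "(டெஸ்சி)",
}
latin, other = split_by_script(sample)
print(latin, other)
```

Run per key over all 24 language files, a split like this would show at a glance which locales diverge from the majority pattern.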

Semantic spot-checks (PoS / PoW / mainnet / validator)

Spot-checked PoS/PoW translations across 11 high-coverage languages (de, fr, it, ja, ko, pl, ru, tr, vi, zh, zh-tw) in page-bug-bounty.json. No semantic inversions found. Proof-of-stake and proof-of-work are correctly distinguished in every language checked, including languages that translate the full phrase (e.g., ko 지분 증명 (PoS) / 작업증명 (PoW), ja プルーフ・オブ・ステーク (PoS) / プルーフ・オブ・ワーク (PoW)).

Updated quality scores

Removing false-positive criticals raises most scores. Scores below reflect ETHGlossary-grounded evaluation.

| Language | Files | Quality | Issues (after re-eval) |
| --- | --- | --- | --- |
| ar | 6 | 9.4/10 | 0 critical, 5 warnings (unchanged) |
| bn | 6 | 9.6/10 | 0 critical, 4 warnings (JSON-RPC reclassified) |
| cs | 6 | 9.8/10 | 0 critical, 4 warnings (unchanged) |
| de | 6 | 9.0/10 | 0 critical, 4 warnings (unchanged) |
| es | 6 | 9.2/10 | 0 critical, 3 warnings (unchanged) |
| fr | 6 | 9.6/10 | 0 critical, 5 warnings (unchanged) |
| hi | 6 | 9.4/10 | 0 critical, 3 warnings (unchanged) |
| id | 6 | 9.8/10 | 0 critical, 5 warnings (unchanged) |
| it | 6 | 9.4/10 | 0 critical, 7 warnings (unchanged) |
| ja | 6 | 10.0/10 | 0 critical, 0 warnings (unchanged) |
| ko | 5 | 9.4/10 | 0 critical, 1 warning (unchanged) |
| mr | 6 | 9.4/10 | 0 critical, 5 warnings (JSON-RPC reclassified; DeSci/ReFi remain warnings) |
| pl | 6 | 9.8/10 | 0 critical, 3 warnings (unchanged) |
| pt-br | 6 | 9.7/10 | 0 critical, 4 warnings (unchanged) |
| ru | 6 | 9.4/10 | 0 critical, 3 warnings (unchanged) |
| sw | 6 | 8.6/10 | 0 critical, 6 warnings (mteja×3 reclassified to warning pending native review; previously flagged as critical) |
| ta | 6 | 9.0/10 | 0 critical, 7 warnings (JSON-RPC and "3 spellings" both glossary-compliant; DeSci/ReFi remain warnings) |
| te | 6 | 9.0/10 | 0 critical, 5 warnings (JSON-RPC reclassified) |
| tr | 6 | 9.8/10 | 0 critical, 2 warnings (unchanged) |
| uk | 6 | 9.6/10 | 0 critical, 4 warnings (Geth pattern reclassified; glossary-compliant) |
| ur | 6 | 9.6/10 | 0 critical, 4 warnings (unchanged) |
| vi | 6 | 9.4/10 | 0 critical, 3 warnings (unchanged) |
| zh-tw | 6 | 9.2/10 | 0 critical, 3 warnings (unchanged) |
| zh | 6 | 9.4/10 | 0 critical, 2 warnings (unchanged) |

Overall average: ~9.4/10. No glossary-violation criticals across any language.
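The overall average can be rechecked from the per-language scores in the table above. The snippet below is just that arithmetic, using an unweighted mean, which is an assumption about how the reported figure was derived.

```python
from statistics import mean

# Per-language quality scores copied from the re-evaluated table.
scores = {
    "ar": 9.4, "bn": 9.6, "cs": 9.8, "de": 9.0, "es": 9.2, "fr": 9.6,
    "hi": 9.4, "id": 9.8, "it": 9.4, "ja": 10.0, "ko": 9.4, "mr": 9.4,
    "pl": 9.8, "pt-br": 9.7, "ru": 9.4, "sw": 8.6, "ta": 9.0, "te": 9.0,
    "tr": 9.8, "uk": 9.6, "ur": 9.6, "vi": 9.4, "zh-tw": 9.2, "zh": 9.4,
}

overall = mean(scores.values())        # unweighted mean across 24 languages
lowest = min(scores, key=scores.get)   # language with the lowest score

print(round(overall, 1), lowest)
```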

Recommendations beyond this PR

  1. ETHGlossary coverage — Add per-language entries for DeSci, ReFi, and client (standalone) so future reviews have unambiguous policy.
  2. /review-translations slash-command fix — The doc says POST /filter accepts text: but the API actually expects content:. Worth a follow-up doc fix so future agent-run reviews don't silently fail to fetch glossary terms.
  3. Native sw review: page-bug-bounty.json has three occurrences of mteja/wateja for computing "client" that warrant a Swahili-fluent reviewer's call before the next translation push.

Re-reviewed by Claude Code with ETHGlossary as authoritative source. The original review's claude[bot] run did not query ETHGlossary; this review fixes that.

@wackerow wackerow merged commit 2b6a038 into dev May 25, 2026
17 checks passed
@wackerow wackerow deleted the intl/pending-dev branch May 25, 2026 17:08
@pettinarip pettinarip mentioned this pull request May 28, 2026
