i18n: translation pipeline (all languages) #18145
Co-Authored-By: Gemini <gemini@google.com>
✅ Deploy Preview for ethereumorg ready!
🌐 Translation review started.
(Superseded — see the updated review submitted at the same commit d83b890.)
This review entry is a stale artifact from an early heredoc test during the GitHub Actions run. The actual translation quality review is the next review on this PR.
Translation Quality Review
PR: #18145
Branch HEAD: d83b890d62
Languages: ar, bn, cs, de, es, fr, hi, id, it, ja, ko, mr, pl, pt-br, ru, sw, ta, te, tr, uk, ur, vi, zh, zh-tw (24)
Files reviewed: 48 (src/intl/{LANG}/glossary.json + learn-quizzes.json per language)
Date: 2026-05-07
Fixes: No fixes applied (review-only)
Reviewed by 24 parallel sub-agents (one per language, model: sonnet). Scope: full PR diff (no prior LLM review on this PR). Each agent cross-referenced its locale against the English source, the per-language transliteration bank, prior-review knowledge base, and localization-rules-by-language-group.md.
The full per-language critical-issues table and detailed scores are posted as follow-up comments on this PR.
Summary by Language
| Language | Quality Score | Critical | Warnings |
|---|---|---|---|
| ar | 8.4/10 | 2 | 4 |
| bn | 8.5/10 | 2 | 4 |
| cs | 8.6/10 | 1 | 4 |
| de | 8.6/10 | 1 | 2 |
| es | 8.6/10 | 2 | 4 |
| fr | 8.4/10 | 2 | 2 |
| hi | 8.8/10 | 3 | 9 |
| id | 9.6/10 | 0 | 3 |
| it | 8.4/10 | 1 | 4 |
| ja | 9.4/10 | 0 | 3 |
| ko | 8.2/10 | 2 | 3 |
| mr | 9.6/10 | 0 | 4 |
| pl | 9.0/10 | 1 | 4 |
| pt-br | 9.2/10 | 0 | 4 |
| ru | 8.8/10 | 1 | 4 |
| sw | 8.8/10 | 0 | 5 |
| ta | 8.4/10 | 0 | 4 |
| te | 8.4/10 | 1 | 4 |
| tr | 9.0/10 | 1 | 2 |
| uk | 8.8/10 | 0 | 3 |
| ur | 9.2/10 | 0 | 4 |
| vi | 8.0/10 | 2 | 4 |
| zh | 8.4/10 | 2 | 4 |
| zh-tw | 8.2/10 | 4 | 3 |
| Total | avg 8.7 | 28 | 91 |
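The totals row can be re-derived from the per-language rows. The following is a minimal Python sketch with the rows transcribed by hand from the table above; the `ROWS` name and tuple layout are illustrative and not part of the review tooling.

```python
from statistics import mean

# (language, score, critical, warnings), transcribed from the summary table.
ROWS = [
    ("ar", 8.4, 2, 4), ("bn", 8.5, 2, 4), ("cs", 8.6, 1, 4), ("de", 8.6, 1, 2),
    ("es", 8.6, 2, 4), ("fr", 8.4, 2, 2), ("hi", 8.8, 3, 9), ("id", 9.6, 0, 3),
    ("it", 8.4, 1, 4), ("ja", 9.4, 0, 3), ("ko", 8.2, 2, 3), ("mr", 9.6, 0, 4),
    ("pl", 9.0, 1, 4), ("pt-br", 9.2, 0, 4), ("ru", 8.8, 1, 4), ("sw", 8.8, 0, 5),
    ("ta", 8.4, 0, 4), ("te", 8.4, 1, 4), ("tr", 9.0, 1, 2), ("uk", 8.8, 0, 3),
    ("ur", 9.2, 0, 4), ("vi", 8.0, 2, 4), ("zh", 8.4, 2, 4), ("zh-tw", 8.2, 4, 3),
]

# Average score rounded to one decimal, as reported in the Total row.
average_score = round(mean(score for _, score, _, _ in ROWS), 1)
total_critical = sum(critical for _, _, critical, _ in ROWS)
total_warnings = sum(warnings for _, _, _, warnings in ROWS)

# Languages with zero critical issues, in table order.
clean_languages = [lang for lang, _, critical, _ in ROWS if critical == 0]

print(average_score, total_critical, total_warnings)  # 8.7 28 91
print(clean_languages)
```

The recomputed values agree with the Total row (avg 8.7, 28 critical, 91 warnings) and with the eight clean languages listed under Highlights.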
Highlights
- 8 languages clean (0 critical): id, ja, mr, pt-br, sw, ta, uk, ur
- Top scorers: id (9.6), mr (9.6), ja (9.4), pt-br (9.2), ur (9.2)
- Lowest score: vi (8.0) — Ethereum Foundation translated, Oracle calqued
- Major sanitizer escape leak: `it/glossary.json:ommer-definition` contains literal `<HTML-PLACEHOLDER-HTMLTAG-7ff424>...</HTML-PLACEHOLDER-HTMLTAG-7ff424>`, which will render as visible broken text in production
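A leak of this kind could be caught before merge with a simple scan of each locale file. The sketch below is an assumption-laden illustration, not the pipeline's actual check: it assumes the locale JSON is a flat key-to-string object and that leaked tokens always follow the `HTML-PLACEHOLDER-HTMLTAG-<hex>` shape seen in the single example above. The function name `find_placeholder_leaks` and the sample strings are hypothetical.

```python
import json
import re

# Token shape inferred from the one leaked example in the review; the
# lowercase-hex suffix is an assumption based on that example.
PLACEHOLDER_RE = re.compile(r"</?HTML-PLACEHOLDER-HTMLTAG-[0-9a-f]+>")


def find_placeholder_leaks(locale_json: str) -> dict[str, list[str]]:
    """Return {key: [leaked tokens]} for each string value that contains a placeholder."""
    data = json.loads(locale_json)
    leaks = {}
    for key, value in data.items():
        if isinstance(value, str):
            tokens = PLACEHOLDER_RE.findall(value)
            if tokens:
                leaks[key] = tokens
    return leaks


# Made-up sample content; only the key name and token come from the review.
sample = json.dumps({
    "ommer-definition": "text <HTML-PLACEHOLDER-HTMLTAG-7ff424>link</HTML-PLACEHOLDER-HTMLTAG-7ff424> text",
    "ommer-term": "Ommer",
})
leaks = find_placeholder_leaks(sample)
print(leaks)
```

Run over every `src/intl/{LANG}/*.json` file, a non-empty result would be a reasonable signal to fail the check.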
Top recurring pipeline issues
- `(l1)`/`(l2)` lowercase parentheticals on layer terms in hi, mr, uk, bn (Crowdin TM artifact)
- `wei`/`Wei` casing inconsistency in denomination definitions across es, id, it, pt-br
- `HTML-PLACEHOLDER-HTMLTAG-...` token leak in it/glossary.json:ommer-definition (sanitizer bug)
- Word truncation: ko `블록체` (should be `블록체인`), tr `mikt.` (should be `miktarı.`)
- Solidity `assert` keyword translated in cs, te
- Off-bank client transliterations for ru (Nethermind), uk (Lodestar)
- zh / zh-tw: client software brands transliterated per the bank, but Chinese tech-writing convention keeps Latin — project-policy question, not a regression
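Casing inconsistencies such as `wei`/`Wei` lend themselves to a mechanical count. The following is a minimal sketch, assuming a flat key-to-string mapping per locale file; `casing_variants` and the sample strings are hypothetical and not taken from the PR.

```python
import re
from collections import Counter


def casing_variants(strings: dict[str, str], term: str) -> Counter:
    """Count each cased spelling of `term` that appears as a whole word in the values."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    counts = Counter()
    for value in strings.values():
        counts.update(pattern.findall(value))
    return counts


# Made-up sample values; "gwei" must not be counted as an occurrence of "wei".
sample = {
    "gwei-definition": "1 gwei = 10^9 Wei",
    "finney-definition": "1 finney = 10^15 wei",
}
counts = casing_variants(sample, "wei")
print(counts)
```

More than one spelling in the result suggests an inconsistency worth a look, although a sentence-initial capital would show up as a false positive and needs manual review.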
To apply fixes, run locally:
/review-translations --pr=18145 --fix
This review ran from GitHub Actions and is review-only — the CI workflow isn't wired to commit fixes back. A maintainer should weigh in on the zh/zh-tw client-name policy before any auto-fix.
Detailed per-language critical-issues table and per-language summaries are in follow-up comments below.
Reviewed by Claude Code (24 parallel sub-agents, model: sonnet)
Translation Quality Review — Per-Language Critical Issues (detail)
ar (2 critical)
bn (2 critical)
cs (1 critical)
de (1 critical)
es (2 critical)
fr (2 critical)
hi (3 critical)
it (1 critical)
ko (2 critical)
pl (1 critical)
ru (1 critical)
te (1 critical)
tr (1 critical)
vi (2 critical)
zh (2 critical) — policy ambiguity
zh-tw (4 critical) — same policy ambiguity
Follow-up to PR review by Claude Code (24 parallel sub-agents, model: sonnet)
Translation Quality Review — Detailed Scores (1/2: ar–ko)
ar — 8.4/10
Substantially improved over PR #17105 (5.2/10). No oracle-as-fortune-teller, POAP-as-Consumer-Protection-Office, state-as-nation-state, or block-as-barrier corruptions. No cross-script contamination. Only actionable issue is staking="التخزين" (storage) where the documented canonical Arabic term is "التحصيص"; internally consistent across both files but conflicts with the established standard.
bn — 8.5/10
High overall quality; brand transliterations match the bank. The two fixes are: WETH ticker lowercased in
cs — 8.6/10
Solid quality. Critical issue:
de — 8.6/10
No brand-name violations or semantic inversions. Main issue: systematic Sie/du register mixing in
es — 8.6/10
No brand violations or inversions. Two critical fixes: redundant DeSci acronym and "Wei" capitalization. Notable warning: tú/usted register mixing in learn-quizzes, where wallets/security sections use formal "usted" while run-a-node/nfts use informal "tú".
fr — 8.4/10
Two real translation errors:
hi — 8.8/10
Critical: Faroese mislabeled as Farsi (entirely different language) in
id — 9.6/10
No critical issues. Brand names, tickers, hrefs, MDX markup all preserved. Only minor warnings: "Wei" capitalization in denomination definitions, mid-sentence capitalization of "Ketersediaan" in three rollup explanations, "danksharding" lowercased in two scaling entries.
it — 8.4/10
Critical sanitizer escape leak:
ja — 9.4/10
No critical issues. learn-quizzes.json is fully clean. Minor glossary issues only:
ko — 8.2/10
Two notable correctness errors: fraction inversion "3/2" (impossible: > 1) in consensus-definition, and word truncation "블록체" → should be "블록체인" in bridge-definition. Otherwise high-quality.
Follow-up 1/2 to PR review by Claude Code
Translation Quality Review — Detailed Scores (2/2: mr–zh-tw)
mr — 9.6/10
No critical issues. Brand transliterations to Devanagari correct, no domain transliterations, no inversions. Minor: Devanagari numeral "२" appears in two scaling labels where Western numerals are used elsewhere; "Proof-of-Authority" semantically translated while PoS/PoW are kept as transliterations.
pl — 9.0/10
Critical:
pt-br — 9.2/10
No critical issues. Main warning: terminology collision, with both "Encryption" and "Cryptography" rendered as "Criptografia". Minor "wei"/"Wei" and "gas"/"gás" casing/accent inconsistencies.
ru — 8.8/10
Critical: Nethermind transliterated as "Незермайнд" while bank specifies "Недермайнд". Warnings: Lodestar similarly off-bank; "permissionless" → "Общедоступный" (publicly accessible) blurs the no-permission meaning; Web1/Web3 stay Latin but Web2 rendered as "Веб2" within the same sentences.
sw — 8.8/10
No critical issues. Spelling inconsistencies: "tokeni" appears as "tokani" / "tokini" in two keys; "fork" alternates between "mchepuo" and "mchepuko"; "False" rendered both as "Uongo" and "Si kweli".
ta — 8.4/10
No critical issues. Three different "Ethereum" transliterations across the file ("எத்திரியம்", "எத்தேரியம்", "எத்தீரியம்"); bank designates "எத்தேரியம்" as primary; should be normalized.
te — 8.4/10
Critical:
tr — 9.0/10
Strong improvement over PR #17182 (7.7/10). The prior failure modes (Solidity→katillik, client→Müşteri, EHT/BSL transpositions) are absent. One critical:
uk — 8.8/10
No critical issues. learn-quizzes.json is clean. Glossary has the same off-bank Nethermind ("Незермайнд" vs bank "Недермайнд") and Lodestar ("Лодстар" vs "Лоудстар") transliterations as Russian. Lowercase
ur — 9.2/10
No critical issues. learn-quizzes.json is clean. No Hijri calendar usage, no domain transliterations, no ERC-۲۰-style numeral corruption of technical identifiers. Minor mixed-numeral warning on
vi — 8.0/10
The PR #17176 untranslated-content issue is resolved, with full coverage. Two new criticals: "The Ethereum Foundation" translated as "Tổ chức Ethereum" (other vi files preserve the English name); "Oracle" calqued as "Nguồn cấp dữ liệu" (data feed). Also:
zh — 8.4/10
Core terminology (权益证明, 工作量证明, 验证者, 矿工, 状态, 智能合约) is correct semantic-calque throughout. Critical/policy: client software brands (Prysm/Teku/Nimbus/Lighthouse/Lodestar/Besu/Erigon/Nethermind) phonetically transliterated rather than kept Latin, but the transliteration bank itself contains those forms, so this is a project-policy question. Also:
zh-tw — 8.2/10
Same client-name policy issue as zh (4 keys including Uniswap inconsistency). No Simplified-character leakage.
Follow-up 2/2 to PR review by Claude Code
Hand-fixes for review-flagged criticals on this pipeline run, verified against ETHGlossary canonical where an entry exists.

- bn/ens-definition: resolver spelling (resolver vs revolver)
- de/learn-quizzes what-is-ethereum-2-d: lowercase second "Bitcoin" to match the in-text "kleines b" aside
- es/glossary desci-definition: remove redundant (DeSci) parenthetical
- es/glossary wei/finney/gwei/szabo: lowercase inline wei per canonical prose form
- fr/glossary account-definition: "compte externe" matches eoa-term in same file
- fr/learn-quizzes gas-5-d-label: tip = pourboire, not duplicate priority fee
- hi/glossary doge-d-definition: Faroese, not Farsi
- it/glossary ommer-definition: restore sanitizer placeholder to <a href="/glossary/#pow">...</a>
- ko/glossary consensus-definition: 2/3 not 3/2
- ko/glossary bridge-definition: complete truncated word to blockchain
- tr/glossary block-reward-definition: complete truncated word to miktari
- vi/learn-quizzes what-is-ethereum-3-c: brand name "The Ethereum Foundation"

Skipped six review-flagged items where ETHGlossary canonical confirms the existing translation (ar/staking, cs/assert, te/assert, pl/mint, hi/layer-1, hi/layer-2, vi/oracle).

Reverted three pre-applied edits where ETHGlossary canonical contradicted the reviewer (ru/Nethermind, zh-tw/Uniswap x4, bn/weth).

zh/zh-tw client-name transliteration is a project policy question, deferred.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
myelinated-wackerow
left a comment
Translation Quality Review — Approval (Updated)
Follow-up to the prior review at `d83b890d62`. Critical issues resolved at `8ccbb26965`.
After re-verifying every flagged item against ETHGlossary canonical via /api/v1/translations/{lang}/{term}, the prior critical count of 28 resolves to:
- 12 fixed in commit `8ccbb26965`
- 16 confirmed canonical-conformant: translations match the high-confidence ETHGlossary entries; the prior "critical" classification was a false positive
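The re-verification step can be pictured as a lookup followed by a comparison. The sketch below is illustrative only: the path template `/api/v1/translations/{lang}/{term}` comes from this review, while the host `ethglossary.example`, the helper names, and the matching rule (canonical string contained in the shipped translation after Unicode NFC normalization) are assumptions. The response schema is not documented here, so no network call is shown.

```python
import unicodedata
from urllib.parse import quote

# Hypothetical host; the review only gives the path template.
BASE_URL = "https://ethglossary.example"


def canonical_url(lang: str, term: str) -> str:
    """Build the lookup URL from the path template quoted in the review."""
    return f"{BASE_URL}/api/v1/translations/{quote(lang, safe='')}/{quote(term, safe='')}"


def _norm(text: str) -> str:
    # Normalize composed/decomposed forms so visually identical strings compare equal.
    return unicodedata.normalize("NFC", text).strip()


def classify(translation: str, canonical: str) -> str:
    """Assumed rule: conformant when the canonical form appears in the translation."""
    return "canonical-conformant" if _norm(canonical) in _norm(translation) else "needs-fix"


print(canonical_url("ru", "nethermind"))
print(classify("Незермайнд", "Незермайнд"))  # shipped form vs canonical form
print(classify("Недермайнд", "Незермайнд"))  # a differing spelling
```

Case is deliberately left significant in the comparison, since several flagged items in this review (for example `wei` vs `Wei`) differ only in casing.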
Updated summary by language
| Language | Pre-fix | Post-fix | Critical | Warnings |
|---|---|---|---|---|
| ar | 8.4 | 9.0 | 0 | 4 |
| bn | 8.5 | 9.0 | 0 | 4 |
| cs | 8.6 | 9.2 | 0 | 4 |
| de | 8.6 | 9.0 | 0 | 2 |
| es | 8.6 | 9.2 | 0 | 4 |
| fr | 8.4 | 9.0 | 0 | 2 |
| hi | 8.8 | 9.2 | 0 | 9 |
| id | 9.6 | 9.6 | 0 | 3 |
| it | 8.4 | 9.4 | 0 | 4 |
| ja | 9.4 | 9.4 | 0 | 3 |
| ko | 8.2 | 9.2 | 0 | 3 |
| mr | 9.6 | 9.6 | 0 | 4 |
| pl | 9.0 | 9.4 | 0 | 4 |
| pt-br | 9.2 | 9.2 | 0 | 4 |
| ru | 8.8 | 9.4 | 0 | 4 |
| sw | 8.8 | 8.8 | 0 | 5 |
| ta | 8.4 | 8.4 | 0 | 4 |
| te | 8.4 | 9.0 | 0 | 4 |
| tr | 9.0 | 9.4 | 0 | 2 |
| uk | 8.8 | 8.8 | 0 | 3 |
| ur | 9.2 | 9.2 | 0 | 4 |
| vi | 8.0 | 8.8 | 0 | 4 |
| zh | 8.4 | 8.8 | 0 | 4 |
| zh-tw | 8.2 | 9.0 | 0 | 3 |
| Total | 8.7 | 9.1 | 0 | 91 |
What changed
Fixed in 8ccbb26965:
- `bn/glossary.json:ens-definition`: `রিভলভারগুলোর` (revolvers) → `রিজলভারগুলোর` (resolvers)
- `de/learn-quizzes.json:what-is-ethereum-2-d-explanation`: second `Bitcoin` → `bitcoin` to match the in-text "kleines b" aside
- `es/glossary.json:desci-definition`: removed redundant `(DeSci)` parenthetical
- `es/glossary.json`: 4× inline `Wei` → `wei` per ETHGlossary canonical prose form (wei/finney/gwei/szabo)
- `fr/glossary.json:account-definition`: `compte détenu par un tiers` → `compte externe` (semantic correction; matches `eoa-term` in the same file)
- `fr/learn-quizzes.json:gas-5-d-label`: duplicate `frais de priorité` → `pourboire` (restores the English distractor intent)
- `hi/glossary.json:doge-d-definition`: `फ़ारसी` (Farsi) → `फ़ैरोज़` (Faroese)
- `it/glossary.json:ommer-definition`: restored sanitizer placeholder `<HTML-PLACEHOLDER-HTMLTAG-7ff424>` to `<a href="/glossary/#pow">` (would have rendered visible broken text in production)
- `ko/glossary.json:consensus-definition`: `3/2` → `2/3` (math fact)
- `ko/glossary.json:bridge-definition`: `블록체` → `블록체인` (truncation)
- `tr/glossary.json:block-reward-definition`: `mikt.` → `miktarı.` (truncation)
- `vi/learn-quizzes.json:what-is-ethereum-3-c-{label,explanation}`: `Tổ chức Ethereum` → `The Ethereum Foundation` (brand name)
Canonical-conformant (no change needed):
| Item | ETHGlossary canonical | Confidence |
|---|---|---|
| `ar/staking` | التخزين | high |
| `bn/weth` | র্যাপড ইথার (weth), lowercase | high |
| `cs/assert` | asert | medium (English form acknowledged in notes) |
| `hi/layer-1`, `hi/layer-2` | लेयर 1 (l1) / लेयर 2 (l2) | high |
| `pl/mint` | wybijać | high |
| `ru/Nethermind` | Незермайнд | high |
| `te/assert` | నిర్ధారించు | high |
| `vi/oracle` | nguồn cấp dữ liệu | high (notes defend it) |
| `zh-tw/Uniswap` | 尤尼斯瓦普 | high (notes acknowledge English prevalence) |
| `zh/zh-tw/consensus-client`, `execution-client` | Phonetic transliterations | high (match transliteration banks) |
Notes
- The sanitizer placeholder leak in `it/glossary.json` was the only production-blocking item: it would have rendered literal `<HTML-PLACEHOLDER-HTMLTAG-7ff424>...</HTML-PLACEHOLDER-HTMLTAG-7ff424>` on the live site. Worth a tracking issue against the sanitizer's HTML preserve/restore step.
- 91 warnings from the original review remain unaddressed (non-blocking, can ship and follow up).
- zh/zh-tw client-software brand names use phonetic forms per the canonical and transliteration banks. If project policy ever favors keeping Latin for software brands without an official Chinese rendering, that's a coordinated update to the canonical + bank — not a regression here.
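For the tracking issue, it may help to state what a safe preserve/restore round trip looks like. The sketch below is not the pipeline's sanitizer: the `__TAG_n__` token format, the function names, and the fail-loudly behavior are all assumptions chosen for illustration. The idea is that restore should raise rather than ship text when a token is missing or left over.

```python
import re

# Matches opening and closing HTML tags such as <a href="..."> and </a>.
TAG_RE = re.compile(r"</?[A-Za-z][^>]*>")
# Hypothetical token format, unrelated to the real HTML-PLACEHOLDER-HTMLTAG tokens.
TOKEN_RE = re.compile(r"__TAG_\d+__")


def preserve_tags(text: str) -> tuple[str, dict[str, str]]:
    """Replace each HTML tag with an opaque token; return masked text and token->tag map."""
    mapping: dict[str, str] = {}

    def swap(match: re.Match) -> str:
        token = f"__TAG_{len(mapping)}__"
        mapping[token] = match.group(0)
        return token

    return TAG_RE.sub(swap, text), mapping


def restore_tags(translated: str, mapping: dict[str, str]) -> str:
    """Put the original tags back, raising if any token was dropped or survives."""
    for token, tag in mapping.items():
        if token not in translated:
            raise ValueError(f"token dropped during translation: {token}")
        translated = translated.replace(token, tag)
    if TOKEN_RE.search(translated):
        raise ValueError("unrestored token left in output")
    return translated


source = 'See <a href="/glossary/#pow">proof-of-work</a>.'
masked, mapping = preserve_tags(source)
print(masked)
print(restore_tags(masked, mapping) == source)
```

Failing at restore time turns a silent production leak into a pipeline error that names the affected token.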
Reviewed by Claude Code (Opus 4.7), canonical verification against ETHGlossary /api/v1/translations/{lang}/{term}
Automated Translations
This PR contains translations managed by the intl pipeline.
Each run appends a summary below.
Run: 2026-05-07 18:40:11 UTC