Skip to content

i18n: translation pipeline (all languages)#18145

Merged
wackerow merged 28 commits into
devfrom
intl/pending-dev
May 12, 2026
Merged

i18n: translation pipeline (all languages)#18145
wackerow merged 28 commits into
devfrom
intl/pending-dev

Conversation

@wackerow
Copy link
Copy Markdown
Member

@wackerow wackerow commented May 7, 2026

Automated Translations

This PR contains translations managed by the intl pipeline.
Each run appends a summary below.


Run: 2026-05-07 18:40:11 UTC

  • Languages: ar, bn, cs, de, es, fr, hi, id, it, ja, ko, mr, pl, pt-br, ru, sw, ta, te, tr, uk, ur, vi, zh-tw, zh
  • Files: 48 (0 MD, 48 JSON)
  • Mode: auto
  • View workflow run

wackerow and others added 26 commits May 7, 2026 20:39
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
Co-Authored-By: Gemini <gemini@google.com>
@netlify
Copy link
Copy Markdown

netlify Bot commented May 7, 2026

Deploy Preview for ethereumorg ready!

Name Link
🔨 Latest commit 8ccbb26
🔍 Latest deploy log https://app.netlify.com/projects/ethereumorg/deploys/6a03368b5bc1440008ca3ce2
😎 Deploy Preview https://deploy-preview-18145.ethereum.it
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
Lighthouse
Lighthouse
7 paths audited
Performance: 67 (🟢 up 2 from production)
Accessibility: 96 (no change from production)
Best Practices: 100 (no change from production)
SEO: 98 (🔴 down 1 from production)
PWA: 59 (no change from production)
View the detailed breakdown and full score reports

To edit notification comments on pull requests, go to your Netlify project configuration.

@github-actions github-actions Bot added the translation 🌍 This is related to our Translation Program label May 7, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

🌐 Translation review started. View progress

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Superseded — see the updated review submitted at the same commit d83b890.)

This review entry is a stale artifact from an early heredoc test during the GitHub Actions run. The actual translation quality review is the next review on this PR.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Translation Quality Review

PR: #18145
Branch HEAD: d83b890d62
Languages: ar, bn, cs, de, es, fr, hi, id, it, ja, ko, mr, pl, pt-br, ru, sw, ta, te, tr, uk, ur, vi, zh, zh-tw (24)
Files reviewed: 48 (src/intl/{LANG}/glossary.json + learn-quizzes.json per language)
Date: 2026-05-07
Fixes: No fixes applied (review-only)

Reviewed by 24 parallel sub-agents (one per language, model: sonnet). Scope: full PR diff (no prior LLM review on this PR). Each agent cross-referenced its locale against the English source, the per-language transliteration bank, prior-review knowledge base, and localization-rules-by-language-group.md.

The full per-language critical-issues table and detailed scores are posted as follow-up comments on this PR.

Summary by Language

Language Quality Score Critical Warnings
ar 8.4/10 2 4
bn 8.5/10 2 4
cs 8.6/10 1 4
de 8.6/10 1 2
es 8.6/10 2 4
fr 8.4/10 2 2
hi 8.8/10 3 9
id 9.6/10 0 3
it 8.4/10 1 4
ja 9.4/10 0 3
ko 8.2/10 2 3
mr 9.6/10 0 4
pl 9.0/10 1 4
pt-br 9.2/10 0 4
ru 8.8/10 1 4
sw 8.8/10 0 5
ta 8.4/10 0 4
te 8.4/10 1 4
tr 9.0/10 1 2
uk 8.8/10 0 3
ur 9.2/10 0 4
vi 8.0/10 2 4
zh 8.4/10 2 4
zh-tw 8.2/10 4 3
Total avg 8.7 28 91

Highlights

  • 8 languages clean (0 critical): id, ja, mr, pt-br, sw, ta, uk, ur
  • Top scorers: id (9.6), mr (9.6), ja (9.4), pt-br (9.2), ur (9.2)
  • Lowest score: vi (8.0) — Ethereum Foundation translated, Oracle calqued
  • Major sanitizer escape leak: it/glossary.json:ommer-definition contains literal <HTML-PLACEHOLDER-HTMLTAG-7ff424>...</HTML-PLACEHOLDER-HTMLTAG-7ff424> — will render as visible broken text in production

Top recurring pipeline issues

  1. (l1) / (l2) lowercase parentheticals on layer terms in hi, mr, uk, bn (Crowdin TM artifact)
  2. wei / Wei casing inconsistency in denomination definitions across es, id, it, pt-br
  3. HTML-PLACEHOLDER-HTMLTAG-... token leak in it/glossary.json:ommer-definition (sanitizer bug)
  4. Word truncation: ko 블록체 (should be 블록체인), tr mikt. (should be miktarı.)
  5. Solidity assert keyword translated in cs, te
  6. Off-bank client transliterations for ru (Nethermind), uk (Lodestar)
  7. zh / zh-tw: client software brands transliterated per the bank, but Chinese tech-writing convention keeps Latin — project-policy question, not a regression

To apply fixes, run locally:

/review-translations --pr=18145 --fix

This review ran from GitHub Actions and is review-only — the CI workflow isn't wired to commit fixes back. A maintainer should weigh in on the zh/zh-tw client-name policy before any auto-fix.

Detailed per-language critical-issues table and per-language summaries are in follow-up comments below.


Reviewed by Claude Code (24 parallel sub-agents, model: sonnet)

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 7, 2026

Translation Quality Review — Per-Language Critical Issues (detail)

Follow-up to the PR review submitted at d83b890d62. The headline summary table is in the review; this comment expands the per-language critical-issue tables.

ar (2 critical)

File Key Issue Current → Expected
glossary.json staking-term Wrong canonical term: "التخزين" (storage) used; per the knowledge-base glossary doc the canonical Arabic term is "التحصيص" التخزين → التحصيص
learn-quizzes.json many *staking* / *staker* keys Same staking term issue propagates throughout the quiz file التخزين/المخزنون → التحصيص/المحصصون

bn (2 critical)

File Key Issue Current → Expected
glossary.json wrapped-token-definition Ticker WETH lowercased inside link text র‍্যাপড ইথার (weth)র‍্যাপড ইথার (WETH)
glossary.json ens-definition Semantic mistranslation: "resolvers" → "রিভলভারগুলোর" (revolvers/firearms) instead of "রিজলভারগুলোর" (DNS/contract resolvers) রিভলভারগুলোর → রিজলভারগুলোর

cs (1 critical)

File Key Issue Current → Expected
glossary.json assert-term Solidity programming keyword translated; the body uses assert() correctly in backticks but the term heading was localized "asert" → "assert"

de (1 critical)

File Key Issue Current → Expected
learn-quizzes.json what-is-ethereum-2-d-explanation Coin "Bitcoin" capitalized where English uses lowercase "bitcoin" — directly contradicts the very point the explanation is making about uppercase-B (the protocol) vs lowercase-b (the currency) "..., Bitcoin (kleines b) ist ihre native Kryptowährung" → "..., bitcoin (kleines b) ist ihre native Kryptowährung"

es (2 critical)

File Key Issue Current → Expected
glossary.json desci-definition Acronym repeats redundantly DeSci, o ciencia descentralizada (DeSci), es un movimiento...DeSci, o ciencia descentralizada, es un movimiento...
glossary.json wei-definition, finney-definition, gwei-definition, szabo-definition "Wei" capitalized inline where English uses lowercase wei; conflicts with internal #wei anchor target ... Wei = 1 ether... wei = 1 ether

fr (2 critical)

File Key Issue Current → Expected
learn-quizzes.json gas-5-d-label "tip" rendered as "frais de priorité" — duplicates priority-fee. Label reads "priority fee + priority fee" Frais de base + frais de priorité + frais de prioritéFrais de base + frais de priorité + pourboire
glossary.json account-definition EOA described as "compte détenu par un tiers" (third-party owned) — opposite of the user-controlled meaning. The eoa-term entry itself uses the correct "Compte externe" compte détenu par un tiers (EOA)compte externe (EOA)

hi (3 critical)

File Key Issue Current → Expected
glossary.json doge-d-definition "Faroese" mistranslated as "फ़ारसी" (Farsi/Persian) — entirely different language family फ़ारसी → फ़ैरोज़
glossary.json layer-1-term Lowercase parenthetical "(l1)" appended; should be omitted or uppercase लेयर 1 (l1)लेयर 1 or लेयर 1 (L1)
glossary.json layer-2-term Same issue लेयर 2 (l2)लेयर 2 or लेयर 2 (L2)

it (1 critical)

File Key Issue Current → Expected
glossary.json ommer-definition Sanitizer placeholder leaked into output — raw token will render as visible broken text in production. The HTML preserve/restore step left the literal placeholder instead of the original <a href> anchor <HTML-PLACEHOLDER-HTMLTAG-7ff424>Prova di lavoro (PoW)</HTML-PLACEHOLDER-HTMLTAG-7ff424> → restore <a href="/glossary/#pow">Prova di lavoro (PoW)</a>

ko (2 critical)

File Key Issue Current → Expected
glossary.json consensus-definition Fraction inversion: "3/2" (impossible — > 1) instead of "2/3" — the consensus claim "more than 3/2 of nodes" is mathematically impossible 네트워크에 있는 컴퓨터의 3/2 이상이... 2/3 이상이
glossary.json bridge-definition Word truncated mid-character: "블록체인" cut to "블록체" 블록체 브릿지는블록체인 브릿지는

pl (1 critical)

File Key Issue Current → Expected
glossary.json mint-term Term-heading rendered as infinitive verb "Wybijać" instead of noun/gerund (other term entries use noun forms) "Wybijać" → "Wybijanie"

ru (1 critical)

File Key Issue Current → Expected
glossary.json execution-client-definition Nethermind transliteration deviates from authoritative bank "Незермайнд" → "Недермайнд" (per transliterations/ru.json)

te (1 critical)

File Key Issue Current → Expected
glossary.json assert-term Solidity reserved keyword semantically translated as "నిర్ధారించు" (confirm/verify); the definition body uses assert() correctly in backticks "నిర్ధారించు" → "assert" or "అసర్ట్"

tr (1 critical)

File Key Issue Current → Expected
glossary.json block-reward-definition Definition cut off mid-word — "miktarı" truncated to "mikt." verilen Ether mikt.verilen Ether miktarı.

vi (2 critical)

File Key Issue Current → Expected
learn-quizzes.json what-is-ethereum-3-c-label (& -explanation) Brand name "The Ethereum Foundation" translated as "Tổ chức Ethereum"; site-wide convention (other vi files like common.json) preserves the English name "Tổ chức Ethereum" → "The Ethereum Foundation"
glossary.json oracle-term (& oracle-definition body) Established blockchain technical term "Oracle" calqued as "Nguồn cấp dữ liệu" (data feed) — narrows meaning and breaks consistency with the link slug /developers/docs/oracles/ "Nguồn cấp dữ liệu" → "Oracle"

zh (2 critical) — policy ambiguity

File Key Issue Current → Expected
glossary.json consensus-client-definition Software product brands phonetically transliterated; standard zh tech-writing convention for brands without an official Chinese name is to keep Latin 普莱斯姆、泰库、尼姆巴斯、莱特豪斯、洛德斯塔 → Prysm、Teku、Nimbus、Lighthouse、Lodestar
glossary.json execution-client-definition Same pattern 贝苏、埃里贡、奈瑟曼德 → Besu、Erigon、Nethermind

zh-tw (4 critical) — same policy ambiguity

File Key Issue Current → Expected
glossary.json consensus-client-definition Brand names transliterated 普萊斯姆、泰庫、寧布斯、萊特豪斯、洛德斯塔 → Prysm、Teku、Nimbus、Lighthouse、Lodestar
glossary.json execution-client-definition Brand names transliterated 貝蘇、艾瑞貢、奈瑟邁 → Besu、Erigon、Nethermind
glossary.json liquidity-definition Uniswap inconsistently rendered (other entry in same file uses Latin "Uniswap") 尤尼斯瓦普 (Uniswap) → Uniswap
learn-quizzes.json defi-3-a-label, defi-3-a-explanation, staking-solo-4-d-label Same Uniswap rendering 尤尼斯瓦普 → Uniswap

Note on zh / zh-tw client transliterations: The agents flagged these as critical, but the transliterations/zh.json and transliterations/zh-tw.json banks themselves contain the phonetic forms that were used. So this is a project-policy question, not a regression introduced by this PR: should the Chinese transliteration banks list software product names at all, or should the pipeline skip transliteration for those brands? Worth a maintainer decision before any auto-fix.

follow-up to PR review by Claude Code (24 parallel sub-agents, model: sonnet)

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 7, 2026

Translation Quality Review — Detailed Scores (1/2: ar–ko)

Follow-up to the PR review at d83b890d62. The headline summary table is in the review; this comment expands per-language quality breakdown for ar–ko (mr–zh-tw in the next comment).

ar — 8.4/10
Category Score
Brand Name Preservation 10/10
Technical Accuracy 8/10
Semantic Fidelity 9/10
Terminology Consistency 6/10
Tone/Register 9/10

Substantially improved over PR #17105 (5.2/10). No oracle-as-fortune-teller, POAP-as-Consumer-Protection-Office, state-as-nation-state, or block-as-barrier corruptions. No cross-script contamination. Only actionable issue is staking="التخزين" (storage) where the documented canonical Arabic term is "التحصيص"; internally consistent across both files but conflicts with the established standard.

bn — 8.5/10
Category Score
Brand Name Preservation 9.5/10
Technical Accuracy 8/10
Semantic Fidelity 8.5/10
Terminology Consistency 8/10
Tone/Register 9/10

High overall quality; brand transliterations match the bank. The two fixes are: WETH ticker lowercased in wrapped-token-definition, and "resolvers" mistranslated as "রিভলভারগুলোর" (revolvers) in ens-definition. Cross-file inconsistency: "(l2)" parenthetical added to learn-quizzes only.

cs — 8.6/10
Category Score
Brand Name Preservation 9/10
Technical Accuracy 9/10
Semantic Fidelity 9/10
Terminology Consistency 7/10
Tone/Register 9/10

Solid quality. Critical issue: assert-term rendered as "asert" — code keyword should stay literal. Recurring "gas" vs "plyn" mixing across compound terms and quiz items; settle on one convention (industry preference: keep "gas").

de — 8.6/10
Category Score
Brand Name Preservation 10/10
Technical Accuracy 9/10
Semantic Fidelity 9/10
Terminology Consistency 8/10
Tone/Register 7/10

No brand-name violations or semantic inversions. Main issue: systematic Sie/du register mixing in learn-quizzes.json — UI chrome and several sections use informal "du" while wallets, security, defi, stablecoins, and daos sections use formal "Sie". Should be unified to one register (preferably "du" to match the surrounding UI). Plus one Bitcoin/bitcoin capitalization slip.

es — 8.6/10
Category Score
Brand Name Preservation 10/10
Technical Accuracy 9/10
Semantic Fidelity 9/10
Terminology Consistency 8/10
Tone/Register 7/10

No brand violations or inversions. Two critical fixes: redundant DeSci acronym and "Wei" capitalization. Notable warning: tú/usted register mixing in learn-quizzes — wallets/security sections use formal "usted" while run-a-node/nfts use informal "tú".

fr — 8.4/10
Category Score
Brand Name Preservation 10/10
Technical Accuracy 7/10
Semantic Fidelity 8/10
Terminology Consistency 8/10
Tone/Register 9/10

Two real translation errors: gas-5-d-label collapses "tip" into "frais de priorité" (duplicates the priority-fee term) and account-definition describes EOA as "compte détenu par un tiers" (third-party-owned) — opposite of intended meaning. "Slashing" → "réduction" (commercial discount) is a recurring ambiguity worth monitoring.

hi — 8.8/10
Category Score
Brand Name Preservation 9/10
Technical Accuracy 8/10
Semantic Fidelity 9/10
Terminology Consistency 9/10
Tone/Register 9/10

Critical: Faroese mislabeled as Farsi (entirely different language) in doge-d-definition. Lowercase (l1)/(l2) parentheticals on layer terms. No domain transliterations, no security issues. Brand transliterations correct per bank.

id — 9.6/10
Category Score
Brand Name Preservation 10/10
Technical Accuracy 9/10
Semantic Fidelity 10/10
Terminology Consistency 9/10
Tone/Register 10/10

No critical issues. Brand names, tickers, hrefs, MDX markup all preserved. Only minor warnings: "Wei" capitalization in denomination definitions, mid-sentence capitalization of "Ketersediaan" in three rollup explanations, "danksharding" lowercased in two scaling entries.

it — 8.4/10
Category Score
Brand Name Preservation 9/10
Technical Accuracy 8/10
Semantic Fidelity 9/10
Terminology Consistency 7/10
Tone/Register 9/10

Critical sanitizer escape leak: ommer-definition contains literal <HTML-PLACEHOLDER-HTMLTAG-7ff424>...</HTML-PLACEHOLDER-HTMLTAG-7ff424> instead of an <a href> anchor — will render as visible broken text. Also: redundant Italian article "il The Merge" in 6 of ~16 occurrences; "Gwei"/"Wei" capitalization where English uses lowercase; mixed "Prova di lavoro" / "Proof-of-Work" inline.

ja — 9.4/10
Category Score
Brand Name Preservation 10/10
Technical Accuracy 9/10
Semantic Fidelity 9/10
Terminology Consistency 9/10
Tone/Register 10/10

No critical issues. learn-quizzes.json is fully clean. Minor glossary issues only: apr-definition parenthetical "年換算利回り" (annualized yield/APY) doesn't match APR's borrowing-cost meaning; validator-lifecycle-term uses "バリデーター" while 24 other occurrences use "バリデータ"; Teku transliteration differs from bank.

ko — 8.2/10
Category Score
Brand Name Preservation 9/10
Technical Accuracy 7/10
Semantic Fidelity 8/10
Terminology Consistency 8/10
Tone/Register 9/10

Two notable correctness errors: fraction inversion "3/2" (impossible — > 1) in consensus-definition, and word truncation "블록체" → should be "블록체인" in bridge-definition. Otherwise high-quality.

follow-up 1/2 to PR review by Claude Code

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 7, 2026

Translation Quality Review — Detailed Scores (2/2: mr–zh-tw)

Continuation of detailed per-language scores. Part 1 (ar–ko) is in the previous comment.

mr — 9.6/10
Category Score
Brand Name Preservation 10/10
Technical Accuracy 9/10
Semantic Fidelity 10/10
Terminology Consistency 9/10
Tone/Register 10/10

No critical issues. Brand transliterations to Devanagari correct, no domain transliterations, no inversions. Minor: Devanagari numeral "२" appears in two scaling labels where Western numerals are used elsewhere; "Proof-of-Authority" semantically translated while PoS/PoW are kept as transliterations.

pl — 9.0/10
Category Score
Brand Name Preservation 10/10
Technical Accuracy 9/10
Semantic Fidelity 9/10
Terminology Consistency 8/10
Tone/Register 9/10

Critical: mint-term uses infinitive verb "Wybijać" instead of gerund "Wybijanie" for the term heading. Warnings: scaling-2 label register inconsistency (3rd-person plural vs imperative within the same question), "Sekwenser"/"sekwencer" cross-file inconsistency.

pt-br — 9.2/10
Category Score
Brand Name Preservation 10/10
Technical Accuracy 9/10
Semantic Fidelity 10/10
Terminology Consistency 8/10
Tone/Register 9/10

No critical issues. Main warning: terminology collision — both "Encryption" and "Cryptography" render as "Criptografia". Minor "wei"/"Wei" and "gas"/"gás" casing/accent inconsistencies.

ru — 8.8/10
Category Score
Brand Name Preservation 8/10
Technical Accuracy 9/10
Semantic Fidelity 9/10
Terminology Consistency 8/10
Tone/Register 10/10

Critical: Nethermind transliterated as "Незермайнд" while bank specifies "Недермайнд". Warnings: Lodestar similarly off-bank; "permissionless" → "Общедоступный" (publicly accessible) blurs the no-permission meaning; Web1/Web3 stay Latin but Web2 rendered as "Веб2" within the same sentences.

sw — 8.8/10
Category Score
Brand Name Preservation 10/10
Technical Accuracy 9/10
Semantic Fidelity 9/10
Terminology Consistency 7/10
Tone/Register 9/10

No critical issues. Spelling inconsistencies: "tokeni" appears as "tokani" / "tokini" in two keys; "fork" alternates between "mchepuo" and "mchepuko"; "False" rendered both as "Uongo" and "Si kweli". frontier-term translated as "Eneo Jipya" rather than kept as the Ethereum dev-stage code name "Frontier".

ta — 8.4/10
Category Score
Transliteration accuracy 8/10
Ticker preservation 10/10
Domain preservation 10/10
HTML/MDX integrity 10/10
Cross-script 10/10
Inversion check 10/10
Terminology consistency 7/10

No critical issues. Three different "Ethereum" transliterations across the file ("எத்திரியம்", "எத்தேரியம்", "எத்தீரியம்") — bank designates "எத்தேரியம்" as primary; should be normalized. geth-term expanded beyond source label.

te — 8.4/10
Category Score
Brand Name Preservation 9/10
Technical Accuracy 7/10
Semantic Fidelity 9/10
Terminology Consistency 8/10
Tone/Register 9/10

Critical: assert-term semantically translated as "నిర్ధారించు" (confirm/verify) — Solidity reserved keyword should stay literal or be transliterated. Otherwise complete coverage, no cross-script issues, no inversions.

tr — 9.0/10
Category Score
Brand Name Preservation 10/10
Technical Accuracy 8/10
Semantic Fidelity 9/10
Terminology Consistency 9/10
Tone/Register 9/10

Strong improvement over PR #17182 (7.7/10). The prior failure modes (Solidity→katillik, client→Müşteri, EHT/BSL transpositions) are absent. One critical: block-reward-definition truncated to "mikt." (should be "miktarı."). Stylistic note: "sabitcoin" used consistently where the project guideline is "sabit para".

uk — 8.8/10
Category Score
Brand Name Preservation 9/10
Technical Accuracy 9/10
Semantic Fidelity 9/10
Terminology Consistency 8/10
Tone/Register 9/10

No critical issues. learn-quizzes.json is clean. Glossary has the same off-bank Nethermind ("Незермайнд" vs bank "Недермайнд") and Lodestar ("Лодстар" vs "Лоудстар") transliterations as Russian. Lowercase (l2) parenthetical added in two definitions.

ur — 9.2/10
Category Score
Brand Name Preservation 10/10
Technical Accuracy 9/10
Semantic Fidelity 9/10
Terminology Consistency 9/10
Tone/Register 9/10

No critical issues. learn-quizzes.json is clean. No Hijri calendar usage, no domain transliterations, no ERC-۲۰-style numeral corruption of technical identifiers. Minor mixed-numeral warning on 51%-attack-term (Urdu "۵۱٪" in label, Western numerals elsewhere in the entry).

vi — 8.0/10
Category Score
Brand Name Preservation 7/10
Technical Accuracy 8/10
Semantic Fidelity 9/10
Terminology Consistency 7/10
Tone/Register 9/10

The PR #17176 untranslated-content issue is resolved — full coverage. Two new criticals: "The Ethereum Foundation" translated as "Tổ chức Ethereum" (other vi files preserve the English name); "Oracle" calqued as "Nguồn cấp dữ liệu" (data feed). Also: gas-5-d-label loses pedagogical distinction by collapsing both "priority fee" and "tip" to "phí ưu tiên".

zh — 8.4/10
Category Score
Brand Name Preservation 6/10
Technical Accuracy 9/10
Semantic Fidelity 9/10
Terminology Consistency 9/10
Tone/Register 9/10

Core terminology (权益证明, 工作量证明, 验证者, 矿工, 状态, 智能合约) is correct semantic-calque throughout. Critical/policy: client software brands (Prysm/Teku/Nimbus/Lighthouse/Lodestar/Besu/Erigon/Nethermind) phonetically transliterated rather than kept Latin — but the transliteration bank itself contains those forms, so this is a project-policy question. Also: retry key adds "率:" (rate + colon) not present in English source.

zh-tw — 8.2/10
Category Score
Brand Name Preservation 6/10
Technical Accuracy 9/10
Semantic Fidelity 9/10
Terminology Consistency 8/10
Tone/Register 9/10

Same client-name policy issue as zh (4 keys including Uniswap inconsistency). No Simplified-character leakage. casper-ffg-term expanded with full calque "Casper 友善最終性小工具" rather than keeping the acronym; homestead-term transliterated while frontier-term calqued — same category should be treated uniformly.

follow-up 2/2 to PR review by Claude Code

@ajarcoronet5415-commits

This comment was marked as spam.

@wackerow
Copy link
Copy Markdown
Member Author

Cerberus, block ajarcoronet5415-commits

myelinated-wackerow and others added 2 commits May 12, 2026 06:56
Hand-fixes for review-flagged criticals on this pipeline run, verified against ETHGlossary canonical where an entry exists.

- bn/ens-definition: resolver spelling (resolver vs revolver)
- de/learn-quizzes what-is-ethereum-2-d: lowercase second "Bitcoin" to match the in-text "kleines b" aside
- es/glossary desci-definition: remove redundant (DeSci) parenthetical
- es/glossary wei/finney/gwei/szabo: lowercase inline wei per canonical prose form
- fr/glossary account-definition: "compte externe" matches eoa-term in same file
- fr/learn-quizzes gas-5-d-label: tip = pourboire, not duplicate priority fee
- hi/glossary doge-d-definition: Faroese, not Farsi
- it/glossary ommer-definition: restore sanitizer placeholder to <a href="/glossary/#pow">...</a>
- ko/glossary consensus-definition: 2/3 not 3/2
- ko/glossary bridge-definition: complete truncated word to blockchain
- tr/glossary block-reward-definition: complete truncated word to miktari
- vi/learn-quizzes what-is-ethereum-3-c: brand name "The Ethereum Foundation"

Skipped six review-flagged items where ETHGlossary canonical confirms the existing translation (ar/staking, cs/assert, te/assert, pl/mint, hi/layer-1, hi/layer-2, vi/oracle). Reverted three pre-applied edits where ETHGlossary canonical contradicted the reviewer (ru/Nethermind, zh-tw/Uniswap x4, bn/weth). zh/zh-tw client-name transliteration is a project policy question, deferred.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
Copy link
Copy Markdown
Collaborator

@myelinated-wackerow myelinated-wackerow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Translation Quality Review — Approval (Updated)

Follow-up to the prior review at d83b890d62. Critical issues resolved at 8ccbb26965.

After re-verifying every flagged item against ETHGlossary canonical via /api/v1/translations/{lang}/{term}, the prior critical count of 28 resolves to:

  • 12 fixed in commit 8ccbb26965
  • 16 confirmed canonical-conformant — translations match the high-confidence ETHGlossary entries; the prior "critical" classification was a false positive

Updated summary by language

Language Pre-fix Post-fix Critical Warnings
ar 8.4 9.0 0 4
bn 8.5 9.0 0 4
cs 8.6 9.2 0 4
de 8.6 9.0 0 2
es 8.6 9.2 0 4
fr 8.4 9.0 0 2
hi 8.8 9.2 0 9
id 9.6 9.6 0 3
it 8.4 9.4 0 4
ja 9.4 9.4 0 3
ko 8.2 9.2 0 3
mr 9.6 9.6 0 4
pl 9.0 9.4 0 4
pt-br 9.2 9.2 0 4
ru 8.8 9.4 0 4
sw 8.8 8.8 0 5
ta 8.4 8.4 0 4
te 8.4 9.0 0 4
tr 9.0 9.4 0 2
uk 8.8 8.8 0 3
ur 9.2 9.2 0 4
vi 8.0 8.8 0 4
zh 8.4 8.8 0 4
zh-tw 8.2 9.0 0 3
Total 8.7 9.1 0 91

What changed

Fixed in 8ccbb26965:

  • bn/glossary.json:ens-definitionরিভলভারগুলোর (revolvers) → রিজলভারগুলোর (resolvers)
  • de/learn-quizzes.json:what-is-ethereum-2-d-explanation — second Bitcoinbitcoin to match the in-text "kleines b" aside
  • es/glossary.json:desci-definition — removed redundant (DeSci) parenthetical
  • es/glossary.json — 4× inline Weiwei per ETHGlossary canonical prose form (wei/finney/gwei/szabo)
  • fr/glossary.json:account-definitioncompte détenu par un tierscompte externe (semantic correction; matches eoa-term in the same file)
  • fr/learn-quizzes.json:gas-5-d-label — duplicate frais de prioritépourboire (restores the English distractor intent)
  • hi/glossary.json:doge-d-definitionफ़ारसी (Farsi) → फ़ैरोज़ (Faroese)
  • it/glossary.json:ommer-definition — restored sanitizer placeholder <HTML-PLACEHOLDER-HTMLTAG-7ff424> to <a href="/glossary/#pow"> (would have rendered visible broken text in production)
  • ko/glossary.json:consensus-definition3/22/3 (math fact)
  • ko/glossary.json:bridge-definition블록체블록체인 (truncation)
  • tr/glossary.json:block-reward-definitionmikt.miktarı. (truncation)
  • vi/learn-quizzes.json:what-is-ethereum-3-c-{label,explanation}Tổ chức EthereumThe Ethereum Foundation (brand name)

Canonical-conformant (no change needed):

Item ETHGlossary canonical Confidence
ar/staking التخزين high
bn/weth র্যাপড ইথার (weth) — lowercase high
cs/assert asert medium (English form acknowledged in notes)
hi/layer-1, hi/layer-2 लेयर 1 (l1) / लेयर 2 (l2) high
pl/mint wybijać high
ru/Nethermind Незермайнд high
te/assert నిర్ధారించు high
vi/oracle nguồn cấp dữ liệu high (notes defend it)
zh-tw/Uniswap 尤尼斯瓦普 high (notes acknowledge English prevalence)
zh/zh-tw/consensus-client, execution-client Phonetic transliterations high (match transliteration banks)

Notes

  • The sanitizer placeholder leak in it/glossary.json was the only production-blocking item — it would have rendered literal <HTML-PLACEHOLDER-HTMLTAG-7ff424>...</HTML-PLACEHOLDER-HTMLTAG-7ff424> on the live site. Worth a tracking issue against the sanitizer's HTML preserve/restore step.
  • 91 warnings from the original review remain unaddressed — non-blocking, can ship and follow up.
  • zh/zh-tw client-software brand names use phonetic forms per the canonical and transliteration banks. If project policy ever favors keeping Latin for software brands without an official Chinese rendering, that's a coordinated update to the canonical + bank — not a regression here.

Reviewed by Claude Code (Opus 4.7), canonical verification against ETHGlossary /api/v1/translations/{lang}/{term}

@wackerow wackerow merged commit 14e54f3 into dev May 12, 2026
9 of 10 checks passed
@wackerow wackerow deleted the intl/pending-dev branch May 12, 2026 14:29
@wackerow
Copy link
Copy Markdown
Member Author

Cerberus, block ajarcoronet5415-commits

@cerberus-robot
Copy link
Copy Markdown

⚠️ User ajarcoronet5415-commits was not found in Cerberus database.
ℹ️ Blocking GitHub user @ajarcoronet5415-commits
☑️ User blocked

@pettinarip pettinarip mentioned this pull request May 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

translation 🌍 This is related to our Translation Program

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants