i18n: translation pipeline (all languages) #18188
Co-Authored-By: Gemini <gemini@google.com>
✅ Deploy Preview for ethereumorg ready!
🌐 Translation review started. View progress |
Translation Quality Review
PR: 18188
Branch HEAD: 2a67b375ad
Languages: ar, bn, cs, de, es, fr, hi, id, it, ja, ko, mr, pl, pt-br, ru, sw, ta, te, tr, uk, ur, vi, zh, zh-tw (24 languages)
Files reviewed: 24 (one file per language: developers/tutorials/clear-signing/index.md)
Date: 2026-05-13
Fixes: No fixes applied (review-only)
No prior LLM review exists on this PR, so the full PR diff was reviewed. Each language was assessed by a parallel sub-agent against the file's English source. The ETHGlossary API endpoint was unreachable from the CI sandbox for all 24 agents, so terminology was verified manually against .claude/translation-review/known-patterns.md and standard per-language Ethereum vocabulary instead of a live glossary lookup.
(NOTE: Two earlier reviews on this PR labeled "Test review" and "Test via gh api" were accidental submissions from this same CI run while debugging body-submission plumbing — please disregard them; this is the canonical review.)
Summary
| Language | Files | Quality Score | Issues |
|---|---|---|---|
| ar | 1 | 8.8/10 | 0 critical, 6 warnings |
| bn | 1 | 8.8/10 | 0 critical, 4 warnings |
| cs | 1 | 9.8/10 | 0 critical, 1 warning |
| de | 1 | 9.5/10 | 0 critical, 7 warnings |
| es | 1 | 9.7/10 | 0 critical, 4 warnings |
| fr | 1 | 9.8/10 | 0 critical, 0 warnings |
| hi | 1 | 8.4/10 | 0 critical, 6 warnings |
| id | 1 | 9.8/10 | 0 critical, 2 warnings |
| it | 1 | 9.7/10 | 0 critical, 3 warnings |
| ja | 1 | 8.8/10 | 0 critical, 9 warnings |
| ko | 1 | 8.2/10 | 1 critical, 7 warnings |
| mr | 1 | 9.2/10 | 0 critical, 3 warnings |
| pl | 1 | 9.6/10 | 0 critical, 3 warnings |
| pt-br | 1 | 9.2/10 | 0 critical, 3 warnings |
| ru | 1 | 8.8/10 | 0 critical, 5 warnings |
| sw | 1 | 8.8/10 | 0 critical, 5 warnings |
| ta | 1 | 8.4/10 | 0 critical, 4 warnings |
| te | 1 | 8.9/10 | 0 critical, 3 warnings |
| tr | 1 | 9.6/10 | 0 critical, 2 warnings |
| uk | 1 | 9.4/10 | 0 critical, 5 warnings |
| ur | 1 | 9.2/10 | 0 critical, 3 warnings |
| vi | 1 | 9.6/10 | 0 critical, 3 warnings |
| zh | 1 | 8.4/10 | 0 critical, 6 warnings |
| zh-tw | 1 | 9.2/10 | 0 critical, 4 warnings |
Overall: 1 critical issue, ~94 warnings across 24 files. Structurally the import is clean — MDX syntax, code blocks, JSON keys/values, function signatures, anchor IDs, and the internal /foundation/ href are preserved correctly across all 24 languages. Known failure modes (Igbo contamination, Solidity → "الصلابة", GitHub → "يجتبه", state → nation-state, MEV → vehicles, transliterated .org domains) are all avoided.
Critical issue (must fix before merge)
| Language | File | Line | Issue | Current | Expected |
|---|---|---|---|---|---|
| ko | public/content/translations/ko/developers/tutorials/clear-signing/index.md | 267 | Truncated word: "Registry maintainers" rendered with truncated "레지스트" instead of "레지스트리" (registry) | 레지스트 유지 관리자 | 레지스트리 유지 관리자 |
Common warning patterns (multi-language)
These themes recurred across many languages and are worth a single editorial pass rather than per-language tickets:
- Inconsistent brand-name transliteration across non-Latin scripts (ar, bn, hi, ja, ko, ru, ta, te, uk, zh, zh-tw): "Ethereum" / "Uniswap" / "Ethereum Foundation" transliterated into the target script, while "Solidity" / "Python" / "GitHub" / "Sourcify" remain in Latin. Pick one convention per locale.
- Phonetic "Uniswap" transliteration in zh/zh-tw: "尤尼斯瓦普" is non-standard in Chinese Ethereum content; "Uniswap" is conventionally left in Latin.
- Author name transliteration (hi, ja, ko, mr, ur, zh, zh-tw): "Hester Bruikman" transliterated into the target script. Author bylines are usually kept in original Latin.
- "Ethereum" spelled two different ways within the same document (hi: इथेरियम / एथेरियम; ru: Эфириум / Ethereum; ta: எத்திரியம் / எத்தீரியம்; uk: Етеріум / Ethereum). Pick one form per locale.
- Word-choice concerns in a few locales: zh "明文签名" implies plaintext signing rather than clear signing; zh line 179 "代码元数据" mistranslates ticker as "code"; sw "Uainishaji" for specification leans toward "classification"; pt-br conflates "implementation" / "deployment" as "implantação"; te line 12 "compromised" → "రాజీ పడటం" (negotiation-sense).
- Missing trailing newline at EOF in ~6 locales (cosmetic only).
- Frontmatter `tags` arrays translate concept tags (e.g., "security" → "seguridad"). This is the expected behavior per `.claude/translation-review/known-patterns.md` paragraph 10: only brand-name tags (e.g., `ERC-7730`) must stay English. Listed only because some agents flagged it; no action needed.
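
The "pick one form per locale" items above lend themselves to a mechanical pre-check. The following is a minimal sketch, not part of the review pipeline: the function name `find_mixed_spellings` and the `ETHEREUM_VARIANTS` mapping are hypothetical, and the variant spellings are copied from the warning list above. It counts plain substring matches over the whole file, so it does not exclude code fences or URLs, where Latin "Ethereum" may legitimately appear.

```python
import unicodedata

# Variant spellings copied from the warning list above.
# The mapping and its name are illustrative, not a project file.
ETHEREUM_VARIANTS = {
    "hi": ["इथेरियम", "एथेरियम"],
    "ru": ["Эфириум", "Ethereum"],
    "ta": ["எத்திரியம்", "எத்தீரியம்"],
    "uk": ["Етеріум", "Ethereum"],
}


def find_mixed_spellings(text, variants):
    """Return {variant: count} if more than one variant occurs in text, else {}."""
    # Normalize both sides so composed and decomposed forms compare equal.
    normalized = unicodedata.normalize("NFC", text)
    counts = {}
    for variant in variants:
        hits = normalized.count(unicodedata.normalize("NFC", variant))
        if hits:
            counts[variant] = hits
    # A single spelling used consistently is not a finding.
    return counts if len(counts) > 1 else {}
```

An empty result means the file uses at most one of the listed forms; a non-empty result still needs a human to decide which form the locale should standardize on.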
Highlights of clean translations
- fr (9.8/10): No warnings, no critical issues; faithful "vous" register.
- cs (9.8/10): Single soft warning; otherwise pristine.
- id (9.8/10): Helpful bilingual glosses for jargon (calldata, blind signing, fork, pull request).
- es / it (9.7/10): Clean structural integrity, well-localized number formatting.
- vi (9.6/10): The historical vi "untranslated English paragraphs" pattern is not present in this file — major improvement.
Sandbox limitation
Every agent reported that calls to the ETHGlossary API (https://ethglossary.visual-20-hoists.workers.dev/api/v1/filter) were blocked by the CI sandbox even with dangerouslyDisableSandbox: true. Glossary compliance was therefore assessed against known-patterns conventions and the agents' baseline knowledge rather than a live fetch. A local re-run with glossary access could surface additional fine-grained terminology issues. Detailed per-language category breakdowns are posted as a follow-up issue comment.
To apply fixes, run locally:
/review-translations --pr=18188 --fix
Reviewed by Claude Code (CI workflow)
Detailed per-language category breakdowns, part 1/2 (follow-up to the Translation Quality Review)
Detail dump for Translation Quality Review at
ar (8.8/10)
Known ar bugs (Solidity → الصلابة, GitHub → يجتبه, state → الدولة, MEV → vehicles) all avoided. Main note: Solidity / Sourcify / Python / GitHub kept in Latin rather than transliterated into Arabic script; "Uniswap V3" / "Ethereum" correctly transliterated.
bn (8.8/10)
Mixed brand-transliteration policy. Awkward "র (raw)" rendering on lines 12, 27. Western Arabic numerals used (correct).
cs (9.8/10)
Pristine. Helpful English-in-parens glosses for "blind signing", "calldata", "fork", "merge".
de (9.5/10)
Consistent "Sie" register. Minor: "schlimmstenfalls" softens "and worse" (line 12); "Aufrufdaten" for calldata is non-canonical but valid.
es (9.7/10)
Consistent informal "tú". Illustrative wallet display localized ("Intercambio") while JSON examples keep "Swap" English, which is the correct distinction.
fr (9.8/10)
No warnings. "vous" register consistent; idiomatic French throughout. ETHGlossary-aligned: contrat intelligent, portefeuille, signature en clair, descripteur, jeton, données d'appel.
hi (8.4/10)
"Ethereum" inconsistent (इथेरियम / एथेरियम). Solidity / Python / GitHub / Sourcify left Latin while वॉलेट / यूनिस्वैप transliterated. "कारनामों" for "exploits" is mildly soft. Devanagari numerals NOT used (correct).
id (9.8/10)
Formal "Anda" throughout. Bilingual glosses used effectively.
it (9.7/10)
Italian number conventions applied to display blocks (1.000 / 0,42). Code blocks retain English.
ja (8.8/10)
Consistent です/ます polite form. Ethereum / Uniswap correctly transliterated to Katakana; Solidity / Python / GitHub / Sourcify left Latin (inconsistent but common practice in Japanese tech writing).
ko (8.2/10, has the critical fix)
One critical typo on line 267: "레지스트" should be "레지스트리" (registry). Line 25 phrasing awkward; brand-name transliteration partial.
(continued in part 2)
Detailed per-language category breakdowns, part 2/2 (follow-up to the Translation Quality Review)
Continued from part 1. Part 2 covers mr through zh-tw.
mr (9.2/10)
Careful use of LRI/PDI bidi isolates around
pl (9.6/10)
"GitHubie": Polish declensional suffix on English root, idiomatic. Modern Polish orthography ("projekt").
pt-br (9.2/10)
Lines 87, 105: "implementation" rendered as "implantação" (deployment). Meaning is recoverable from context but should be "implementação".
ru (8.8/10)
"Ethereum" appears as Cyrillic "Эфириум" on line 12 and Latin "Ethereum" on line 259. "Юнисвоп" for Uniswap is unconventional. Russian thousands separator (1 000) not applied in example blocks.
sw (8.8/10)
"Ethereum Foundation" → "Taasisi ya Ethereum" partially translated. "Uainishaji" for specification leans toward "classification".
ta (8.4/10)
"Ethereum" inconsistent (எத்திரியம் / எத்தீரியம்). Unusual "வில்லை" for "token"; "டோக்கன்" is more common in Tamil crypto content.
te (8.9/10)
Line 12: "compromised" → "రాజీ పడటం" (negotiation-sense, not security-sense). Line 168: "richer" → "ధనిక" (wealthy-sense).
tr (9.6/10)
Historical Turkish failure modes (Solidity → katillik, mainnet → Markette, ETH → EHT, BLS → BSL) all absent. Turkish number formatting (1.000 / 0,42) correctly applied.
uk (9.4/10)
Modern Ukrainian orthography ("проєкт"); no Russianisms detected. "Ethereum" mixed Cyrillic / Latin.
ur (9.2/10)
vi (9.6/10)
Historical "untranslated English paragraphs" pattern not present here. Line 289: stray capital "Quản trị" mid-sentence.
zh (8.4/10)
Line 179: "ticker metadata" rendered as "代码元数据" (code metadata), a semantic miss; should be "代号" / "代币代号". Line 21: "尤尼斯瓦普 (Uniswap)" phonetic transliteration is non-standard. "明文签名" for "clear signing" implies "plaintext signing"; consider "清晰签名".
zh-tw (9.2/10)
No Simplified-Chinese-only characters detected. Polite 您 register consistent. Same phonetic "Uniswap" transliteration concern as zh.
Detail dumps posted as issue comments (not PR reviews) so they do not carry their own
Hand-fix four translation errors caught in /review-translations on PR #18188. English source unchanged so the pipeline manifest mapping stays valid (per project CLAUDE.md guidance).
- ko line 267: add missing syllable in registry maintainers
- pt-br lines 87, 105: implementacao vs implantacao on proxy implementation address/name fields
- sw line 291: trilioni spelling correction
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
myelinated-wackerow left a comment
Translation Quality Review
PR: #18188
Branch HEAD: 69e1a0ecc2
Languages: ar, bn, cs, de, es, fr, hi, id, it, ja, ko, mr, pl, pt-br, ru, sw, ta, te, tr, uk, ur, vi, zh, zh-tw (24)
File reviewed: public/content/developers/tutorials/clear-signing/index.md (ERC-7730 tutorial, 292 lines)
Date: 2026-05-13
Fixes: Critical fixes applied: 4 (resolved on this branch in commit 69e1a0ecc2)
Summary
All 24 translations pass the mechanical checks: 14 heading IDs preserved character-for-character, internal href /foundation/ and all external hrefs untouched, every JSON code-fence body unchanged, all tickers (ERC-7730, ERC-20, ERC-8176, ETH, USDC, WETH) preserved, MDX <Alert> blocks intact, Western Arabic numerals throughout the Indic / Tamil / Telugu / Urdu files, no transliterated domain names, no cross-script contamination. Brand handling follows the documented script-aware policy. Live ETHGlossary lookup succeeded for all 24 languages (30 matched terms per file).
Critical fixes applied in this review (commit 69e1a0ecc2)
| Lang | Location | Before | After | Reason |
|---|---|---|---|---|
| ko | line 267 | 레지스트 유지 관리자 | 레지스트리 유지 관리자 | Missing 리 syllable; should be "registry maintainers" |
| pt-br | line 87 | endereço de implantação | endereço de implementação | Proxy-pattern semantic: implementation address (logic contract) ≠ deployment address (proxy) |
| pt-br | line 105 | nome do contrato ou da implantação | nome do contrato ou da implementação | Same as above on contractName |
| sw | line 291 | Trillioni ya Dola | Trilioni ya Dola | Swahili numeral spelling (single L) |
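
A reviewer who wants to confirm that the four fixes landed could run a check like the one below. This is a hedged sketch rather than anything the pipeline ships: `FIXES`, `fix_status`, and `check_fixes` are hypothetical names, the before/after strings and line numbers are copied from the table above, and the caller is assumed to supply each locale's file content as a string.

```python
# (locale, 1-based line number, before, after), copied from the fixes table.
FIXES = [
    ("ko", 267, "레지스트 유지 관리자", "레지스트리 유지 관리자"),
    ("pt-br", 87, "endereço de implantação", "endereço de implementação"),
    ("pt-br", 105, "nome do contrato ou da implantação",
     "nome do contrato ou da implementação"),
    ("sw", 291, "Trillioni ya Dola", "Trilioni ya Dola"),
]


def fix_status(line_text, before, after):
    """Classify one line as 'fixed', 'unfixed', or 'unknown'."""
    if after in line_text:
        return "fixed"
    if before in line_text:
        return "unfixed"
    # Neither string found: the line may have moved or been reworded.
    return "unknown"


def check_fixes(files, fixes=FIXES):
    """files maps locale -> full file text. Returns {(locale, line): status}."""
    results = {}
    for locale, line_no, before, after in fixes:
        lines = files.get(locale, "").splitlines()
        text = lines[line_no - 1] if line_no <= len(lines) else ""
        results[(locale, line_no)] = fix_status(text, before, after)
    return results
```

An "unknown" status is deliberately separate from "unfixed": line numbers drift between pipeline runs, so a missing match is a prompt to look at the file, not proof of a regression.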
Scores
| Language | Score | Critical (post-fix) | Warnings |
|---|---|---|---|
| ar | 9.6/10 | 0 | 5 |
| bn | 9.0/10 | 0 | 6 |
| cs | 9.2/10 | 0 | 10 |
| de | 9.2/10 | 0 | several minor |
| es | 9.6/10 | 0 | 5 |
| fr | 9.2/10 | 0 | 5 |
| hi | 9.2/10 | 0 | 4 |
| id | 9.2/10 | 0 | 12 |
| it | 9.8/10 | 0 | 7 |
| ja | 9.4/10 | 0 | 7 |
| ko | 8.8/10 | 0 (fixed) | 5 |
| mr | 8.6/10 | 0 | 11 |
| pl | 9.0/10 | 0 (see below) | 6 |
| pt-br | 8.2/10 | 0 (fixed) | 8 |
| ru | 9.6/10 | 0 | 5 |
| sw | 8.0/10 | 0 (fixed) | many |
| ta | 8.6/10 | 0 | 6 |
| te | 8.8/10 | 0 | 7 |
| tr | 8.6/10 | 0 | 6 |
| uk | 9.8/10 | 0 | 5 |
| ur | 9.4/10 | 0 | 5 |
| vi | 8.8/10 | 0 | 7 |
| zh | 9.0/10 | 0 | 5 |
| zh-tw | 8.8/10 | 0 | several |
Items flagged for translator follow-up (not auto-fixed)
These need native-speaker judgment and are not safe to apply mechanically. Logging for the next pipeline pass.
- pl line 12: `skompromitowana` for "compromised". Polish IT security texts do use this as a calque, but the lay meaning is "discredited/disgraced". On a security warning, the connotation matters. Consider `zhakowana`, `naruszona`, or `przejęta`.
- zh / tr / vi / sw line 179: "ticker metadata" mistranslated across four languages: zh `代码元数据` (= code), tr `borsa sembolü` (= stock-exchange symbol), vi `mã giao dịch` (= transaction code), sw `tiki`. In crypto, "ticker" is the token symbol.
- sw / es / ko / vi / zh-tw line 167: example value translated ("Badilishano", "Intercambio", "스왑", etc.) while the JSON below keeps `"Swap"` literal. Defensible as a translated descriptive example but creates a visible mismatch.
- zh-tw line 284: `密碼學證明` for "cryptographic attestation". Glossary-compliant but collides with the term for cryptographic proofs (ZK-proof sense). Consider `密碼學證言`.
- Multiple languages line 168: "provided any display constraints" shows slight semantic drift across ja/ta/tr/uk/ur/vi/es. The English source is itself ambiguous; not a clear bug.
- Frontmatter tag translation: non-brand tags (`security`, `signing`, `smart contracts`, `wallets`) translated in most non-English files; `ERC-7730` brand tag correctly stays Latin in all 24. Project-policy question for the tag system, not a translation bug.
Mechanical checks (all 24 languages pass)
- 14/14 heading IDs preserved character-for-character (incl. the EN-source typo-perfect `displayformats-section`)
- Internal href `/foundation/` preserved
- All external hrefs unchanged
- All JSON code-fence bodies unchanged (identifiers, keys, string literals, schema paths, addresses)
- All tickers preserved (ERC-7730, ERC-20, ERC-8176, ETH, USDC, WETH, EVM, ABI)
- MDX `<Alert>` / `<AlertContent>` / `<AlertDescription>` blocks intact
- Brand tag `ERC-7730` stays Latin in all 24 frontmatters
- Western Arabic numerals throughout Devanagari / Tamil / Telugu / Urdu
- No transliterated domain names (no IDN homograph risk)
- No cross-script contamination
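
Three of the checks above (heading IDs, code-fence bodies, Western Arabic numerals) can be approximated with a short script. The sketch below is illustrative only and is not the reviewer's actual tooling: it assumes heading IDs are written as `{#some-id}` at the end of a markdown heading line, that code fences use three backticks at the start of a line, and all function names are hypothetical.

```python
import re
import unicodedata

FENCE = "`" * 3  # built at runtime so this example nests cleanly in markdown

# Assumption: custom heading IDs look like "## Title {#some-id}".
HEADING_ID_RE = re.compile(r"^#{1,6}\s.*\{#([^}]+)\}\s*$", re.MULTILINE)
FENCE_RE = re.compile(
    rf"^{FENCE}[^\n]*\n(.*?)^{FENCE}\s*$", re.MULTILINE | re.DOTALL
)


def heading_ids(markdown):
    """Heading IDs in document order."""
    return HEADING_ID_RE.findall(markdown)


def fence_bodies(markdown):
    """Bodies of fenced code blocks, in document order."""
    return FENCE_RE.findall(markdown)


def non_ascii_digits(text):
    """Decimal digits from non-Latin scripts (Devanagari, Tamil, and so on)."""
    return [ch for ch in text
            if unicodedata.category(ch) == "Nd" and not ch.isascii()]


def mechanical_check(source, translation):
    """Return a list of problems; an empty list means these checks pass."""
    problems = []
    if heading_ids(source) != heading_ids(translation):
        problems.append("heading IDs differ")
    if fence_bodies(source) != fence_bodies(translation):
        problems.append("code-fence bodies differ")
    if non_ascii_digits(translation):
        problems.append("non-Western digits present")
    return problems
```

Comparing ordered lists rather than sets means a reordered or dropped heading is reported as well as a renamed one. The href, ticker, and MDX checks from the list are not covered here.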
Reviewed by Claude Code (Opus 4.7) using parallel per-language sub-agents against the ETHGlossary API and project known-patterns reference.

Automated Translations
This PR contains translations managed by the intl pipeline.
Each run appends a summary below.
Run: 2026-05-13 15:38:19 UTC