Stroke in 娩 is attributed as 'element' rather than 'type'; stroke in 禺 and its descendants declared as wrong type #199
Comments
On Mon, 28 Feb 2022 at 10:21, Valentín Peña-Donaire < ***@***.***> wrote:
I've downloaded the database kanjivg-20160426-main.zip
<https://github.com/KanjiVG/kanjivg/releases/download/r20160426/kanjivg-20160426-main.zip>
and dived deep into the XML hierarchy of grouped SVG paths that represent
radicals, sub-radicals, etc.
In almost all kanji, the bottom XML level is a single stroke and carries
the attribute 'type' but not 'element', as described in the documentation
<https://kanjivg.tagaini.net/format.html>, whereas middle-level groups
carry the attribute 'element' but not 'type'. This is not true for 娩
(U+5A29), where the second-to-last stroke is reported as the element '丿'
(CJK unified ideograph U+4E3F) rather than as the type '㇒' (U+31D2, as is
done in the entry for kanji 免 (U+514D)) or '㇓' (the CJK stroke U+31D3,
which is not included). AFAIK, this is the only kanji in
kanjivg-20160426-main.zip where this happens.
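The rule described above (strokes carry 'type', groups carry 'element') can be checked mechanically. The following is only a minimal sketch: the function name `find_attribute_violations` and the inline sample are hypothetical, and it assumes that strokes are SVG `path` elements, groups are `g` elements, and the `kvg` prefix is bound to the namespace URI `http://kanjivg.tagaini.net`.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI bound to the "kvg" prefix in KanjiVG files.
KVG_NS = "http://kanjivg.tagaini.net"

def find_attribute_violations(svg_text):
    """Return (id, problem) pairs for nodes that break the type/element rule."""
    root = ET.fromstring(svg_text)
    problems = []
    for node in root.iter():
        tag = node.tag.split("}")[-1]  # drop the "{namespace}" prefix
        has_type = f"{{{KVG_NS}}}type" in node.attrib
        has_element = f"{{{KVG_NS}}}element" in node.attrib
        if tag == "path":
            # A stroke should have kvg:type and no kvg:element.
            if has_element:
                problems.append((node.get("id"), "stroke has kvg:element"))
            if not has_type:
                problems.append((node.get("id"), "stroke lacks kvg:type"))
        elif tag == "g" and has_type:
            # A group should not have kvg:type.
            problems.append((node.get("id"), "group has kvg:type"))
    return problems

# Hypothetical two-stroke sample imitating the situation reported for 娩.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:kvg="http://kanjivg.tagaini.net">
  <g id="g1" kvg:element="免">
    <path id="s1" kvg:type="㇒" d="M0,0"/>
    <path id="s2" kvg:element="丿" d="M0,0"/>
  </g>
</svg>"""

for node_id, problem in find_attribute_violations(SAMPLE):
    print(node_id, problem)
```

Running the same function over every file in the archive would be one way to confirm whether 娩 really is the only kanji affected.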
According to IDS (https://github.com/cjkvi/cjkvi-ids), the correct
decomposition of the right side is
U+514D 免 ⿱𠂊⑤[GTK] ⿳𠂊𫩏儿[J]
where the Japanese form usually uses the "leg" part (儿). This is in the
GitHub repo of KanjiVG as 05a29-Hyougai.svg.
But this seems to be an error in KanjiVG, where the leg element should have
been used.
Also, the 'type' attribute of the eighth stroke of 禺 (U+79BA) is set to
'6' (U+FF16). I'm no expert in Japanese calligraphy, but I would say it
looks similar to the eighth stroke of 球 (U+7403), i.e. '㇀' (U+31C0).
Moreover, both type '6' and '㇀' differ from the type '㇐' (U+31D0) that
appears in radical 114, viz. 禸 (U+79B8). This problem is passed down to
some kanji in which 禺 acts as a group, viz. 勵 (U+52F5), 寓 (U+5BD3),
嵎 (U+5D4E), 癘 (U+7658), 礪 (U+792A), 糲 (U+7CF2), 萬 (U+842C),
藕 (U+85D5), 蠣 (U+8823), and 邁 (U+9081).
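To inspect the affected entries, each kanji has to be mapped to its data file. File names such as 05a29-Hyougai.svg suggest that the name is the code point as five lowercase hex digits, zero-padded, with an optional variant suffix. The sketch below assumes that convention; the helper name `kanjivg_filename` is hypothetical.

```python
def kanjivg_filename(kanji, variant=None):
    """Build a file name from a single character.

    Assumption: names are the zero-padded, five-digit lowercase hex
    code point, optionally followed by "-<variant>", then ".svg".
    """
    stem = f"{ord(kanji):05x}"
    if variant:
        stem += f"-{variant}"
    return stem + ".svg"

# Kanji named in this report: 禺 and the characters that contain it.
AFFECTED = "禺勵寓嵎癘礪糲萬藕蠣邁"

for kanji in AFFECTED:
    print(kanji, f"U+{ord(kanji):04X}", kanjivg_filename(kanji))
```

Printing the code point next to the character also gives a quick check that each U+ value quoted in a report matches the kanji it is attached to.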
I'm currently examining this and I will respond tomorrow.
From grepping the source files, kvg:type="6" only seems to be used in 禺 and characters derived from it, so presumably it's a "special case". The nature and origin of this special case is not documented in any repository file, but since it's only used here, it doesn't seem to be a particular problem. Incidentally I could not find other numbers used in this way.
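A search of this kind can be sketched in a few lines. This is not the command that was actually run; it is a hypothetical equivalent that collects every kvg:type value whose first character lies outside the Unicode CJK Strokes block (U+31C0 to U+31EF). The function names and the directory argument are assumptions.

```python
import re
from pathlib import Path

# Matches kvg:type="..." as it appears in the raw SVG text.
TYPE_ATTR = re.compile(r'kvg:type="([^"]*)"')

def unusual_stroke_types(svg_text):
    """Return sorted type values that do not start with a CJK stroke character."""
    return sorted({
        value
        for value in TYPE_ATTR.findall(svg_text)
        if not value or not (0x31C0 <= ord(value[0]) <= 0x31EF)
    })

def scan_directory(directory):
    """Map file name to unusual type values for every .svg file in a directory."""
    report = {}
    for path in sorted(Path(directory).glob("*.svg")):
        found = unusual_stroke_types(path.read_text(encoding="utf-8"))
        if found:
            report[path.name] = found
    return report

# Hypothetical fragment: two ordinary stroke types and one digit.
SAMPLE = '<path kvg:type="㇐"/><path kvg:type="㇀"/><path kvg:type="6"/>'
print(unusual_stroke_types(SAMPLE))
```

Because the test looks only at the first character, a type value made of a stroke character plus a suffix would not be reported, while a digit in either its ASCII or full-width form would be.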
This particular issue seems to have been fixed in which was a response to To save this bug from being reported again, another release of the data files should be made. I don't know how to do that yet but I'll look into it.
I've made a new release of the files, so I think this can be closed now. Please open it again if not satisfied. I've also added a note about the 6 into the documentation pages, but hopefully that can be replaced with a better stroke type soon.