Stroke order issue in 籤, 齏, 殲 #190

Closed
SlugFiller opened this issue Nov 21, 2020 · 21 comments
Labels: Grouping (how individual strokes are grouped), Stroke order (the order of strokes in the character)

Comments

@SlugFiller

SlugFiller commented Nov 21, 2020

The second stroke of the left side of the 非 radical in 籤 is actually the first stroke of the right side. That is, the 13th stroke should be the 16th stroke, with strokes 14-16 shifted back accordingly. The tags for the strokes (kvg:type) are actually correct; only the paths are incorrect.
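
As a rough illustration of the reordering described above (a sketch only; the labels `s1` to `s16` are hypothetical placeholders for the first sixteen stroke paths, not data taken from the actual file), moving the 13th path to the 16th position shifts strokes 14-16 back by one:

```python
def move_stroke(strokes, from_pos, to_pos):
    """Return a copy of `strokes` with one stroke moved (positions are 1-based)."""
    reordered = list(strokes)
    stroke = reordered.pop(from_pos - 1)
    reordered.insert(to_pos - 1, stroke)
    return reordered


# Hypothetical placeholder labels for the first 16 stroke paths.
current = [f"s{n}" for n in range(1, 17)]
proposed = move_stroke(current, 13, 16)

# Show which paths now occupy positions 13-16.
print(proposed[12:])
```

The first twelve paths are untouched; only the tail of the list changes.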

@SlugFiller
Author

Same issue in 殲 as well

@benkasminbullock
Member

It doesn't contain 非 but 韭.

Zdic.net has the same order as you for this component:

https://www.zdic.net/hans/%E9%9F%AD

It doesn't have a stroke order diagram for the larger component:

https://www.zdic.net/hans/%E9%9F%AF

But the following web site gives the same stroke order as kanjivg:

https://kakijun.jp/page/kuji23200.html
https://kakijun.jp/page/sen21200.html

Similarly here:

https://kaku-navi.com/kanji/kanji07603.html

and here:

https://kanji.quus.net/kakijyun/4601.htm

Can you give a source for your stroke order information?

@SlugFiller
Author

@benkasminbullock
The issue in KanjiVG isn't merely the stroke order, but the grouping as well. KanjiVG groups the strokes as 4-4-1, which matches the stroke order shown on zdic, but draws them in the stroke order from kaku-navi. As a result, the first group contains the 2 lines in the center and 2 of the 3 lines on the left, which looks clearly broken. If one takes the kaku-navi order as correct, the grouping would have to be 2-3-3-1.

Also, KanjiVG itself marks 韭 as being composed of 非 followed by 一 as part of its group radical metadata. If the stroke order is different, then this metadata would need to be altered, as it would be incorrect.
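
To make the grouping mismatch concrete, here is a small sketch. The stroke labels are a simplification of the description above (two center lines first, then the three left lines, the three right lines, and the bottom stroke, as in the kaku-navi order), and `split_into_groups` is a hypothetical helper, not part of KanjiVG.

```python
def split_into_groups(strokes, sizes):
    """Split an ordered stroke list into consecutive groups of the given sizes."""
    if sum(sizes) != len(strokes):
        raise ValueError("group sizes must add up to the number of strokes")
    groups, start = [], 0
    for size in sizes:
        groups.append(strokes[start:start + size])
        start += size
    return groups


# Assumed kaku-navi-style order for 韭, using descriptive placeholder labels.
kaku_navi_order = [
    "center", "center",
    "left", "left", "left",
    "right", "right", "right",
    "bottom",
]

# 4-4-1 puts two center strokes and two left strokes into the first group.
print(split_into_groups(kaku_navi_order, [4, 4, 1]))
# 2-3-3-1 keeps each visual part in its own group.
print(split_into_groups(kaku_navi_order, [2, 3, 3, 1]))
```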

@benkasminbullock
Member

OK, but there is nobody here to do the work except us, so you need to find out the correct stroke order and regroup the SVG elements yourself.

@SlugFiller
Author

@benkasminbullock
https://en.wikipedia.org/wiki/Radical_179 gives the stroke order for 韭 as 非 followed by 一. However, it's unclear what source they base the stroke order on, as the citations don't specify.

As different sources give different answers, the question becomes "Which source is sufficiently authoritative?" The problem is exacerbated by the fact that the radical is not Jouyou. With no source acting as "word of god", I would even be willing to posit that both answers might be correct.

Unless there's a specific reason to consider it as "wrong", I think 韭 should have its stroke order adjusted to 非 followed by 一, and all kanjis containing it as a radical/component, likewise.

@benkasminbullock
Member

benkasminbullock commented Apr 18, 2021 via email

@SlugFiller
Author

@benkasminbullock I'm not sure what you mean. This is a GitHub, not a wiki, and I'm not a Member. I can't just straight-up edit the SVGs.

@benkasminbullock
Member

You can edit these images without leaving GitHub. Click the pencil icon to fork:

[screenshot: edit-github]

GitHub will make a new image for you:

[screenshot: editing-github]

When you have finished editing, click "Pull request".

@benkasminbullock added the "Stroke order" label on Mar 25, 2022
@benkasminbullock
Member

benkasminbullock commented Mar 25, 2022

殱 is another one where this element is in a mess. Strokes 8 and 9 are grouped and labelled with kvg:type as if across strokes but are actually the two downward verticals.

Also 齏, 薤

@benkasminbullock added the "Grouping" label on Mar 25, 2022
@benkasminbullock changed the title from "Stroke order issue in 籤" to "Stroke order issue in 籤, 齏, 殲" on Mar 25, 2022
@benkasminbullock
Member

A complete list of non-variant files affected by this so far:

kanji/05b45.svg
kanji/061f4.svg
kanji/06bb1.svg
kanji/085a4.svg
kanji/097ee.svg
kanji/097f2.svg
kanji/09f4f.svg
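
The file names appear to be each character's Unicode code point written as five lowercase hexadecimal digits. Assuming that convention holds (it is inferred from the names listed here, not from any documentation quoted in this thread), a sketch for converting in both directions might look like this; the helper names are hypothetical:

```python
from pathlib import PurePosixPath


def char_to_filename(char):
    """Build a KanjiVG-style path from a single character (assumed convention)."""
    return f"kanji/{ord(char):05x}.svg"


def filename_to_char(filename):
    """Recover the character from a path such as 'kanji/097ee.svg'."""
    return chr(int(PurePosixPath(filename).stem, 16))


affected = [
    "kanji/05b45.svg",
    "kanji/061f4.svg",
    "kanji/06bb1.svg",
    "kanji/085a4.svg",
    "kanji/097ee.svg",
    "kanji/097f2.svg",
    "kanji/09f4f.svg",
]

# Print each affected file next to the character it encodes.
for name in affected:
    print(name, filename_to_char(name))
```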

@mifunetoshiro

I think the order for the following kanji should be as per kakijun.jp:
https://kakijun.jp/page/hi200.html
https://kakijun.jp/page/nira09200.html

非 (罪, 俳, 排, 輩, 匪, 悲, 扉, 斐, 緋, 誹, 徘, 暃, 榧, 琲, 翡, 腓, 菲, 蜚, 裴, 霏, 靠, 鯡, 靡)
韭 (韮, 孅, 懺, 懴, 殲, 殱, 籤, 籖, 纖, 纎, 薤, 讖, 齏, 韲)

The etymology of 非 and 韭 is different and that's why the stroke order differs, even though writing 韭 as 非 followed by 一 would also make sense I guess.

https://hanziyuan.net/#%E9%9D%9E
https://hanziyuan.net/#%E9%9F%AD

@benkasminbullock
Member

@ospalh did one of the stroke direction issues here:
067197b

But the grouping was not changed in that commit.

There are several more, all with the same pattern of grouping into left and right, but with the second stroke of the "left" group actually being the vertical of the "right" group, and also the wrong stroke "type" on strokes 1, 2 and 5 of the structure. I'm not sure what to do with the two groups except just delete them, since they are not left and right. The bottom group for the horizontal can be kept, for what it's worth.

All of these broken groups seem to have been added automatically by a script at some point, but it's not in the git history. If only the person who did this had kept the script so that the errors could be fixed at the source.

@SlugFiller
Author

If the order is kept as-is, the grouping becomes questionable. Logical options would be 2-3-3-1, (2-3-3)-1, 2-(3-3)-1, (2-(3-3))-1, 8-1, 2-6-1, (2-6)-1, or 9.
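
Each of these options can be written as a nested list of stroke counts, with parentheses becoming sublists. The sketch below (the representation and the `count_strokes` helper are assumptions made for illustration) checks that every option accounts for all nine strokes:

```python
def count_strokes(grouping):
    """Total strokes in a grouping given as an int or a nested list of ints."""
    if isinstance(grouping, int):
        return grouping
    return sum(count_strokes(part) for part in grouping)


# The grouping options from the comment above, parentheses written as sublists.
options = {
    "2-3-3-1": [2, 3, 3, 1],
    "(2-3-3)-1": [[2, 3, 3], 1],
    "2-(3-3)-1": [2, [3, 3], 1],
    "(2-(3-3))-1": [[2, [3, 3]], 1],
    "8-1": [8, 1],
    "2-6-1": [2, 6, 1],
    "(2-6)-1": [[2, 6], 1],
    "9": [9],
}

for name, grouping in options.items():
    print(name, count_strokes(grouping))
```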

@benkasminbullock
Member

The existing stroke order seems to be correct, according to the information I can find.

I've kept the single group at the end but ungrouped the eight strokes above into a flat structure, so 8-1 in your terminology. I didn't really want to make more groups out of the remaining strokes, since there is not actually an obvious substructure, and the existing extraneous subgroups in KanjiVG already tend to make the files quite complicated, for a benefit that is very difficult to identify. If one were to use a 2-3-3 structure, then the element value would be something like * 三 三, I suppose.
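
The ungrouping described here could be sketched roughly as follows. This is not the script actually used: the sample markup is a hand-written, simplified stand-in for a KanjiVG group (no namespaces, placeholder ids, paths without a `d` attribute), and `flatten_group` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in: a "top" group split into two subgroups, plus a "bottom" group.
SAMPLE = """
<g id="whole">
  <g id="top">
    <g id="left"><path id="s1"/><path id="s2"/><path id="s3"/><path id="s4"/></g>
    <g id="right"><path id="s5"/><path id="s6"/><path id="s7"/><path id="s8"/></g>
  </g>
  <g id="bottom"><path id="s9"/></g>
</g>
"""


def flatten_group(group):
    """Replace nested subgroups of `group` with their paths, in document order."""
    paths = list(group.iter("path"))
    for child in list(group):
        group.remove(child)
    group.extend(paths)
    return group


root = ET.fromstring(SAMPLE)
top = root.find("g[@id='top']")
flatten_group(top)

# The top group now holds eight paths directly; the bottom group is untouched.
print([child.get("id") for child in top])
print([child.get("id") for child in root])
```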

@benkasminbullock
Member

Thank you for the bug report @SlugFiller. I'll close this now, but please reopen if there is anything else, or alternatively start a new issue if that seems like a better fit.

@SlugFiller
Author

I'm noticing the element tag 非 is still present for the 8 group in the corrected version. I'm wondering if it's appropriate, given that it is not the same radical. Also noticing the lines on the left went from ㇐c, ㇐c, ㇀ to ㇐c, ㇀, ㇐. It's a mild nitpick, but some dictionaries use this data to facilitate kanji-radical search.

@SlugFiller
Author

Also noticed the commit doesn't contain 097ed, which is the root of the issue.

@benkasminbullock
Member

> I'm noticing the element tag 非 is still present for the 8 group in the corrected version. I'm wondering if it's appropriate, given that it is not the same radical.

kvg:element is used for the visual appearance so it's OK I think.

> Also noticing the lines on the left went from ㇐c, ㇐c, ㇀ to ㇐c, ㇀, ㇐. It's a mild nitpick, but some dictionaries use this data to facilitate kanji-radical search.

Which dictionaries use that data?

@benkasminbullock
Member

> Also noticed the commit doesn't contain 097ed, which is the root of the issue.

You are good at spotting stuff, I hope you contribute more bug reports here.

That was due to a bug in my correction script: when searching for things to correct, it looked only at an element's children and not at the element itself.
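
The kind of bug described here can be illustrated with a short sketch. It is not the actual correction script; the markup is a simplified stand-in that uses a plain `element` attribute in place of kvg:element, and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the root file: the outermost group is 韭 itself.
SAMPLE = '<g element="韭"><g element="非"/><g element="一"/></g>'


def find_in_descendants_only(elem, target):
    """Buggy variant: skips `elem` itself and checks only what is below it."""
    return [e for e in elem.iter() if e is not elem and e.get("element") == target]


def find_including_self(elem, target):
    """Fixed variant: Element.iter() yields `elem` first, then its descendants."""
    return [e for e in elem.iter() if e.get("element") == target]


root = ET.fromstring(SAMPLE)

# Number of matches for 韭 found by each variant.
print(len(find_in_descendants_only(root, "韭")))
print(len(find_including_self(root, "韭")))
```

When the component being searched for is the whole character, only the second variant finds it, which would explain why the root file was skipped.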

@SlugFiller
Author

> Which dictionaries use that data?

At the very least jisho.org mentions KanjiVG by name, but a check of the radical search results for 非 suggests it doesn't use the radical data, only the strokes.

> kvg:element is used for the visual appearance so it's OK I think.

Is it? I was under the impression it, along with kvg:original, indicates actual radical association, or at least indicates an identical stroke order.

> You are good at spotting stuff, I hope you contribute more bug reports here.

Only for a while longer. I'm in the process of creating a new kanji visual indexing system that's more intuitive and unambiguous than the likes of SKIP or Four-corner, and has less of a learning curve than the likes of Cangjie. The obvious downside is that the process is mostly manual, as I have to "index" one kanji at a time, and occasionally go back and re-index previous ones if I see a flaw in the method (e.g. having to go back and separate 雷 and 冨 from 雨 and 同 due to 冖 being separate from 冂). Having accurate KanjiVG data is invaluable for creating consistent indices.

I've already indexed ~5500 kanjis of my initial goal of ~6000, so any other reports would most likely either be during the verification pass, or if I decide to expand the target list.

@benkasminbullock
Member

benkasminbullock commented Apr 7, 2022

> > Which dictionaries use that data?
>
> At the very least jisho.org mentions KanjiVG by name, but a check of the radical search results for 非 suggests it doesn't use the radical data, only the strokes.

It's very unlikely that anyone is using kvg:type to do anything at the moment, otherwise the bug reports on this repository would probably number in the thousands. We actually do not have any documentation on what the c or v fields mean, and we only have guesses as to what a and b mean. There are any number of errors in the kvg:type field resulting from some kind of blanket application of a script to generate the files, before it became a git repository. This blanket application by script was done often without any consideration of whether the groups were correct. Yesterday I fixed about 180 cases where almost every kvg:type was wrong due to a repeated error where 糸 was given as five strokes rather than six.
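
A consistency check along these lines might catch that class of error. This is only a sketch: the expected count for 糸 comes from the comment above, while the markup, the plain `element` attribute (standing in for kvg:element), and the `find_miscounted_groups` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Expected number of strokes per component; only 糸 is taken from the thread.
EXPECTED_STROKES = {"糸": 6}

# Simplified stand-in markup: one correct 糸 group and one with a stroke missing.
SAMPLE = """
<g id="whole">
  <g id="ok" element="糸">
    <path/><path/><path/><path/><path/><path/>
  </g>
  <g id="bad" element="糸">
    <path/><path/><path/><path/><path/>
  </g>
</g>
"""


def find_miscounted_groups(root, expected=EXPECTED_STROKES):
    """Return (id, element, actual count) for groups whose path count is off."""
    problems = []
    for group in root.iter("g"):
        element = group.get("element")
        if element in expected:
            actual = sum(1 for _ in group.iter("path"))
            if actual != expected[element]:
                problems.append((group.get("id"), element, actual))
    return problems


print(find_miscounted_groups(ET.fromstring(SAMPLE)))
```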

> > kvg:element is used for the visual appearance so it's OK I think.
>
> Is it? I was under the impression it, along with kvg:original, indicates actual radical association, or at least indicates an identical stroke order.

The documentation which existed at the time I was given permission to change this repository said that kvg:element was used for the visual appearance whereas kvg:original was used for the semantic origin of the components.

> > You are good at spotting stuff, I hope you contribute more bug reports here.
>
> Only for a while longer.

I think a lot of people will be grateful for any input you want to give, but of course there is no obligation on you.

> I'm in the process of creating a new kanji visual indexing system that's more intuitive and unambiguous than the likes of SKIP or Four-corner, and has less of a learning curve than the likes of Cangjie. The obvious downside is that the process is mostly manual, as I have to "index" one kanji at a time, and occasionally go back and re-index previous ones if I see a flaw in the method (e.g. having to go back and separate 雷 and 冨 from 雨 and 同 due to 冖 being separate from 冂).

I wish you good luck with your system. If you post a message to the KanjiVG mailing list about your project, or send a pull request, I'll add it to the "Projects using KanjiVG" page.

> Having accurate KanjiVG data is invaluable for creating consistent indices.

I'm glad this project has proved useful for you.

> I've already indexed ~5500 kanjis of my initial goal of ~6000, so any other reports would most likely either be during the verification pass, or if I decide to expand the target list.

Sounds like a lot of work.

At the moment the final 韭 is in a pull request, and I intend to merge that and close this issue tomorrow if there is no further objection. The kvg:type will probably have to wait for now.
