Partially test consistency of grapheme cluster segmentation with canonical equivalence, and fix it for LGCs #645
Merged: eggrobin merged 58 commits into unicode-org:main from eggrobin:canonically-consistent-grapheme-clusters on Jan 26, 2024
Commits
cc76245 A test which I expected to fail, but not in this way (eggrobin)
e23d1c1 Pre-16 and NFKCQC (eggrobin)
24fe8e1 🤪 (eggrobin)
328c761 Canonical closure tests (eggrobin)
2d0ceaf Generate canonical closures (eggrobin)
3880c4f Some interesting sequences (eggrobin)
b3b53c0 Some very crappy code (eggrobin)
22dfd8c Drop Hangul and make sure we have all overlaps (eggrobin)
a742327 Split it into its own part and look at chaining compositions, not dec… (eggrobin)
182cc3a despam (eggrobin)
53459b0 spots (eggrobin)
5f16271 Regenerate UCD (eggrobin)
747f982 Some comments. (eggrobin)
695c95e Allow a single non-decomposable starter at either end of the chain (eggrobin)
9fea9ea Deduplicate parts 4 and 5 (eggrobin)
7362f2d Remove redundant test cases in NFC (covered by the NFC column of othe… (eggrobin)
cdd391a Clean things up (eggrobin)
7bcb9b4 more cleanup (eggrobin)
cf4275c more cleanup (eggrobin)
3cb23ac More testing (eggrobin)
e41b3ea Fix the QC properties (eggrobin)
0c312ce stray import (eggrobin)
361a977 factor (eggrobin)
0380b27 report all failures (eggrobin)
85c2b67 Failing test (eggrobin)
6188e19 an attempt at error messages (eggrobin)
afc7d8c comma (eggrobin)
29b341f table and less escaping (eggrobin)
0497007 Try to get the errors only once (eggrobin)
4ca3390 We have screwed up since the beginning of time. (eggrobin)
80acf01 Revert invariant tests (eggrobin)
93c9570 Report various kinds of errors (eggrobin)
281f70b report parse errors (eggrobin)
70a4ee2 Break everything (eggrobin)
03b74f2 make it a bit more readable hopefully (eggrobin)
4ac8e84 Revert "Break everything" (eggrobin)
3ba16f5 It is only an error if it is not what we expect. (eggrobin)
546071e Revert "Revert invariant tests" (eggrobin)
008fa45 Put the condition in the right place (eggrobin)
40c4eab Merge branch 'invariant-test-in-diff' into canonically-consistent-gra… (eggrobin)
1c84e41 Document our past mistakes, don’t expect them to go away (eggrobin)
c107514 Merge branch 'normalization-woes' into canonically-consistent-graphem… (eggrobin)
85eca4e bad expectations (eggrobin)
963735f fix it (eggrobin)
a93e7d1 Regenerate UCD (eggrobin)
47ae9c4 Fehlermeldungszeilen [error message lines] (eggrobin)
cdb3107 Merge remote-tracking branch 'la-vache/main' into invariant-test-in-diff (eggrobin)
bbee45c Merge branch 'invariant-test-in-diff' into canonically-consistent-gra… (eggrobin)
7a6220b Markus’s suggestions (eggrobin)
89cdf7a Merge remote-tracking branch 'la-vache/main' into normalization-woes (eggrobin)
e1a01ed More honest primaryCompositesByMeowNFDCodePoint maps (eggrobin)
b0b4cf6 Regenerate UCD (eggrobin)
910039c Merge branch 'normalization-woes' of https://github.com/eggrobin/unic… (eggrobin)
c21622e spotless (eggrobin)
66e7296 Merge remote-tracking branch 'la-vache/main' into normalization-woes (eggrobin)
2c57460 Merge branch 'normalization-woes' into canonically-consistent-graphem… (eggrobin)
f58c609 Merge remote-tracking branch 'la-vache/main' into canonically-consist… (eggrobin)
6830f83 Merge remote-tracking branch 'la-vache/main' into canonically-consist… (eggrobin)
Nice addition, and cleanup
Note that I split that out into its own PR #646, which will thus need its own approval.