Rework TextEdit arrow navigation to handle Unicode graphemes #5812
lucasmerlin merged 8 commits into emilk:master
|
Preview available at https://egui-pr-preview.github.io/pr/5812-unicode-grapheme-navigation |
|
I did a quick check, and this increases the .wasm size by ~50 kB, which I think is acceptable (it's because of the tables here: https://github.com/unicode-rs/unicode-segmentation/blob/master/src/tables.rs) |
|
I just reworked the word splitting because I found out it completely fell apart around emojis. Then I saw the [...] The new Unicode implementation may be useful if it is used at a larger scale in the future (not just for word splitting in text edit). But currently the dependency is only used locally, so it may not be worth what it brings compared to simply allowing non-ASCII characters in the existing implementation. |
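To illustrate why per-`char` logic tends to break down around emojis and combining marks, here is a minimal std-only sketch. Both helper names are hypothetical and not taken from egui; they only show that one user-perceived character can span several `char`s, and that a naive alphanumeric-or-underscore rule does not treat an emoji as part of a word.

```rust
/// Hypothetical helper (not from egui): the number of Unicode scalar values
/// in `text`, i.e. how many positions a `char`-based cursor steps through.
pub fn code_point_count(text: &str) -> usize {
    text.chars().count()
}

/// Hypothetical helper: true if every `char` in `text` counts as a word
/// character under a naive "alphanumeric or underscore" rule.
pub fn all_naive_word_chars(text: &str) -> bool {
    text.chars().all(|c| c.is_alphanumeric() || c == '_')
}
```

For example, `code_point_count("e\u{301}")` is 2 even though the string renders as a single "é", so a cursor that moves one `char` at a time can stop between the letter and its accent.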
|
This may end up covering the same ground as #5784. |
|
That's true, though I think this can be finalized and merged and later replaced by #5784. |
|
@valadaptive do you think merging this PR will help or hinder your parley work? |
|
I think I'm going to need to redo it from scratch anyway, so go ahead and merge this. |
# Conflicts:
#	crates/egui/src/text_selection/text_cursor_state.rs
#	crates/egui/src/widgets/text_edit/text_buffer.rs
…#5812)

* [x] I have followed the instructions in the PR template

Previously, navigating text in `TextEdit` with Ctrl + left/right arrow would jump inside words that contained combining characters (i.e. diacritics). This PR introduces a new dependency on `unicode-segmentation` to handle grapheme encoding. The new implementation ignores whitespace and other separators such as `-` (dash) between words, but respects `_` (underscore).

---------

Co-authored-by: lucasmerlin <hi@lucasmerlin.me>
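The navigation behavior described above can be sketched roughly as follows. This is a std-only illustration, not the code merged in this PR: the PR relies on `unicode-segmentation`, whereas this sketch substitutes a hand-written rule that treats the combining diacritical marks block (U+0300 to U+036F) as word continuation. The names `is_word_char` and `next_word_boundary` are hypothetical, and reading "respects `_`" as "underscore is part of a word" is an assumption.

```rust
/// Hypothetical word-character rule (an assumption, not egui's):
/// alphanumerics, underscore, and combining diacritical marks
/// (U+0300..=U+036F) all belong to a word.
fn is_word_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_' || ('\u{0300}'..='\u{036F}').contains(&c)
}

/// Returns the char index where Ctrl + right arrow would land when the
/// cursor starts at `char_index`: first skip separators (whitespace,
/// dashes, ...), then skip to the end of the following word.
/// The caller is expected to pass an index within the text.
pub fn next_word_boundary(text: &str, char_index: usize) -> usize {
    let mut chars = text.chars().skip(char_index).peekable();
    let mut index = char_index;

    // Skip any run of non-word characters in front of the cursor.
    while chars.next_if(|c| !is_word_char(*c)).is_some() {
        index += 1;
    }
    // Then consume the word itself, including underscores and combining marks.
    while chars.next_if(|c| is_word_char(*c)).is_some() {
        index += 1;
    }
    index
}
```

With this rule, starting at index 0 in `"cafe\u{301} au-lait"` lands at index 5, after the combining accent rather than in front of it, and `"foo_bar baz"` is crossed in two jumps (to 7, then to 11). A full implementation would need real grapheme and word segmentation to cover emoji sequences and scripts outside this one block.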