Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TextServer] Fix get_word_breaks and its uses. #79054

Merged
merged 1 commit into from
Jun 22, 2024

Conversation

bruvzg
Copy link
Member

@bruvzg bruvzg commented Jul 5, 2023

Fixes #78646

@bruvzg bruvzg added this to the 4.2 milestone Jul 5, 2023
@bruvzg bruvzg requested review from a team as code owners July 5, 2023 09:49
@akien-mga akien-mga changed the title [TextServer] Fix get_word_breaks and it uses. [TextServer] Fix get_word_breaks and its uses. Jul 5, 2023
@YuriSizov
Copy link
Contributor

YuriSizov commented Nov 9, 2023

@Rindbee Could you test these changes? Also maybe @timothyqiu could help?

@bruvzg And could you rebase your PR so it runs fresh CI, since its been too long? 🙂

@Rindbee
Copy link
Contributor

Rindbee commented Nov 10, 2023

The stuttering of tts is gone. but its smoothness is not as good as using UBRK_SENTENCE instead of UBRK_LINE. Maybe when used for tts, special parameters or methods can be provided.

The results returned by string_get_word_breaks are still not particularly ideal.

English test text: "Godot Engine, the game engine you've been waiting for."

Before:
chars_per_line = 0:  [0, 5, 6, 12, 14, 17, 18, 22, 23, 29, 30, 33, 34, 36, 37, 41, 42, 49, 50, 54] 

[Godot]
[Engine]
[the]
[game]
[engine]
[you]
[ve]
[been]
[waiting]
[for.]

chars_per_line = 5:  [0, 5, 6, 11, 11, 12, 14, 17, 18, 22, 23, 28, 28, 33, 34, 36, 37, 41, 42, 47, 47, 49, 50, 54]

[Godot]
[Engin]
[e]
[the]
[game]
[engin]
[e you]
[ve]
[been]
[waiti]
[ng]
[for.]

After:
chars_per_line = 0:  [0, 5, 6, 12, 14, 17, 18, 22, 23, 29, 30, 33, 34, 36, 37, 41, 42, 49, 50, 53] 

[Godot]
[Engine]
[the]
[game]
[engine]
[you]
[ve]
[been]
[waiting]
[for]

chars_per_line = 5:  [0, 5, 6, 11, 11, 13, 14, 17, 18, 22, 23, 28, 28, 33, 33, 36, 37, 41, 42, 47, 47, 49, 50, 54]

[Godot]
[Engin]
[e,]
[the]
[game]
[engin]
[e you]
['ve]
[been]
[waiti]
[ng]
[for.]

It is worth noting that the test results for the English text changed.

Chinese test text: "天行健,君子以自强不息;地势坤,君子以厚德载物"

Before:
chars_per_line = 0:  [0, 1, 0, 1, 0, 3, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, 10, 11, -1, 0, -1, 0, 14, 15, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, 22, 23] 

[天]
[天]
[天行健]
[]
[]
[]
[]
[]
[]
[息]
[]
[]
[坤]
[]
[]
[]
[]
[]
[]
[物]

chars_per_line = 5:  [0, 3, 10, 15, 22, 23]

[天行健]
[息;地势坤]
[物]

After:
chars_per_line = 0:  [0, 1, 1, 2, 2, 3, 4, 6, 6, 7, 7, 9, 9, 11, 12, 14, 14, 15, 16, 18, 18, 19, 19, 21, 21, 23] 

[天]
[行]
[健]
[君子]
[以]
[自强]
[不息]
[地势]
[坤]
[君子]
[以]
[厚德]
[载物]

chars_per_line = 5:  [0, 4, 4, 9, 9, 14, 14, 19, 19, 23]

[天行健,]
[君子以自强]
[不息;地势]
[坤,君子以]
[厚德载物]

When chars_per_line = 0, the breaks here seem to have no rules.
In the case of chars_per_line = 5, if a punctuation mark is encountered, English will still break even if it does not reach 5, but Chinese will not.

@akien-mga akien-mga modified the milestones: 4.2, 4.3 Nov 10, 2023
@bruvzg
Copy link
Member Author

bruvzg commented Jun 21, 2024

The results returned by string_get_word_breaks are still not particularly ideal.

Not sure if I see anything wrong with it, the difference is only in presence of space, English text breaks on punctuation since it can't fit the next word and a space, so behavior is the same for both.

@akien-mga akien-mga merged commit 16ab534 into godotengine:master Jun 22, 2024
16 checks passed
@akien-mga
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TextServer::string_get_word_breaks() may return wrong results for Chinese text
4 participants