Avoid splitting on initial hyphens in --foo-bar #14

mgeisler · 2017-01-16T16:56:41Z

We should only split on hyphens surrounded by non-hyphens or perhaps surrounded by word-characters.

A single or double hyphen at the beginning of a word is not a valid break-point. Double hyphens often appear in texts about long command-line options such as --dry-run or --ignore-backups. Fixes #14.
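To make the rule concrete, here is a minimal sketch of one way to find hyphen break-points that are surrounded by word characters. The function name `hyphen_split_points` is hypothetical and the logic is an illustration of the rule described above, not the code that was merged for this issue.

```rust
/// Hypothetical helper: return the byte offsets at which `word` may be
/// split after a hyphen. A hyphen only counts as a break-point when the
/// characters immediately before and after it are both alphanumeric, so
/// leading hyphens (as in `--dry-run`) and doubled hyphens are skipped.
fn hyphen_split_points(word: &str) -> Vec<usize> {
    let chars: Vec<(usize, char)> = word.char_indices().collect();
    let mut points = Vec::new();
    // Skip the first and last characters: a hyphen there cannot be
    // surrounded by word characters on both sides.
    for i in 1..chars.len().saturating_sub(1) {
        let (idx, c) = chars[i];
        let prev = chars[i - 1].1;
        let next = chars[i + 1].1;
        if c == '-' && prev.is_alphanumeric() && next.is_alphanumeric() {
            // '-' is one byte in UTF-8, so the split falls at idx + 1.
            points.push(idx + 1);
        }
    }
    points
}
```

With this sketch, `--dry-run` can only be split into `--dry-` and `run`, while the leading `--` stays attached to the option name.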

This adds a new optional dependency on the unicode-linebreak crate, which implements the line breaking algorithm from [Unicode Standard Annex #14](https://www.unicode.org/reports/tr14/). We can use this to find words in non-ASCII text. The new dependency is enabled by default since these line breaks are more correct than what you get by splitting on ASCII space. This should help address #220 and #80, though I’m no expert on non-Western languages. More feedback from the community would be needed here.

mgeisler added the bug label Jan 16, 2017

mgeisler mentioned this issue Jan 19, 2017

Avoid splitting on hyphens in cmdline options #17

Merged

mgeisler closed this as completed in #17 Jan 21, 2017

tavianator mentioned this issue Nov 26, 2020

Does not work for languages without word separators #220

Closed
