This repository has been archived by the owner on Mar 26, 2021. It is now read-only.

Prevent hyphenation of short words in Russian #17

Open
Walkeryr opened this issue Apr 9, 2014 · 2 comments

Walkeryr commented Apr 9, 2014

I've seen this example in the source code:

Short words are not hyphenated

>>> hyphenate("<p>The brave men, living and dead.</p>")
u'<p>The brave men, liv&shy;ing and dead.</p>'

This doesn't hold for Russian, where 5-letter words get hyphenated. How can I control this behavior?

Contributor

palewire commented Apr 9, 2014

Interesting question.

I'm not sure I have the answer, being ignorant of Russian hyphenation rules. This library uses a Russian dictionary by Peter Novodvorsky in the dicts directory. You can read more about it here and the dictionary itself is here.

It might be possible to add some option to the hyphenator that ignores word tokens below a certain size, but my recollection is there's nothing in the code that approaches that right now.
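A minimal sketch of that idea, not anything the library provides today: a wrapper that skips word tokens below a chosen length and passes the rest to a per-word hyphenation function. The names `hyphenate_long_words`, `min_length`, and `naive_hyphenate_word` are all hypothetical, and the stand-in hyphenator just splits a word in the middle instead of consulting a dictionary. It also works on plain text only, so HTML tags would need separate handling.

```python
import re

SOFT_HYPHEN = "\u00ad"

# Runs of letters in any script (Latin, Cyrillic, ...), excluding digits and "_".
WORD_RE = re.compile(r"[^\W\d_]+", re.UNICODE)


def hyphenate_long_words(text, hyphenate_word, min_length=6):
    """Apply hyphenate_word only to words with at least min_length letters.

    hyphenate_word is any callable that takes one word and returns it with
    soft hyphens inserted. Shorter words are returned untouched.
    """
    def replace(match):
        word = match.group(0)
        if len(word) < min_length:
            return word  # below the limit: never hyphenate
        return hyphenate_word(word)

    return WORD_RE.sub(replace, text)


def naive_hyphenate_word(word):
    """Hypothetical stand-in for a dictionary-based hyphenator.

    It inserts a single soft hyphen in the middle of the word, which is
    NOT linguistically correct; it only makes the sketch runnable.
    """
    middle = len(word) // 2
    return word[:middle] + SOFT_HYPHEN + word[middle:]
```

With `min_length=6`, a 5-letter word such as "слово" would be left alone while a longer word such as "переносы" would still be passed to the hyphenator.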

palewire commented Apr 9, 2014

Perhaps adding a character limit greater than zero here might do it?
