Implosion/to_s problem with Enclitics #68
Comments
"It's" is a contraction. For tokenisation, contractions are often treated as two words (because they really are); this is the case in Stanford CoreNLP: http://stackoverflow.com/questions/14058399/stanford-corenlp-split-words-ignoring-apostrophe One option, as suggested in the above link, would be to handle enclitics in the implode method. In Treat this would be in the module Treat::Entities::Entity::Stringable.
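To make the idea concrete, here is a minimal stand-alone sketch of joining tokens so that an enclitic attaches to the preceding word. It is not Treat's actual `implode` code: the `ENCLITICS` list and the `implode_tokens` helper are hypothetical names invented for illustration, and it works on plain strings rather than Treat entities.

```ruby
# Hypothetical stand-alone sketch; NOT Treat's Stringable#implode.
# ENCLITICS is an assumed, incomplete list of English enclitic tokens.
ENCLITICS = %w['s 'll 'm 're 've 'd n't].freeze

# Joins token strings with single spaces, but attaches enclitics
# directly to the preceding token. A single mutable string is built
# up in place instead of merging separately stripped copies.
def implode_tokens(tokens)
  tokens.each_with_object(+'') do |token, value|
    # Add a separating space unless this is the first token or an enclitic.
    value << ' ' unless value.empty? || ENCLITICS.include?(token.downcase)
    value << token
  end
end

puts implode_tokens(%w[It 's a contraction])
```

The space is decided before each token is appended, so there is never a trailing space to strip afterwards.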
So, it looks like the issue is with the current implode method on Stringable. Although it attempts to handle enclitics, from what I can see in the current implementation 'value' would already be blank, so calling strip! makes no difference; when the imploded parts are merged the space is still there (as it is outside the scope of the strip!). Here's a fixed version: I modified the recursive call to pass the value string, so all operations are performed on that one string instead of on multiple copies. A disclaimer: I only started looking at Treat about 3 hours ago! For the same code, this now gives:
Results in:
Shouldn't that to_s come out without the extra space between It and 's?