Examples of modifying sentence segmentation rules. #109
Hi, apologies if this is already documented. I've looked through current and past issues, and the only reference I could find is #90, but it doesn't seem to include an explanation. For reference, this is the original issue:
Are there any examples of how to modify the current rules in place? I'm looking to use this for clinical text, and it seems to offer improvements over another, default implementation of sentence segmentation, particularly when it comes to handling lists. Thanks!
Replies: 2 comments
Hey @delvinso, thanks for using pysbd.
Unfortunately, there is no specific documentation about modifying rules: there are many of them, and each rule is associated with some form of transformation whose output is taken as input by another rule.
To illustrate it further:
pySBD/pysbd/processor.py, lines 32 to 37 in 5905f13
As you can see above, all of those operations need to be performed in that sequence because they are interrelated. This structure is a design decision of https://github.com/diasks2/pragmatic_segmenter; I just ported it from Ruby to Python. The way to tackle your edge cases would be to dive into the source code and see where your sentence is being segmented wrongly.
Pro tip:
Let me know if this helps.
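To make the point about ordering concrete, here is a minimal, standard-library-only sketch of a rule pipeline in which each step consumes the previous step's output. Everything in it is hypothetical and for illustration only: the `<prd>` placeholder, the abbreviation list, the regexes, and the function names are assumptions, not pySBD's actual rules or API. The `trace` flag shows one way to see at which step a sentence starts being segmented wrongly.

```python
import re

# Hypothetical placeholder and abbreviation list; NOT taken from pySBD.
PLACEHOLDER = "<prd>"
ABBREVIATIONS = ("Dr", "Mr", "Mrs", "vs")


def protect_abbreviations(text):
    # Hide the period after a known abbreviation so it cannot end a sentence.
    pattern = r"\b(" + "|".join(ABBREVIATIONS) + r")\."
    return re.sub(pattern, r"\1" + PLACEHOLDER, text)


def protect_list_numbers(text):
    # Hide the period in list markers such as "1. " or "2. ".
    return re.sub(r"(?<!\S)(\d+)\.(?=\s)", r"\1" + PLACEHOLDER, text)


def split_sentences(text):
    # Split on whitespace that follows sentence-ending punctuation.
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def restore_periods(sentences):
    # Put the hidden periods back once splitting is done.
    return [s.replace(PLACEHOLDER, ".") for s in sentences]


# Order matters: the split step relies on the placeholders inserted earlier,
# and the restore step relies on the split having already happened.
PIPELINE = [
    ("protect_abbreviations", protect_abbreviations),
    ("protect_list_numbers", protect_list_numbers),
    ("split_sentences", split_sentences),
    ("restore_periods", restore_periods),
]


def segment(text, trace=False):
    value = text
    for name, step in PIPELINE:
        value = step(value)
        if trace:
            # Print each intermediate result to spot the step that misbehaves.
            print(f"{name}: {value!r}")
    return value


if __name__ == "__main__":
    sample = "Dr. Smith saw the patient. 1. Start aspirin. 2. Recheck in a week."
    segment(sample, trace=True)
```

Calling `split_sentences` directly on the sample, without the two protection steps, breaks after "Dr." and after each list number, which is why changing or reordering a single rule in isolation tends to have knock-on effects.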
Closing the issue as there is no specific documentation for this.