Releases: nipunsadvilkar/pySBD
v0.3.4: Fix trailing period/ellipses with spaces
v0.3.3: Better handling of consecutive periods and reserved special symbols
- 🐛 Better handling of consecutive periods and reserved special symbols - allenai/scholarphi#114
- Add CONTRIBUTING.md
v0.3.2: Enforce clean=True when doc_type="pdf"
- 🐛 ✅ Enforce clean=True when doc_type="pdf" - #75
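
The constraint above can be sketched as a constructor check. This is a minimal illustration, not pySBD's code: `SketchSegmenter` is a hypothetical stand-in class, and the release note does not say whether the library raises an error or silently overrides the flag, so raising `ValueError` here is an assumption.

```python
class SketchSegmenter:
    """Hypothetical stand-in used only to illustrate the rule; not pySBD's Segmenter."""

    def __init__(self, clean=False, doc_type=None):
        # Assumption: PDF-extracted text must be cleaned before segmentation,
        # so the inconsistent combination is rejected up front.
        if doc_type == "pdf" and not clean:
            raise ValueError("doc_type='pdf' requires clean=True")
        self.clean = clean
        self.doc_type = doc_type


# Accepted: cleaning is enabled for PDF input.
ok = SketchSegmenter(clean=True, doc_type="pdf")
```

Failing at construction time, rather than later during segmentation, would surface the misconfiguration before any text is processed.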
v0.3.1: Handle Newline character & update tests
- 🚑 ✅ Handle Newline character & update tests
v0.3.0: Multi-lang support & performance improvements
♻️ ✨ Refactoring for more language support & sent char_span fix
- ✨ 💫 sent `char_span` through with spaCy & regex approach - #63
- ♻️ Refactoring to support multiple languages
- ✨ 💫 Initial language support for Hindi, Marathi, Chinese, Spanish
- ✅ Updated tests - more coverage & regression tests for issues
- 👷 GitHub Actions for CI-CD
- 💚 ☂️ Add code coverage - coverage.py, add Codecov
- 🐛 Fix incorrect text span & vanilla pysbd vs spaCy output discrepancy - #49, #53, #55, #59
- 🐛 Fix `NUMBERED_REFERENCE_REGEX` for zero or one time - #58
- 🔐 Fix security vulnerability bleach - #62
Performance improvement in `abbreviation_replacer`
- 🐛 Performance improvement in `abbreviation_replacer` by reducing `re.sub` calls - @danielkingai2 #50
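
As a rough illustration of why fewer `re.sub` calls help: one common way to cut them down is to fold many per-abbreviation substitutions into a single precompiled alternation. The sketch below shows that general technique only. The abbreviation list, the `<prd>` placeholder, and the function name are all hypothetical, and this is not claimed to be the change made in #50.

```python
import re

# Hypothetical abbreviation list, for illustration only.
ABBREVIATIONS = ["dr", "mr", "mrs", "prof", "e.g", "i.e"]

# Arbitrary marker standing in for a period that must not end a sentence.
PLACEHOLDER = "<prd>"

# One compiled alternation replaces a loop of one re.sub call per abbreviation.
# Longer entries are listed first so the longest alternative is tried first.
_ABBR_RE = re.compile(
    r"\b("
    + "|".join(re.escape(a) for a in sorted(ABBREVIATIONS, key=len, reverse=True))
    + r")\.",
    re.IGNORECASE,
)


def protect_abbreviation_periods(text):
    """Swap the period after a known abbreviation for PLACEHOLDER in one pass."""
    return _ABBR_RE.sub(lambda m: m.group(1) + PLACEHOLDER, text)


protected = protect_abbreviation_periods("Dr. Smith met Mrs. Jones, e.g. today.")
```

With this shape the text is scanned once regardless of how many abbreviations are listed, instead of once per abbreviation.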
🐛 Fix unbalanced parenthesis
- 🐛 Fix unbalanced parenthesis - #47
✨ `pysbd` as a spaCy component
- ✨ `pysbd` as a spaCy component through entrypoints
✨ Add `char_span` functionality, pySBD as a spaCy component
- ✨ Add `char_span` parameter (optional) to get sentence & its (start, end) char offsets from original text
- ✨ pySBD as a spaCy component example
- 🐛 Fix double question mark swallow bug - #39
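
To make the `char_span` idea concrete, here is a small standard-library sketch of returning each sentence with its (start, end) offsets into the original text. It uses a deliberately naive regex rule rather than pySBD's segmentation logic, and `Span` and `naive_spans` are hypothetical names. The exact shape of pySBD's return value is not shown in these notes, so treat this as an analogy only.

```python
import re
from typing import List, NamedTuple


class Span(NamedTuple):
    """Hypothetical container: sentence text plus offsets into the source."""
    sent: str
    start: int
    end: int


# Naive rule (an assumption, far simpler than pySBD): a run of non-terminators,
# then either a run of . ! ? with trailing whitespace, or the end of the text.
_SENT_RE = re.compile(r"[^.!?]+(?:[.!?]+\s*|$)")


def naive_spans(text: str) -> List[Span]:
    """Return each matched sentence with its start/end offsets in `text`."""
    return [Span(m.group(), m.start(), m.end()) for m in _SENT_RE.finditer(text)]


text = "Hello world. How are you?? Fine"
spans = naive_spans(text)
# Slicing the original text with the offsets recovers each sentence.
recovered = [text[s.start:s.end] for s in spans]
```

Because consecutive terminators are consumed together, a doubled question mark stays attached to its sentence in this sketch instead of being dropped or split off.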