Check if regex boundary is needed #7167

Closed
tabergma opened this issue Nov 3, 2020 · 1 comment · Fixed by #7471
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR

Comments


tabergma commented Nov 3, 2020

Description of Problem:
Currently, we add a \b (word boundary) to every regex used by the RegexFeaturizer and RegexEntityExtractor. In #7091, a user reported that this does not work for languages that are not tokenized by whitespace, such as Chinese.
The question is: do we need to add the boundary at all? (See the Slack thread for reference.)
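
To make the problem concrete, here is a minimal sketch using Python's standard re module. It is not the Rasa code: the helper find_matches and its use_boundary flag are hypothetical names chosen for illustration, and the sample sentences are made up. It shows that \b matches only between a word character and a non-word character, so a pattern wrapped in \b cannot match a Chinese word that sits directly next to other Chinese characters. It also shows one plausible reason the boundary may have been added (an assumption, not something this issue confirms): without it, a pattern can match inside a longer word.

```python
import re
from typing import List


def find_matches(pattern: str, text: str, use_boundary: bool) -> List[str]:
    """Hypothetical helper: find all matches, optionally wrapping the pattern in \\b."""
    if use_boundary:
        # Non-capturing group so alternations stay inside the boundaries
        # and re.findall still returns the full match.
        pattern = r"\b(?:" + pattern + r")\b"
    return re.findall(pattern, text)


english = "I live in Berlin now"
chinese = "我去北京旅游"  # no whitespace between words

# English: spaces around "Berlin" provide word boundaries, so \b works.
english_with = find_matches("Berlin", english, use_boundary=True)

# Chinese: the neighbouring characters are also word characters,
# so there is no boundary next to "北京" and the wrapped pattern fails.
chinese_with = find_matches("北京", chinese, use_boundary=True)
chinese_without = find_matches("北京", chinese, use_boundary=False)

# Possible motivation for the boundary: avoid matches inside longer words.
partial_with = find_matches("cat", "concatenate", use_boundary=True)
partial_without = find_matches("cat", "concatenate", use_boundary=False)

print(english_with, chinese_with, chinese_without, partial_with, partial_without)
```

The trade-off in this sketch is that dropping the boundary fixes the Chinese case but also allows partial-word matches such as "cat" inside "concatenate".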

Overview of the Solution:
Figure out why the boundary was added in the first place and check if it is still needed. If not, remove it from the code.

@tabergma tabergma added type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR area:rasa-oss 🎡 Anything related to the open source Rasa framework labels Nov 3, 2020

tabergma commented Dec 4, 2020

Related to #7421.

2 participants