feat: nltk text splitter support #3403

Merged


uladkaminski
Contributor

Added NLTKTextSplitter support. This text splitter uses the Natural Language Toolkit (NLTK), a suite of libraries and programs for symbolic and statistical natural language processing (NLP).

Rather than splitting only on "\n\n", this splitter uses NLTK tokenizers to choose the split points.
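As a rough illustration of the difference, here is a minimal sketch of sentence-aware chunking compared with splitting on "\n\n". It is not the code added in this PR. The names `naive_sentences` and `split_by_sentences` are hypothetical, and the regex tokenizer is only a stand-in so the sketch runs without NLTK or its data files.

```python
import re
from typing import Callable, List


def naive_sentences(text: str) -> List[str]:
    # Hypothetical stand-in for an NLTK sentence tokenizer:
    # split after '.', '!' or '?' when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def split_by_sentences(
    text: str,
    chunk_size: int = 80,
    tokenize: Callable[[str], List[str]] = naive_sentences,
    separator: str = " ",
) -> List[str]:
    # Pack whole sentences into chunks of at most chunk_size characters.
    # A single sentence longer than chunk_size becomes its own chunk.
    chunks: List[str] = []
    current = ""
    for sentence in tokenize(text):
        candidate = sentence if not current else current + separator + sentence
        if current and len(candidate) > chunk_size:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = (
    "NLTK splits text into sentences. "
    "Each chunk ends on a sentence boundary. "
    "Paragraph breaks are not required."
)

# Splitting on blank lines finds no boundary in a single paragraph.
paragraph_chunks = text.split("\n\n")

# Sentence-aware splitting still produces bounded chunks.
sentence_chunks = split_by_sentences(text, chunk_size=80)
```

With NLTK and its sentence tokenizer data installed, `nltk.tokenize.sent_tokenize` could be passed as the `tokenize` argument in place of the regex stand-in.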

nltk_demo.mov

@dosubot dosubot bot added the size:M (This PR changes 30-99 lines, ignoring generated files.) and enhancement (New feature or request) labels Aug 16, 2024
Contributor

Pull Request Validation Report

This comment is automatically generated by Conventional PR

Whitelist Report

Whitelist Active Result
Pull request is a draft and should be ignored
Pull request is made by a whitelisted user and should be ignored
Pull request is submitted by a bot and should be ignored
Pull request is submitted by administrators and should be ignored

Result

Pull request does not satisfy any enabled whitelist criteria. Pull request will be validated.

Validation Report

Validation Active Result
All commits in this pull request have valid messages
Pull request does not introduce too many changes
Pull request has a valid title
Pull request has mentioned issues
Pull request has valid branch name
Pull request should have a non-empty body

Result

Pull request satisfies all enabled pull request rules.

Last Modified at 16 Aug 24 23:39 UTC

@github-actions github-actions bot added the enhancement (New feature or request) label and removed the enhancement (New feature or request) label Aug 16, 2024

This pull request is automatically being deployed by Amplify Hosting (learn more).

Access this pull request here: https://pr-3403.dmtpw4p5recq1.amplifyapp.com

Contributor

@ogabrielluiz ogabrielluiz left a comment


Thanks, @uladkaminski

LGTM

@dosubot dosubot bot added the lgtm (This PR has been approved by a maintainer) label Aug 20, 2024
@ogabrielluiz ogabrielluiz enabled auto-merge (squash) August 20, 2024 18:46
@ogabrielluiz ogabrielluiz merged commit 1a93639 into langflow-ai:main Aug 20, 2024
29 checks passed
@uladkaminski uladkaminski deleted the feature/nltk_text_splitter branch August 20, 2024 19:07
ogabrielluiz pushed a commit that referenced this pull request Aug 27, 2024
* feat: nltk text splitter support

* feat: add doc link to nltk text splitter

* [autofix.ci] apply automated fixes

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>