feat: Switch to API for Unstructured Component #3671

Merged

Conversation

erichare
Collaborator

@erichare erichare commented Sep 3, 2024

This pull request enforces the use of the Unstructured Serverless API in the component, avoiding the need for the OSS PDF parsing libraries and NLTK downloads.

This pull request is automatically being deployed by Amplify Hosting (learn more).

Access this pull request here: https://pr-3671.dmtpw4p5recq1.amplifyapp.com

@erichare erichare marked this pull request as ready for review September 3, 2024 19:01
@dosubot dosubot bot added the size:M (This PR changes 30-99 lines, ignoring generated files.) and enhancement (New feature or request) labels Sep 3, 2024
@erichare
Collaborator Author

erichare commented Sep 3, 2024

@ogabrielluiz This PR removes the remaining NLTK download code. It adds one new dependency, langchain-unstructured (which apparently also slightly downgrades the unstructured-client version), to support the UnstructuredLoader. The component now requires a Serverless API key and has Unstructured process files server-side, so local dependencies are no longer a source of issues.
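
For readers unfamiliar with the API-backed path described above, here is a minimal sketch of what loading a file through the Unstructured API with `UnstructuredLoader` might look like. It is not the component code from this PR: the helper names `build_loader_kwargs` and `load_documents` are hypothetical, and falling back to the `UNSTRUCTURED_API_KEY` environment variable is an assumption made for illustration.

```python
import os
from typing import Any, Dict, List, Optional


def build_loader_kwargs(file_path: str, api_key: Optional[str]) -> Dict[str, Any]:
    """Assemble keyword arguments for an API-backed loader (hypothetical helper)."""
    # Processing happens server-side, so a missing key is an error
    # rather than a cue to fall back to local parsing.
    if not api_key:
        raise ValueError("An Unstructured API key is required.")
    return {
        "file_path": file_path,
        "api_key": api_key,
        "partition_via_api": True,  # send the file to the API instead of parsing locally
    }


def load_documents(file_path: str, api_key: Optional[str] = None) -> List[Any]:
    """Load a file through the Unstructured API (illustrative sketch only)."""
    # Imported lazily so the helper above works without the package installed.
    from langchain_unstructured import UnstructuredLoader

    # Assumption: fall back to an environment variable when no key is passed.
    key = api_key or os.getenv("UNSTRUCTURED_API_KEY")
    loader = UnstructuredLoader(**build_loader_kwargs(file_path, key))
    return loader.load()
```

Calling `load_documents("report.pdf", api_key="...")` would return a list of LangChain documents, provided `langchain-unstructured` is installed and the key is valid; no PDF parsing libraries or NLTK data are needed locally on this path.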

Contributor

@ogabrielluiz ogabrielluiz left a comment

LGTM

Contributor

@ogabrielluiz ogabrielluiz left a comment

LGTM

@dosubot dosubot bot added the lgtm (This PR has been approved by a maintainer) label Sep 4, 2024
@ogabrielluiz ogabrielluiz changed the title from "FEAT: Switch to API for Unstructured Component" to "feat: Switch to API for Unstructured Component" Sep 4, 2024
@github-actions github-actions bot added the enhancement (New feature or request) label and removed the enhancement (New feature or request) label Sep 4, 2024
@ogabrielluiz ogabrielluiz merged commit de1fdff into langflow-ai:main Sep 4, 2024
28 of 29 checks passed
@erichare erichare deleted the feat/unstructured-api-component branch September 25, 2024 18:36
Labels
enhancement: New feature or request
lgtm: This PR has been approved by a maintainer
size:M: This PR changes 30-99 lines, ignoring generated files.
2 participants