Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port BaseLoader + BaseTextSplitter interfaces and TextLoader + CharacterTextSplitter impl. #8

Merged
merged 7 commits into from
Apr 24, 2023

Conversation

realdavidvega
Copy link
Contributor

This PR ports BaseLoader + BaseTextSplitter interfaces, and TextLoader + CharacterTextSplitter impl. respectively.

Also adds some simple tests.

@realdavidvega realdavidvega requested a review from nomisRev April 21, 2023 16:41
@realdavidvega
Copy link
Contributor Author

@franciscodr @nomisRev @gutiory @raulraja ready for review.

Copy link
Contributor

@nomisRev nomisRev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work @realdavidvega 🙌

I think we need to use nested use because both Source and Buffer are Closeable. That's what I did in PromptTemplate. Did you find anywhere in the docs, or code, that the Buffer wrapping the Source also closes the Source? If yes then the nested use is not needed, and can also be removed from PromptTemplate.

@realdavidvega
Copy link
Contributor Author

realdavidvega commented Apr 24, 2023

Awesome work @realdavidvega 🙌

I think we need to use nested use because both Source and Buffer are Closeable. That's what I did in PromptTemplate. Did you find anywhere in the docs, or code, that the Buffer wrapping the Source also closes the
Source? If yes then the nested use is not needed, and can also be removed from PromptTemplate.

The documentation is a bit confusing, sometimes they use two nested use, but on other examples they are just using one FileSystem.source(path).buffer().use.

Also seems there's a documented function FileSystem.read() that is used to: "buffer the source before our block and close the source afterwards.", which does the same. So maybe we could use that function instead.

@realdavidvega realdavidvega merged commit 6ee648a into main Apr 24, 2023
@realdavidvega realdavidvega deleted the document-loader branch April 24, 2023 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants