Feature/make parsing chunking providers #820
Merged
Summary:
Introduced chunking and parsing providers, updated configurations, and modified pipelines and tests to integrate new providers.
Key points:
- ChunkingProvider and ParsingProvider classes in r2r/base/providers/chunking.py and r2r/base/providers/parsing.py.
- Updated r2r.toml and pyproject.toml to include new chunking and parsing configurations.
- Updated r2r/base/__init__.py to import new providers.
- ChunkingPipe and ParsingPipe in r2r/pipes/ingestion.
- Updated IngestionPipeline in r2r/pipelines/ingestion_pipeline.py to use new pipes.
- R2RChunkingProvider and UnstructuredChunkingProvider in r2r/providers/chunking.
- R2RParsingProvider and UnstructuredParsingProvider in r2r/providers/parsing.
- Updated R2RConfig in r2r/main/assembly/config.py to handle new configurations.
- Updated R2RProviderFactory in r2r/main/assembly/factory.py to create new providers.
- Updated tests/test_config.py to cover new configurations.

Generated with ❤️ by ellipsis.dev
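
For readers unfamiliar with the provider pattern the summary describes (an abstract provider, concrete implementations, and a factory that picks one from configuration), here is a minimal sketch. It is not the code from this PR: the names ChunkingConfig, BaseChunker, FixedSizeChunker, and make_chunker, the config fields, and the chunk() signature are all hypothetical stand-ins chosen for illustration, and the real R2R classes may look quite different.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ChunkingConfig:
    # Hypothetical fields; the actual R2R configuration keys may differ.
    provider: str = "fixed"
    chunk_size: int = 20
    chunk_overlap: int = 5


class BaseChunker(ABC):
    """Illustrative abstract provider: subclasses decide how text is split."""

    def __init__(self, config: ChunkingConfig) -> None:
        self.config = config

    @abstractmethod
    def chunk(self, text: str) -> list[str]:
        """Split text into a list of chunks."""


class FixedSizeChunker(BaseChunker):
    """Splits text into fixed-size character windows with overlap."""

    def chunk(self, text: str) -> list[str]:
        size = self.config.chunk_size
        step = size - self.config.chunk_overlap
        if size <= 0 or step <= 0:
            raise ValueError("chunk_size must be positive and exceed chunk_overlap")
        chunks: list[str] = []
        start = 0
        while start < len(text):
            chunks.append(text[start:start + size])
            # Stop once the current window reaches the end of the text.
            if start + size >= len(text):
                break
            start += step
        return chunks


# Registry mapping a config value to a concrete provider class.
_CHUNKERS: dict[str, type[BaseChunker]] = {"fixed": FixedSizeChunker}


def make_chunker(config: ChunkingConfig) -> BaseChunker:
    """Illustrative factory: choose the provider named in the config."""
    try:
        cls = _CHUNKERS[config.provider]
    except KeyError:
        raise ValueError(f"Unknown chunking provider: {config.provider!r}") from None
    return cls(config)
```

The point of the indirection is that a pipeline only depends on the abstract interface, so swapping implementations becomes a configuration change rather than a code change.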