Support custom text formats and recursive #496
Codecov Report
Additional details and impacted files
@@             Coverage Diff              @@
##             main     #496       +/-   ##
===========================================
+ Coverage   29.75%   48.38%   +18.63%
===========================================
  Files          27       27
  Lines        3455     3466       +11
  Branches      782      826       +44
===========================================
+ Hits         1028     1677      +649
+ Misses       2353     1603      -750
- Partials       74      186      +112
Flags with carried forward coverage won't be shown.
I have only minor suggestions. Again, please try to find a group of reviewers for RAG and frequently engage with them.
* Add custom text types and recursive
* Add custom text types and recursive
* Fix format
* Update qdrant, Add pdf to unstructured
* Use unstructured as the default text extractor if installed
* Add tests for unstructured
* Update tests env for unstructured
* Fix error if last message is a function call, issue microsoft#569
* Remove csv, md and tsv from UNSTRUCTURED_FORMATS
* Update docstring of docs_path
* Update test for get_files_from_dir
* Update docstring of custom_text_types
* Fix missing search_string in update_context
* Add custom_text_types to notebook example
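To illustrate what "custom text types and recursive" could mean in practice, here is a minimal standalone sketch of collecting files from a directory, filtered by a caller-supplied list of extensions, with an optional recursive walk. It is not the code from this PR: the function name `collect_files`, its parameters, and the `DEFAULT_TEXT_TYPES` tuple are hypothetical and only loosely mirror the `get_files_from_dir` and `custom_text_types` names mentioned in the commits above.

```python
from pathlib import Path
from typing import List, Sequence

# Hypothetical default list of extensions, for illustration only.
DEFAULT_TEXT_TYPES = ("txt", "json", "csv", "md")


def collect_files(
    dir_path: str,
    types: Sequence[str] = DEFAULT_TEXT_TYPES,
    recursive: bool = True,
) -> List[str]:
    """Return sorted paths of files under dir_path whose extension is in types.

    Extensions are compared case-insensitively and may be given with or
    without a leading dot. With recursive=False only the top level is scanned.
    """
    root = Path(dir_path)
    wanted = {t.lower().lstrip(".") for t in types}
    # rglob walks subdirectories; glob stays at the top level.
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        str(p)
        for p in candidates
        if p.is_file() and p.suffix.lower().lstrip(".") in wanted
    )
```

A caller would pass its own extension list, for example `collect_files("docs", types=["rst", "log"])`, to pick up formats that are not in the default tuple.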
Why are these changes needed?
Addresses #408.
Also addresses #569.
Some other improvements were also made while implementing this PR:
Related issue number
Closes #408
Closes #569
Checks