feat: per-process ingest connections #1058
Merged
38 commits
34137ca
more service to instance variable on connector
ryannikolaidis ffdc081
tidy
ryannikolaidis 97caff9
add session handler
ryannikolaidis 17420fc
in progress
ryannikolaidis 5edd0a0
remove processor resource
ryannikolaidis 05e824d
remove debug log
ryannikolaidis 69ad5d3
remove unneeded
ryannikolaidis d74e74f
clean up
ryannikolaidis 71e5201
actually set the session_handle
ryannikolaidis 1fbe77c
actually get this working and lint free
ryannikolaidis 27e6c00
actually resolve lint
ryannikolaidis d57eebc
update comments
ryannikolaidis e767e65
tidy
ryannikolaidis 0be0182
end of doc line
ryannikolaidis 48c17dc
use context
ryannikolaidis 058a85d
fix tests
ryannikolaidis 352acb2
Merge branch 'main' into ryan/reuse-connections
ryannikolaidis fd2f27b
tidy
ryannikolaidis c4074d3
bump note
ryannikolaidis d11d4ea
Merge branch 'main' into ryan/reuse-connections
ryannikolaidis 626a659
bump version
ryannikolaidis 4071a35
bump docstring
ryannikolaidis 7391ea1
lint
ryannikolaidis cdb50e2
debugging test failure
ryannikolaidis dbebe08
debug
ryannikolaidis aa7de05
more debug
ryannikolaidis 7e6164d
re-enable
ryannikolaidis 3d7f7e1
Merge branch 'main' into ryan/reuse-connections
ryannikolaidis b38433a
bump test
ryannikolaidis cf3de88
bump version
ryannikolaidis 9b097bb
version bump
ryannikolaidis 6513520
Update unstructured/ingest/interfaces.py
ryannikolaidis 1f64a06
Merge branch 'main' into ryan/reuse-connections
ryannikolaidis 3ffeaae
manage only in subprocess
ryannikolaidis 29b97f5
bump comment
ryannikolaidis 18433e3
bump comment
ryannikolaidis cc7149b
Merge branch 'main' into ryan/reuse-connections
ryannikolaidis 171f4a6
version bump
ryannikolaidis
39 changes: 39 additions & 0 deletions
test_unstructured_ingest/unit/doc_processor/test_generalized.py
@@ -0,0 +1,39 @@
from dataclasses import dataclass

import pytest

from unstructured.ingest.doc_processor.generalized import (
    process_document,
)
from unstructured.ingest.interfaces import BaseIngestDoc, IngestDocSessionHandleMixin


@dataclass
class IngestDocWithSessionHandle(IngestDocSessionHandleMixin, BaseIngestDoc):
    pass

def test_process_document_with_session_handle(mocker):
    """Test that the process_document function calls the doc_processor_fn with the correct
    arguments, assigns the session handle, and returns the correct results."""
    mock_session_handle = mocker.MagicMock()
    mocker.patch("unstructured.ingest.doc_processor.generalized.session_handle", mock_session_handle)
    mock_doc = mocker.MagicMock(spec=(IngestDocWithSessionHandle))

    result = process_document(mock_doc)

    mock_doc.get_file.assert_called_once_with()
    mock_doc.write_result.assert_called_with()
    mock_doc.cleanup_file.assert_called_once_with()
    assert result == mock_doc.process_file.return_value
    assert mock_doc.session_handle == mock_session_handle


def test_process_document_no_session_handle(mocker):
    """Test that the process_document function does not assign a session handle when the
    IngestDoc does not have the session handle mixin."""
    mocker.patch("unstructured.ingest.doc_processor.generalized.session_handle", mocker.MagicMock())
    mock_doc = mocker.MagicMock(spec=(BaseIngestDoc))

    process_document(mock_doc)

    assert not hasattr(mock_doc, "session_handle")
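The behavior these tests pin down can be sketched without the library. The following is a minimal, self-contained illustration, not the PR's actual code: `SessionHandleMixin`, `init_worker`, and the body of `process_document` are hypothetical stand-ins, assuming that a module-level handle is created once per worker process and attached only to documents that opt in through the mixin.

```python
from typing import Any, Optional

# Module-level handle (assumption): set once per worker process so that every
# document processed by that process reuses the same connection.
session_handle: Optional[Any] = None


def init_worker(handle: Any) -> None:
    """Hypothetical per-process initializer that stores the shared handle."""
    global session_handle
    session_handle = handle


class SessionHandleMixin:
    """Hypothetical stand-in for IngestDocSessionHandleMixin."""

    session_handle: Optional[Any] = None


def process_document(doc: Any) -> Any:
    """Fetch, process, write, and clean up one document.

    Only documents that inherit from the mixin receive the per-process handle;
    other documents are left untouched.
    """
    if isinstance(doc, SessionHandleMixin):
        doc.session_handle = session_handle
    try:
        doc.get_file()
        result = doc.process_file()
        doc.write_result()
        return result
    finally:
        # Clean up the local copy even if processing raised.
        doc.cleanup_file()
```

In a multiprocessing setup, a function like `init_worker` could be passed as the `initializer` argument of `multiprocessing.Pool`, so the handle is built once per process rather than once per document.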
@@ -1 +1 @@
-__version__ = "0.10.2"  # pragma: no cover
+__version__ = "0.10.3"  # pragma: no cover
removed this. it wasn't critical to the intent of this test, and needing to freeze time here, in combination with other tests touching generalized (and by extension calling `get_model`), was triggering a bizarre failure when importing transformers.models.open_llama.tokenization_open_llama. More info here.