feat(airbyte-cdk): replace pydantic BaseModel with dataclasses in protocol #44026
What
Resolves https://github.com/airbytehq/airbyte-internal-issues/issues/8945 by using dataclasses.
How
dataclasses.dataclass as output (airbyte-cdk/python/airbyte_cdk/models/airbyte_protocol.py)
orjson.dumps().decode() for serialization
Review guide
airbyte-cdk/python/airbyte_cdk/models/airbyte_protocol.py
airbyte-cdk/python/airbyte_cdk/models/well_known_types.py
Note
In this PR, pydantic is completely replaced with dataclasses, but we could (should?) consider replacing only some of the protocol entities (the heavily used ones).
User Impact
Changes are limited to the protocol models.
Known problem
Caution: orjson serializes all fields, including those set to None.
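To illustrate the None-field behaviour, here is a minimal sketch of one possible workaround. It is not what this PR implements. `ExampleRecordMessage` and `to_dict_without_none` are hypothetical names used only for illustration, and the standard-library `json` module stands in for the serializer so the snippet runs without extra dependencies. The resulting dict could equally be passed to `orjson.dumps(...).decode()`.

```python
import dataclasses
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ExampleRecordMessage:
    # Hypothetical stand-in for a protocol model, not the real class.
    stream: str
    data: dict
    emitted_at: int
    namespace: Optional[str] = None


def to_dict_without_none(value: Any) -> Any:
    """Convert dataclasses to dicts, omitting dataclass fields set to None."""
    if dataclasses.is_dataclass(value) and not isinstance(value, type):
        return {
            f.name: to_dict_without_none(getattr(value, f.name))
            for f in dataclasses.fields(value)
            if getattr(value, f.name) is not None
        }
    if isinstance(value, dict):
        # Plain dicts (e.g. record data) keep their explicit None values.
        return {k: to_dict_without_none(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_dict_without_none(v) for v in value]
    return value


message = ExampleRecordMessage(
    stream="dummy_records",
    data={"id": 1, "note": None},
    emitted_at=0,
)

# Naive conversion keeps the unset optional field as null.
with_none = json.dumps(dataclasses.asdict(message))
# Filtered conversion omits it but leaves the record data untouched.
without_none = json.dumps(to_dict_without_none(message))

print(with_none)
print(without_none)
```

The filter only applies to dataclass fields, so a null inside the record payload survives. The extra pass over every message has a cost, so whether it is acceptable would need to be measured against the throughput numbers in the tests.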
Tests
Test with docker run; 10_000_000 records; stream: dummy_records
/usr/bin/time -h docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/integration_tests:/integration_tests airbyte/source-hardcoded-records:dev read --config /secrets/config.json --catalog /integration_tests/configured_catalog.json > /tmp/test.txt
10,000,000 records are about 1,770 MB.
Test with local platform (deployed with abctl)
24,000,000 records are about 4,270 MB.
Can this PR be safely reverted and rolled back?