-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(low-code cdk): add transformation to dynamic schema loader #176
base: main
Are you sure you want to change the base?
Conversation
📝 Walkthrough📝 WalkthroughWalkthroughThis pull request introduces a new property named Changes
Possibly related PRs
Suggested reviewers
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
2517-2526
: LGTM! Consider adding clarifying documentation?The implementation looks good and aligns well with the PR objectives. The new
transformations
property follows the same pattern as stream-level transformations.What do you think about adding a description that clarifies when to use record selector transformations vs stream transformations? This could help users choose the appropriate level for their transformations. Something like:
transformations: title: Transformations - description: A list of transformations to be applied to each output record. + description: A list of transformations to be applied to each output record during the record selection phase. Use these transformations when you need to modify records before they are processed by stream-level transformations.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(1 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(9 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(1 hunks)unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
(4 hunks)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
1510-1516
: LGTM! Nice addition of the transformations field to RecordSelector.
The optional transformations field allows for flexible record manipulation through various transformation types. The implementation maintains backward compatibility since it's optional.
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
1926-1931
: LGTM! Clean implementation of transformation handling.
The implementation correctly creates transformation components from the models and appends them to the transformations list. The code is straightforward and follows the factory pattern consistently.
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (1)
182-185
: LGTM! Great test coverage for the transformation functionality.
The tests thoroughly verify the transformation handling in both the schema parsing and component creation. The coverage includes both AddFields and RemoveFields transformations, ensuring the feature works as expected.
Also applies to: 302-309, 1315-1318, 1340-1340
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
1502-1508
: LGTM! Consider adding an example to the field description?The implementation looks good and aligns well with the PR objectives to allow transformations in the record selector. The type definition and documentation are clear.
Would you consider adding an example to the field description to make it more user-friendly? Something like:
examples=[{ "transformations": [ {"type": "AddFields", "fields": [{"path": ["extra_field"], "value": "{{ record['some_field'] }}"}]}, {"type": "RemoveFields", "field_pointers": [["sensitive_data"]]} ] }]wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (2)
182-185
: LGTM! Consider enhancing test coverage?The addition of transformations to the record selector looks good. Would you consider adding a test case with multiple field pointers to ensure the RemoveFields transformation handles multiple fields correctly? wdyt?
2250-2268
: LGTM! Consider adding edge cases?The test case for AddFields transformation with string value type is well-structured. Would you consider adding test cases for edge cases like empty strings or special characters? wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
(10 hunks)
🔇 Additional comments (2)
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (2)
2198-2201
: LGTM! Clear and helpful comment
The comment effectively explains why transformations appear twice in the test results. This helps future maintainers understand the expected behavior.
2216-2234
: LGTM! Well-structured test case
The test case for AddFields transformation without value type is thorough and clearly structured. The expected results are well-defined and the comment explains the duplication clearly.
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
Outdated
Show resolved
Hide resolved
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2)
1650-1656
: Consider adding validation for the name fieldThe name field is marked as optional with a default empty string. Should we add some validation to ensure the name follows a specific pattern when provided? For example, ensuring it's a valid Python identifier? wdyt?
name: Optional[str] = Field( - "", description="The stream name.", example=["Users"], title="Name" + "", + description="The stream name.", + example=["Users"], + title="Name", + regex="^[a-zA-Z_][a-zA-Z0-9_]*$" )Also applies to: 1657-1658
1925-1929
: Consider adding type validation for partition router listsThe partition router can now be a single router or a list of routers. Should we add validation to ensure the list isn't empty when provided? wdyt?
partition_router: Optional[ Union[ CustomPartitionRouter, ListPartitionRouter, SubstreamPartitionRouter, List[ Union[ CustomPartitionRouter, ListPartitionRouter, SubstreamPartitionRouter ] ], ] ] = Field( [], description="PartitionRouter component that describes how to partition the stream, enabling incremental syncs and checkpointing.", title="Partition Router", + min_items=1 )
Also applies to: 2003-2007
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(1 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(10 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
🔇 Additional comments (5)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
1770-1780
: LGTM! Schema definition matches implementation.
The transformations field schema definition in the YAML file correctly matches the Python implementation, allowing the same transformation types and maintaining consistency.
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (4)
531-533
: OAuth configuration enhancements look good!
The additions to OAuth configuration are well-documented with clear examples. The new fields enable better OAuth flow customization and user input handling.
Also applies to: 827-829, 898-912, 914-919
1848-1862
: LGTM: Transformations field added to DynamicSchemaLoader
The transformations field matches the structure used in DeclarativeStream, maintaining consistency across the codebase.
1971-1973
: LGTM: Optional download_extractor field
The download_extractor field is properly marked as optional with clear typing.
2071-2076
: LGTM: Components resolver field in DynamicDeclarativeStream
The components_resolver field is well-typed with a clear description of its purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
airbyte_cdk/sources/declarative/schema/dynamic_schema_loader.py (2)
107-107
: Could you add a docstring?
This adds the list of transformations with a factory default. Perhaps consider adding a short docstring for clarity, wdyt?
133-134
: Check if None is more explicit.
You're calling_transform(properties, {})
. If it's not used, passing None might be clearer. Would you consider that, wdyt?unit_tests/sources/declarative/schema/test_dynamic_schema_loader.py (1)
67-71
: Additional test coverage for schema transformations.
You introduced a new "KeysToSnakeCase" transformation. Maybe add another transformation variety to ensure robust coverage, wdyt?airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
597-601
: create_keys_to_snake_transformation method.
This direct return is simple. Would it help if we passed parameters to the transformation, or is this minimal signature enough, wdyt?airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
1770-1780
: LGTM! The schema_transformations implementation looks clean and well-structured.The implementation follows the existing patterns in the codebase and aligns perfectly with the PR objectives. One small suggestion: would you consider adding a default empty array value for consistency with other similar fields in the schema? wdyt?
schema_transformations: title: Schema Transformations description: A list of transformations to be applied to the schema. type: array + default: [] items: anyOf: - "$ref": "#/definitions/AddFields" - "$ref": "#/definitions/CustomTransformation" - "$ref": "#/definitions/RemoveFields" - "$ref": "#/definitions/KeysToLower" - "$ref": "#/definitions/KeysToSnakeCase"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(1 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(1 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(6 hunks)airbyte_cdk/sources/declarative/schema/dynamic_schema_loader.py
(4 hunks)unit_tests/sources/declarative/schema/test_dynamic_schema_loader.py
(4 hunks)
🔇 Additional comments (13)
airbyte_cdk/sources/declarative/schema/dynamic_schema_loader.py (5)
7-7
: Imports look good.
No issues found here, wdyt?
16-16
: Direct import of RecordTransformation recognized.
Looks straightforward and helps ensure typed transformations. wdyt?
18-18
: Added dependencies for Config, StreamSlice, and StreamState.
Including these type hints keeps the code more expressive. wdyt?
138-138
: Schema snippet alignment.
The “properties” key is updated to the transformed schema. This is consistent with standard JSON Schema usage, wdyt?
141-153
: Order of transformations is enforced implicitly.
All transformations are applied in a loop, ensuring the final state is cumulative. If the order matters, we may want more explicit documentation. Would that add clarity, wdyt?
unit_tests/sources/declarative/schema/test_dynamic_schema_loader.py (3)
238-238
: Renamed property to 'first_name'.
This aligns with the transformation logic. Changes look correct, wdyt?
253-254
: Sample data matches new property names.
This looks consistent with code changes. wdyt?
265-267
: Confirm naming logic for "FirstName" → "first_name".
The test properly verifies the snake-case transformation. Looks good, wdyt?
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
1834-1848
: New field for schema transformations.
This field makes transformations more flexible. Seems well-defined. wdyt?
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)
239-241
: New import for KeysToSnakeCaseModel.
Clear and consistent with the rest of the transformation imports. wdyt?
481-481
: Mapping KeysToSnakeCase in the factory.
This integration ensures your transformation can be constructed correctly. wdyt?
1653-1659
: schema_transformations list building.
You gather transformations into a list before injecting them. This is clean and readable, wdyt?
1674-1674
: Passing the transformations into DynamicSchemaLoader.
This final assignment is straightforward and consistent. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
unit_tests/sources/declarative/schema/test_dynamic_schema_loader.py (1)
248-250
: How about making the transformation flow more explicit in the test?The test correctly validates the transformation pipeline, but we could make it more self-documenting. What do you think about:
Adding comments to explain the transformation flow:
- PascalCase in schema response (
FirstName
)- Gets transformed to snake_case (
first_name
)- Static field gets added (
static_field
)Maybe split the test into smaller, focused test cases for each transformation? This would make it easier to maintain and debug, wdyt? 🤔
Also applies to: 264-265, 276-278
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
unit_tests/sources/declarative/schema/test_dynamic_schema_loader.py
(4 hunks)
🔇 Additional comments (1)
unit_tests/sources/declarative/schema/test_dynamic_schema_loader.py (1)
67-81
: Consider adding more test cases for schema transformations?
The new schema transformations look good! However, we might want to strengthen our test coverage. What do you think about adding test cases for:
- Failed transformations
- Empty transformation lists
- Multiple AddFields transformations
- Invalid field definitions
This would help ensure robustness, wdyt? 🤔
What
During dynamic schema generation, we need to transform keys or add fields to enhance the schema's compatibility and functionality.
How
A schema transformations component has been integrated into the dynamic schema loader. This component is responsible for managing a list of transformations, enabling us to easily modify schema keys or insert additional fields as needed.