-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(test): Allow mock server tests for manifest only sources #371
base: main
Are you sure you want to change the base?
feat(test): Allow mock server tests for manifest only sources #371
Conversation
/autofix
|
📝 WalkthroughWalkthroughThis PR modifies two methods within the declarative sources. One method in the component factory now includes a fallback mechanism for importing modules when a Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant Factory as ModelToComponentFactory
Caller->>Factory: _get_class_from_fully_qualified_class_name(name)
alt Import succeeds
Factory->>Caller: Return class
else Import fails (ModuleNotFoundError)
Factory->>Factory: Check if name prefix == "source_declarative_manifest"
alt Is "source_declarative_manifest"
Factory->>Factory: Attempt modified import (without first segment)
alt Modified import succeeds
Factory->>Caller: Return class
else Modified import fails
Factory->>Caller: Raise ValueError with original error context
end
else
Factory->>Caller: Raise ValueError with original error context
end
end
sequenceDiagram
participant Caller
participant YamlSource as YamlDeclarativeSource
participant FileSys as File System
participant PkgUtil as pkgutil
Caller->>YamlSource: _read_and_parse_yaml_file(file_path)
alt Direct file read attempt
YamlSource->>FileSys: Open file at file_path
alt File exists
FileSys-->>YamlSource: File content
YamlSource->>YamlSource: Parse content with yaml.safe_load
YamlSource->>Caller: Return ConnectionDefinition
else File not found (FileNotFoundError)
YamlSource->>PkgUtil: pkgutil.get_data(...)
PkgUtil-->>YamlSource: YAML data
YamlSource->>YamlSource: Parse fetched data with yaml.safe_load
YamlSource->>Caller: Return ConnectionDefinition
end
end
Possibly related PRs
Suggested reviewers
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
1408-1418
: Consider removing unused import.
import os
is never referenced within this snippet. Would you like to remove it for clarity, wdyt?- try: - import os module_name_with_source_declarative_manifest = ".".join(split[1:-1]) module_ref = importlib.import_module(module_name_with_source_declarative_manifest)airbyte_cdk/sources/declarative/yaml_declarative_source.py (1)
47-53
: Raise an exception instead of returning an empty dictionary?When
pkgutil.get_data
returnsNone
, this code now returns{}
. Would you prefer to raise an error to alert users that the YAML was not found or is empty, wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(1 hunks)airbyte_cdk/sources/declarative/yaml_declarative_source.py
(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
[error] 1-1: Ruff formatting check failed. 1 file would be reformatted. Please run 'ruff format' to fix code style issues in this file.
⏰ Context from checks skipped due to timeout of 90000ms (8)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Analyze (python)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (1)
airbyte_cdk/sources/declarative/yaml_declarative_source.py (1)
42-46
: Add explicit YAML format validation?Would you like to add error handling for invalid YAML formats in case the file exists but contains malformed content, wdyt? This might catch parse issues earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
APPROVED
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/yaml_declarative_source.py (2)
46-48
: The comment could provide more clarity on expected behaviorThe comment explains why the fallback is needed but is a bit vague. Maybe we could expand it slightly to explain the expected flow for both development/testing vs container runtime environments?
Something like: "In container environments, YAML files are accessed through the package, while in dev/test environments we prefer direct file access."
What do you think?
42-54
: Consider handling additional file error typesRight now we're only catching
FileNotFoundError
, but other exceptions could occur when opening files (permission issues, encoding problems). Would it make sense to catch a broader set of exceptions likeIOError
or specific ones likePermissionError
as well?Also, there's no explicit error handling for YAML parsing errors in either approach. Perhaps adding some more specific error handling would improve debugging?
try: # For testing purposes, we want to allow to just pass a file with open(path_to_yaml_file, "r") as f: return yaml.safe_load(f) # type: ignore # we assume the yaml represents a ConnectionDefinition -except FileNotFoundError: +except (FileNotFoundError, PermissionError, IOError) as e: + self.logger.debug(f"Could not open file directly due to: {str(e)}. Trying package lookup.") # Running inside the container, the working directory during an operation is not structured the same as the static files package = self.__class__.__module__.split(".")[0]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte_cdk/sources/declarative/yaml_declarative_source.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (8)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Analyze (python)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/yaml_declarative_source.py (2)
42-46
: This change makes testing manifest-only sources easierThe new approach first attempts to open the YAML file directly, which is simpler and more intuitive for testing purposes. This helps debug manifest-only sources by making the file accessible without requiring it to be part of a package.
Just curious - have you considered adding a debug log message here to indicate when files are found directly vs via the package fallback? It might help with future debugging, wdyt?
50-53
: What ifyaml_config
is None but the file exists?I notice that the
get_data
call could potentially returnNone
even if the path exists (for example, if the package structure is incorrect or there are permission issues). The code handles this with the conditionif yaml_config:
, but we might want to add some logging to help diagnose the issue.Would it be helpful to add a debug or warning log message if
yaml_config
is None? This could make debugging easier when neither approach finds the file.
What
Currently, it is really hard to debug manifest only connectors and it is impossible to add a safety net using mock server tests
How
Allowing for yaml files to be provided not as part of the package
Removing
source_declarative_manifest
from the package for custom componentsMore context
Demo: https://airbytehq-team.slack.com/archives/C02U9R3AF37/p1740701852255509?thread_ts=1740697205.997469&cid=C02U9R3AF37
Example of usage: airbytehq/airbyte#54715
Summary by CodeRabbit