fix: sql/spark read_iceberg and read_deltalake #5035
Conversation
Greptile Summary

This PR fixes import path issues in the Rust scan builder code for Delta Lake and Iceberg table formats. The changes update Python module import paths from the old locations (`daft.delta_lake.delta_lake_scan` and `daft.iceberg.iceberg_scan`) to their new locations under the `daft.io` package (`daft.io.delta_lake.delta_lake_scan` and `daft.io.iceberg.iceberg_scan`).
The fix addresses a module restructuring that moved I/O-related components under the `daft.io` package. The Rust FFI (Foreign Function Interface) layer in `src/daft-scan/src/builder.rs` uses PyO3 to import and interact with Python modules for scan operations. When the Python modules were relocated but the Rust import paths were not updated, a `ModuleNotFoundError` was raised at runtime when using Spark Connect with these table formats.

This change aligns the Rust code with the current Python package structure, ensuring that the scan builder can successfully import and instantiate the appropriate Python scan operators for both Delta Lake and Iceberg table formats. The fix is surgical: only the two import statements are updated, with no changes to business logic or functionality.
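As a rough illustration of the failure mode (not Daft's actual code): PyO3's `py.import(...)` on the Rust side resolves a dotted module path the same way Python's `importlib.import_module` does, so a stale path raises `ModuleNotFoundError`. A minimal stdlib-only sketch, with the `daft` paths shown only in comments since `daft` may not be installed:

```python
import importlib


def try_import(dotted_path):
    """Return the module at `dotted_path`, or None if it cannot be found.

    This mirrors what the Rust FFI layer does via PyO3's `py.import(...)`:
    a pre-refactor path like "daft.iceberg.iceberg_scan" would raise
    ModuleNotFoundError after the move, while the relocated path
    "daft.io.iceberg.iceberg_scan" resolves.
    """
    try:
        return importlib.import_module(dotted_path)
    except ModuleNotFoundError:
        return None


# Stdlib demonstration of the same behavior:
assert try_import("json") is not None            # a valid path resolves
assert try_import("json.does_not_exist") is None  # a stale path fails
```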
Confidence score: 5/5
- This PR is safe to merge with minimal risk.
- The score reflects simple import path corrections with no logic changes.
- No files require special attention.
1 file reviewed, no comments
I want to add some tests. I noticed that we do have a SQL `read_iceberg` test, but it seems like it is skipped. @rchowell wondering if there is more context here about testing these features.

@kevinzwang this just wasn't tested at the time, only a manual test 🙃 #3701

I'll take a look at adding something that can be run in CI then
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```
@@           Coverage Diff           @@
##             main    #5035   +/-   ##
==========================================
+ Coverage   76.04%   76.50%   +0.45%
==========================================
  Files         945      945
  Lines      129743   129741       -2
==========================================
+ Hits        98669    99255     +586
+ Misses      31074    30486     -588
```
Changes Made

Resolves #5024

Related Issues

Checklist

- `docs/mkdocs.yml` navigation