Python: Table scan returning deleted data #6568
Fokko changed the title from "pyiceberg table scan returning deleted data" to "Python: Table scan returning deleted data" on Jan 12, 2023
It looks like the delete strategy is merge-on-read (more information on the strategies can be found here), and this requires PyIceberg to filter out the deleted rows when reading the data.
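To illustrate what "filter out the deleted rows when reading" means, here is a minimal pure-Python sketch of applying position deletes at read time. It is not PyIceberg's implementation: the `PositionDelete` class, the `read_with_deletes` helper, and the file path are hypothetical names made up for this example. The only part taken from the Iceberg table spec is that a position delete identifies a row by `file_path` and a 0-based `pos`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PositionDelete:
    """One entry of a position delete file (hypothetical in-memory model)."""
    file_path: str  # data file the delete applies to
    pos: int        # 0-based ordinal of the deleted row within that file


def read_with_deletes(file_path, rows, deletes):
    """Return the rows of one data file, skipping positions marked as deleted."""
    deleted_positions = {d.pos for d in deletes if d.file_path == file_path}
    return [row for pos, row in enumerate(rows) if pos not in deleted_positions]


# Hypothetical data file with three rows; the second row was deleted.
data_file = "s3://bucket/tbl/data/00000.parquet"
rows = [{"id": 1}, {"id": 2}, {"id": 3}]
deletes = [PositionDelete(data_file, 1)]

live_rows = read_with_deletes(data_file, rows, deletes)
print(live_rows)
```

A reader that skips this filtering step and returns the Parquet file contents directly would still see the row with `id` 2, which matches the behaviour reported in this issue.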
Fokko added 8 commits to Fokko/iceberg that referenced this issue (5 on Feb 8, 2023 and 3 on Feb 9, 2023)
Apache Iceberg version
1.1.0 (latest release)
Query engine
Other
Please describe the bug 🐞
I originally raised this issue in #6567. After deleting rows from a table (in my case with Athena), pyiceberg still returns Parquet files containing those records from a table scan. Shouldn't those files no longer be in the current manifest, and hence not be returned by the table scan?
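Whether the old files drop out of the current snapshot depends on how the engine performed the delete. The toy model below contrasts the two common strategies, assuming a snapshot can be reduced to a dict of data files plus a list of delete entries. All function names, file names, and the `.rewritten` suffix are hypothetical and only serve the illustration.

```python
def copy_on_write_delete(data_files, target, predicate):
    """Rewrite the affected file without the deleted rows.

    The old file is no longer part of the new snapshot.
    """
    new_files = {name: rows for name, rows in data_files.items() if name != target}
    new_files[target + ".rewritten"] = [r for r in data_files[target] if not predicate(r)]
    return new_files, []


def merge_on_read_delete(data_files, target, predicate):
    """Leave the data file untouched and record deleted positions separately."""
    positions = [pos for pos, r in enumerate(data_files[target]) if predicate(r)]
    return dict(data_files), [(target, pos) for pos in positions]


snapshot = {"a.parquet": [{"id": 1}, {"id": 2}], "b.parquet": [{"id": 3}]}


def is_deleted(row):
    # Stand-in for the DELETE statement's WHERE clause.
    return row["id"] == 2


cow_files, cow_deletes = copy_on_write_delete(snapshot, "a.parquet", is_deleted)
mor_files, mor_deletes = merge_on_read_delete(snapshot, "a.parquet", is_deleted)

print(sorted(cow_files), cow_deletes)
print(sorted(mor_files), mor_deletes)
```

In the merge-on-read case `a.parquet` is still listed after the delete, so a scan that only plans data files will return it with the deleted row intact unless the delete entries are applied as well.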