Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: avoid arrow IPC & unneeded steps when flushing delta #247

Merged
merged 8 commits into from
Dec 4, 2024

Conversation

fanyang01
Copy link
Collaborator

@fanyang01 fanyang01 commented Dec 3, 2024

This PR updates the approach for passing the delta buffer from Go to DuckDB. Previously, we used the scan_arrow_ipc function provided by DuckDB's Arrow extension (https://github.com/duckdb/arrow), which required serializing Arrow arrays into Arrow IPC buffers.

We now switch to go-duckdb's RegisterView API (introduced in marcboeker/go-duckdb#283), enabling zero-copy ingestion for improved efficiency.

However, I observed that this API does not work as expected when only a subset of columns is needed on the DuckDB side. This issue might be related to a bug in DuckDB's projection pushdown implementation for Arrow data. As a workaround, I have implemented column pruning in advance to exclude unnecessary columns.

@fanyang01 fanyang01 requested a review from Copilot December 3, 2024 08:52

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 5 out of 5 changed files in this pull request and generated no suggestions.

Comments skipped due to low confidence (1)

binlogreplication/binlog_replica_applier.go:1248

  • The new behavior introduced by the 'conn' parameter should be covered by tests to ensure it works as expected.
conn, err := adapter.GetConn(ctx)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 6 out of 8 changed files in this pull request and generated no suggestions.

Files not reviewed (2)
  • binlogreplication/writer.go: Evaluated as low risk
  • replica/replication.go: Evaluated as low risk
@fanyang01 fanyang01 merged commit d93ac9a into main Dec 4, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant