chore: datafusion 52.1.0 #1174

Merged
linhr merged 29 commits into main from df-52 on Jan 27, 2026

lonless9 (Contributor) commented Jan 5, 2026

github-actions Bot commented Jan 5, 2026

Gold Data Report

Notes
  1. The tables below show the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) in gold data input processing.
  2. A positive input is a valid test case, while a negative input is a test case that is expected to fail.

Commit Information

Commit  Revision  Branch
After   f52e8f4   refs/pull/1174/merge
Before  f23ce29   main

Summary

Commit  TP    TN   FP  FN   Total
After   1918  201  45  456  2620
Before  1916  201  45  458  2620
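
The summary rows can be checked with a few lines of Rust. This is only a sketch: the `Counts` type and its field names are made up for illustration, and the per-field comments reflect one reading of the notes above (a positive input is a valid test case, a negative input is expected to fail). It is not taken from the report generator.

```rust
/// Outcome counts for one gold data run (hypothetical type; fields mirror the report columns).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Counts {
    tp: u32,  // valid input that was processed successfully
    tn: u32,  // invalid input that failed as expected
    fp: u32,  // invalid input that was unexpectedly accepted
    fn_: u32, // valid input that unexpectedly failed (`fn` is a keyword)
}

impl Counts {
    /// Every input falls into exactly one of the four buckets.
    fn total(&self) -> u32 {
        self.tp + self.tn + self.fp + self.fn_
    }
}

// Values copied from the Summary table.
const AFTER: Counts = Counts { tp: 1918, tn: 201, fp: 45, fn_: 456 };
const BEFORE: Counts = Counts { tp: 1916, tn: 201, fp: 45, fn_: 458 };

/// Net number of inputs that moved from FN to TP between the two commits.
fn newly_passing(before: Counts, after: Counts) -> i64 {
    i64::from(after.tp) - i64::from(before.tp)
}
```

Both rows sum to 2620, and the net change is two inputs moving from FN to TP, which matches the only row that differs in the details table (function/datetime.json).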

Details

Gold Data Metrics
Group File Commit TP TN FP FN Total
spark data_type.json After 45 12 1 6 64
Before 45 12 1 6 64
expression/case.json After 5 0 0 0 5
Before 5 0 0 0 5
expression/cast.json After 4 0 0 0 4
Before 4 0 0 0 4
expression/current.json After 3 0 0 0 3
Before 3 0 0 0 3
expression/date.json After 4 0 1 0 5
Before 4 0 1 0 5
expression/interval.json After 346 4 1 0 351
Before 346 4 1 0 351
expression/large.json After 2 0 0 0 2
Before 2 0 0 0 2
expression/like.json After 29 10 0 0 39
Before 29 10 0 0 39
expression/misc.json After 109 6 0 3 118
Before 109 6 0 3 118
expression/numeric.json After 31 6 1 0 38
Before 31 6 1 0 38
expression/string.json After 18 1 0 0 19
Before 18 1 0 0 19
expression/timestamp.json After 7 0 3 0 10
Before 7 0 3 0 10
expression/window.json After 73 0 1 0 74
Before 73 0 1 0 74
function/agg.json After 142 0 0 44 186
Before 142 0 0 44 186
function/array.json After 44 0 0 0 44
Before 44 0 0 0 44
function/bitwise.json After 15 0 0 0 15
Before 15 0 0 0 15
function/collection.json After 12 0 0 0 12
Before 12 0 0 0 12
function/conditional.json After 15 0 0 0 15
Before 15 0 0 0 15
function/conversion.json After 2 0 0 0 2
Before 2 0 0 0 2
function/csv.json After 2 0 0 3 5
Before 2 0 0 3 5
function/datetime.json After 121 0 0 59 180
Before 119 0 0 61 180
function/generator.json After 7 0 0 6 13
Before 7 0 0 6 13
function/hash.json After 5 0 0 2 7
Before 5 0 0 2 7
function/json.json After 16 0 0 6 22
Before 16 0 0 6 22
function/lambda.json After 1 0 0 30 31
Before 1 0 0 30 31
function/map.json After 11 0 0 0 11
Before 11 0 0 0 11
function/math.json After 123 0 0 1 124
Before 123 0 0 1 124
function/misc.json After 31 0 0 43 74
Before 31 0 0 43 74
function/predicate.json After 70 0 0 9 79
Before 70 0 0 9 79
function/st.json After 0 0 0 7 7
Before 0 0 0 7 7
function/string.json After 159 0 0 46 205
Before 159 0 0 46 205
function/struct.json After 2 0 0 0 2
Before 2 0 0 0 2
function/url.json After 9 0 0 1 10
Before 9 0 0 1 10
function/variant.json After 0 0 0 28 28
Before 0 0 0 28 28
function/window.json After 6 0 0 3 9
Before 6 0 0 3 9
function/xml.json After 0 0 0 17 17
Before 0 0 0 17 17
plan/ddl_alter_table.json After 49 14 3 11 77
Before 49 14 3 11 77
plan/ddl_alter_view.json After 5 1 0 0 6
Before 5 1 0 0 6
plan/ddl_analyze_table.json After 17 6 0 0 23
Before 17 6 0 0 23
plan/ddl_cache.json After 4 0 1 0 5
Before 4 0 1 0 5
plan/ddl_create_index.json After 0 0 0 3 3
Before 0 0 0 3 3
plan/ddl_create_table.json After 27 30 8 40 105
Before 27 30 8 40 105
plan/ddl_delete_from.json After 2 1 0 0 3
Before 2 1 0 0 3
plan/ddl_describe.json After 4 0 0 0 4
Before 4 0 0 0 4
plan/ddl_drop_index.json After 0 0 0 2 2
Before 0 0 0 2 2
plan/ddl_drop_view.json After 5 0 0 0 5
Before 5 0 0 0 5
plan/ddl_insert_into.json After 16 1 1 0 18
Before 16 1 1 0 18
plan/ddl_insert_overwrite.json After 9 0 2 0 11
Before 9 0 2 0 11
plan/ddl_load_data.json After 4 0 0 0 4
Before 4 0 0 0 4
plan/ddl_merge_into.json After 8 4 3 0 15
Before 8 4 3 0 15
plan/ddl_misc.json After 9 0 0 1 10
Before 9 0 0 1 10
plan/ddl_replace_table.json After 23 14 7 40 84
Before 23 14 7 40 84
plan/ddl_select.json After 1 0 0 0 1
Before 1 0 0 0 1
plan/ddl_show_views.json After 7 0 0 0 7
Before 7 0 0 0 7
plan/ddl_uncache.json After 2 0 0 0 2
Before 2 0 0 0 2
plan/ddl_update.json After 2 1 0 0 3
Before 2 1 0 0 3
plan/error_alter_table.json After 0 2 0 0 2
Before 0 2 0 0 2
plan/error_analyze_table.json After 0 1 0 0 1
Before 0 1 0 0 1
plan/error_create_table.json After 0 6 0 0 6
Before 0 6 0 0 6
plan/error_describe.json After 0 1 0 0 1
Before 0 1 0 0 1
plan/error_join.json After 0 2 0 0 2
Before 0 2 0 0 2
plan/error_load_data.json After 0 1 0 0 1
Before 0 1 0 0 1
plan/error_misc.json After 0 14 0 0 14
Before 0 14 0 0 14
plan/error_order_by.json After 1 4 0 0 5
Before 1 4 0 0 5
plan/error_select.json After 0 15 0 0 15
Before 0 15 0 0 15
plan/error_with.json After 0 1 0 0 1
Before 0 1 0 0 1
plan/plan_alter_view.json After 0 2 0 0 2
Before 0 2 0 0 2
plan/plan_create_view.json After 0 2 0 0 2
Before 0 2 0 0 2
plan/plan_explain.json After 0 1 1 0 2
Before 0 1 1 0 2
plan/plan_group_by.json After 9 1 0 1 11
Before 9 1 0 1 11
plan/plan_hint.json After 25 0 3 0 28
Before 25 0 3 0 28
plan/plan_insert_into.json After 3 0 0 0 3
Before 3 0 0 0 3
plan/plan_insert_overwrite.json After 2 0 0 0 2
Before 2 0 0 0 2
plan/plan_join.json After 53 2 1 6 62
Before 53 2 1 6 62
plan/plan_misc.json After 15 4 0 10 29
Before 15 4 0 10 29
plan/plan_order_by.json After 15 5 1 10 31
Before 15 5 1 10 31
plan/plan_select.json After 83 14 5 18 120
Before 83 14 5 18 120
plan/plan_set_operation.json After 17 0 0 0 17
Before 17 0 0 0 17
plan/plan_with.json After 6 0 1 0 7
Before 6 0 1 0 7
plan/unpivot_join.json After 4 0 0 0 4
Before 4 0 0 0 4
plan/unpivot_select.json After 14 6 0 0 20
Before 14 6 0 0 20
table_schema.json After 8 6 0 0 14
Before 8 6 0 0 14

codecov Bot commented Jan 5, 2026

Codecov Report

❌ Patch coverage is 55.68544% with 417 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/sail-execution/src/codec.rs 0.00% 99 Missing ⚠️
.../sail-delta-lake/src/physical_plan/expr_adapter.rs 62.80% 90 Missing ⚠️
crates/sail-cache/src/file_listing_cache.rs 24.52% 80 Missing ⚠️
...ates/sail-data-source/src/formats/binary/reader.rs 66.66% 22 Missing ⚠️
crates/sail-cache/src/file_statistics_cache.rs 0.00% 20 Missing ⚠️
...es/sail-telemetry/src/execution/metrics/testing.rs 0.00% 16 Missing ⚠️
crates/sail-physical-plan/src/streaming/filter.rs 84.14% 13 Missing ⚠️
crates/sail-iceberg/src/datasource/expr_adapter.rs 47.36% 10 Missing ⚠️
crates/sail-logical-plan/src/streaming/filter.rs 74.28% 9 Missing ⚠️
crates/sail-data-source/src/formats/text/source.rs 82.22% 8 Missing ⚠️
... and 17 more
@@            Coverage Diff             @@
##             main    #1174      +/-   ##
==========================================
- Coverage   72.76%   71.61%   -1.16%     
==========================================
  Files         796      798       +2     
  Lines       91703    93457    +1754     
==========================================
+ Hits        66732    66925     +193     
- Misses      24971    26532    +1561     
Flag Coverage Δ *Carryforward flag
ibis-tests 19.88% <ø> (+0.07%) ⬆️ Carriedforward from d4583fe
python-unit-tests 53.62% <56.30%> (-0.10%) ⬇️
rust-slow-tests 37.81% <ø> (+0.08%) ⬆️ Carriedforward from d4583fe
rust-unit-tests 33.59% <12.53%> (-0.53%) ⬇️
spark-tests 34.30% <ø> (+0.07%) ⬆️ Carriedforward from d4583fe

*This pull request uses carry forward flags.

Files with missing lines Coverage Δ
crates/sail-cache/src/file_metadata_cache.rs 57.97% <100.00%> (ø)
...sail-data-source/src/formats/binary/file_format.rs 42.85% <100.00%> (-9.65%) ⬇️
crates/sail-data-source/src/formats/rate/reader.rs 82.96% <ø> (ø)
...s/sail-data-source/src/formats/text/file_format.rs 52.30% <100.00%> (-0.33%) ⬇️
crates/sail-data-source/src/formats/text/writer.rs 62.13% <ø> (-3.18%) ⬇️
crates/sail-data-source/src/listing.rs 83.25% <100.00%> (-10.89%) ⬇️
crates/sail-delta-lake/src/physical_plan/mod.rs 88.88% <ø> (ø)
...l-delta-lake/src/physical_plan/planner/log_scan.rs 76.19% <100.00%> (-15.73%) ⬇️
...tion/src/scalar/datetime/spark_try_to_timestamp.rs 65.90% <100.00%> (-0.76%) ⬇️
...nction/src/scalar/datetime/spark_unix_timestamp.rs 66.00% <100.00%> (ø)
... and 31 more

... and 5 files with indirect coverage changes

Comment on lines +175 to +179
if let Some(lfc) = ctx.runtime_env().cache_manager.get_list_files_cache() {
    for table_path in &table_paths {
        let _ = lfc.remove(table_path.prefix());
    }
}
Contributor Author

Haven't delved into it deeply, but it seems related to this upstream issue:
apache/datafusion#19573
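
The eviction pattern in the snippet under review can be illustrated without DataFusion. The sketch below is an assumption-laden stand-in: `ListingCache`, `put`, `get`, and `invalidate` are hypothetical names, and the idea that eviction is there to avoid serving a stale file listing is an inference, not something the comment states.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a list-files cache: table prefix -> cached file names.
#[derive(Default)]
struct ListingCache {
    entries: HashMap<String, Vec<String>>,
}

impl ListingCache {
    fn put(&mut self, prefix: &str, files: Vec<String>) {
        self.entries.insert(prefix.to_string(), files);
    }

    fn get(&self, prefix: &str) -> Option<&Vec<String>> {
        self.entries.get(prefix)
    }

    /// Drop the cached listing for one prefix and return the evicted entry, if any.
    fn remove(&mut self, prefix: &str) -> Option<Vec<String>> {
        self.entries.remove(prefix)
    }
}

/// Evict every affected table prefix. The cache is optional, and a missing
/// entry is not an error, so the result of `remove` is deliberately ignored.
fn invalidate(cache: Option<&mut ListingCache>, table_prefixes: &[&str]) {
    if let Some(cache) = cache {
        for prefix in table_prefixes {
            let _ = cache.remove(prefix);
        }
    }
}
```

As in the reviewed code, only the listed prefixes are evicted, so listings for unrelated tables stay cached.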

@lonless9 lonless9 changed the title from "chore: datafusion 52" to "chore: datafusion 52.1.0" on Jan 26, 2026
Comment on lines +149 to +155
fn update_cache_limit(&self, _limit: usize) {
    // TODO: support dynamic update of cache limit
}

fn update_cache_ttl(&self, _ttl: Option<Duration>) {
    // TODO: support dynamic update of cache ttl
}
Contributor Author

Moka likely does not support this option, and refactoring here to use locks for protection doesn't seem like a good idea either. So I guess we can leave it as a TODO for now.
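
For context, the lock-based alternative mentioned in the comment might look roughly like the sketch below. Everything here is hypothetical: `LockedCache` is a toy FIFO-bounded key set built on the standard library, not Moka and not the cache in this PR. It is only meant to show why the approach is unattractive: once the limit is mutable, every access has to go through the same lock.

```rust
use std::collections::VecDeque;
use std::sync::RwLock;

/// State that must be guarded once the limit can change at runtime.
struct Inner {
    limit: usize,
    keys: VecDeque<String>,
}

/// Hypothetical FIFO-bounded cache of keys, protected by a single lock.
struct LockedCache {
    inner: RwLock<Inner>,
}

impl LockedCache {
    fn new(limit: usize) -> Self {
        Self {
            inner: RwLock::new(Inner { limit, keys: VecDeque::new() }),
        }
    }

    /// Every insert takes the write lock and evicts the oldest keys over the limit.
    fn insert(&self, key: &str) {
        let mut inner = self.inner.write().unwrap();
        inner.keys.push_back(key.to_string());
        while inner.keys.len() > inner.limit {
            inner.keys.pop_front();
        }
    }

    fn len(&self) -> usize {
        self.inner.read().unwrap().keys.len()
    }

    /// A dynamic limit works here, but only because all access is serialized.
    fn update_cache_limit(&self, limit: usize) {
        let mut inner = self.inner.write().unwrap();
        inner.limit = limit;
        while inner.keys.len() > inner.limit {
            inner.keys.pop_front();
        }
    }
}
```

Compared with the no-op `update_cache_limit` above, this trades a missing feature for lock contention on the hot path, which is consistent with leaving the TODO in place.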

rules.push(limit_push_past_windows());
rules.push(Arc::new(LimitPushdown::new()));
rules.push(Arc::new(ProjectionPushdown::new()));
rules.push(Arc::new(PushdownSort::new()));
Contributor Author

rules.push(Arc::new(OptimizeAggregateOrder::new()));
rules.push(Arc::new(ProjectionPushdown::new()));
rules.push(Arc::new(CoalesceBatches::new()));
rules.push(Arc::new(CoalesceAsyncExecInput::new()));
Contributor Author

Comment thread Cargo.toml
datafusion-functions = { version = "52.1.0" }
datafusion-functions-nested = { version = "52.1.0" }
datafusion-physical-expr = { version = "52.1.0" }
datafusion-session = { version = "52.1.0" }
Contributor Author

Later, I plan to change all session imports from datafusion-catalog to datafusion-session.

@lonless9 lonless9 requested a review from linhr January 26, 2026 11:32
@lonless9 lonless9 marked this pull request as ready for review January 26, 2026 11:32
linhr (Contributor) left a comment

Great work! 🚀

Comment thread crates/sail-telemetry/src/execution/metrics/default.rs
Comment thread python/pysail/tests/spark/test_write_table.py
Comment thread crates/sail-execution/src/plan/shuffle_write.rs
Comment on lines +11 to +13
/// Unlike a regular `Filter` node, this node is used in streaming plan rewriting
/// to avoid DataFusion optimizer rules (e.g. repartition insertion) that can make
/// bounded streaming queries unexpectedly slow.
Contributor

Interesting. I'll probably look into it deeper to understand what's going on, but I think this extension is good to have for now.
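
The idea in the doc comment, a separate node type that optimizer rules do not recognize, can be shown with a toy plan. This is a conceptual sketch only: the `Plan` enum and `insert_repartition` rule are invented for illustration and do not reflect DataFusion's actual plan types, extension-node traits, or rule implementations.

```rust
/// Toy logical plan; the variants are illustrative and not DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan,
    Filter(Box<Plan>),
    /// Extension-style node: filters like `Filter`, but the rule below does not match it.
    StreamingFilter(Box<Plan>),
    Repartition(Box<Plan>),
}

/// Hypothetical optimizer rule: repartition the input of every regular `Filter`.
/// Other nodes are left as they are, apart from recursing into their input.
fn insert_repartition(plan: Plan) -> Plan {
    match plan {
        Plan::Scan => Plan::Scan,
        Plan::Filter(input) => {
            let input = insert_repartition(*input);
            Plan::Filter(Box::new(Plan::Repartition(Box::new(input))))
        }
        Plan::StreamingFilter(input) => {
            Plan::StreamingFilter(Box::new(insert_repartition(*input)))
        }
        Plan::Repartition(input) => Plan::Repartition(Box::new(insert_repartition(*input))),
    }
}
```

Applied to `Filter(Scan)` the rule yields `Filter(Repartition(Scan))`, while `StreamingFilter(Scan)` comes back unchanged, which is the effect the streaming plan rewrite is after.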

Comment thread crates/sail-data-source/src/formats/text/options.rs
Comment thread crates/sail-cache/src/file_listing_cache.rs
Comment thread crates/sail-execution/src/codec.rs
Comment thread crates/sail-data-source/src/formats/binary/source.rs
@linhr linhr merged commit 0d9bfaa into main Jan 27, 2026
32 checks passed
@linhr linhr deleted the df-52 branch January 27, 2026 09:38
@lonless9
Contributor Author

Okay, I've handled some of the comments in #1255.
