Alamb/suggestions #1

alamb · 2026-01-07T20:10:05Z

These are some code suggestions I had while reviewing the following PR in DataFusion

Make default ListingFilesCache table scoped apache/datafusion#19616

Merging this PR will update the PR in datafusion

I think it ok to merge apache#19616 without this PR and then merge it as a follow on PR in DataFusion

…e#19622) ## Which issue does this PR close? - Closes apache#19591 ## Rationale for this change Explained in issue itself ## What changes are included in this PR? - Removed coalesce batches rule - Deprecate `CoalesceBatchesExec` ## Are these changes tested? Yes ## Are there any user-facing changes? Yes, added a deprecation tag on `CoalesceBatchesExec`

… indexing (apache#19590) ## Which issue does this PR close?  ## Rationale for this change  This PR improves the performance of the `substring_index` function by optimizing delimiter search and substring extraction: - Single-byte fast path: introduces a specialized byte-based search for single-byte delimiters (e.g. `.`, `,`), avoiding UTF-8 pattern matching overhead. - Efficient index discovery: replaces the split-and-sum-length approach with direct index location using `match_indices` / `rmatch_indices`. ## What changes are included in this PR?  - Added a fast path for `delimiter.len() == 1` using byte-based search. - Refactored the general path to use `match_indices` and `rmatch_indices` for more efficient positioning. ### Benchmarks - Single-byte delimiter benchmarks show ~2–3× speedup across batch sizes. - Multi-byte delimiters see a consistent ~10–15% improvement. ``` group main_substrindex perf_substrindex ----- ---------------- ---------------- substr_index/substr_index_10000_long_delimiter 1.12 548.2±18.39µs ? ?/sec 1.00 488.4±15.62µs ? ?/sec substr_index/substr_index_10000_single_delimiter 2.14 543.4±15.12µs ? ?/sec 1.00 254.0±7.80µs ? ?/sec substr_index/substr_index_1000_long_delimiter 1.12 43.1±1.63µs ? ?/sec 1.00 38.6±2.03µs ? ?/sec substr_index/substr_index_1000_single_delimiter 3.51 46.4±2.21µs ? ?/sec 1.00 13.2±0.99µs ? ?/sec substr_index/substr_index_100_long_delimiter 1.01 3.7±0.18µs ? ?/sec 1.00 3.7±0.20µs ? ?/sec substr_index/substr_index_100_single_delimiter 2.15 3.6±0.14µs ? ?/sec 1.00 1675.9±79.15ns ? ?/sec ``` ## Are these changes tested? - Yes, Existing unit tests pass. - New benchmarks added to verify performance improvement.  ## Are there any user-facing changes? No.

## Which issue does this PR close? - Closes apache#19623 ## Rationale for this change Explained in issue ## What changes are included in this PR? Implement baseline metrics for `AsyncFuncExec` operator ## Are these changes tested? Have added a test with PR ## Are there any user-facing changes? There are user facing changes, now `explain analyze` will show metrics of `AsyncFuncExec` too, do we need to update any documentation regarding this? Asking cause I couldn't find anything on a quick search. --------- Co-authored-by: Yongting You <[email protected]>

## Which issue does this PR close?  - Part of apache#14763. ## Rationale for this change  ## What changes are included in this PR? Replace `TypeSignature::Exact` patterns with cleaner APIs: - `isnan/iszero`: `Signature::coercible` with `TypeSignatureClass::Float` - `nanvl`: `Signature::uniform(2, [Float32, Float64])`  ## Are these changes tested? Yes  ## Are there any user-facing changes?   --------- Co-authored-by: Jeffrey Vo <[email protected]>

…ructors (apache#19668) ### Which issue does this PR close? Closes apache#19664 --- ### Rationale for this change After apache#18880, `BatchPartitioner::try_new` gained additional parameters that are only relevant for round-robin repartitioning. This made the constructor API confusing, as hash repartitioning received parameters it does not use. Splitting the constructor improves clarity and avoids passing round-robin–specific parameters to hash partitioning. --- ### What changes are included in this PR? - Introduce `BatchPartitioner::try_new_hash` - Introduce `BatchPartitioner::try_new_round_robin` - Refactor callers to use the specialized constructors - Retain `BatchPartitioner::try_new` as a delegator for backward compatibility This is a pure refactor; behavior is unchanged. --- ### Are these changes tested? Yes. Existing tests cover this code path. All builds pass locally. --- ### Are there any user-facing changes? No. This change is internal only and does not affect user-facing behavior or APIs. --------- Co-authored-by: Your Name <[email protected]>

…ng decoding (apache#19545) ## Which issue does this PR close? * Closes apache#18560. ## Rationale for this change DataFusion’s Parquet row-level filter pushdown previously rejected all nested Arrow types (lists/structs), which prevented common and performance-sensitive filters on list columns (for example `array_has`, `array_has_all`, `array_has_any`) from being evaluated during Parquet decoding. Enabling safe pushdown for a small, well-defined set of list-aware predicates allows Parquet decoding to apply these filters earlier, reducing materialization work and improving scan performance, while still keeping unsupported nested projections (notably structs) evaluated after batches are materialized. ## What changes are included in this PR? * Allow a registry of list-aware predicates to be considered pushdown-compatible: * `array_has`, `array_has_all`, `array_has_any` * `IS NULL` / `IS NOT NULL` * Introduce `supported_predicates` module to detect whether an expression tree contains supported list predicates. * Update Parquet filter candidate selection to: * Accept list columns only when the predicate semantics are supported. * Continue rejecting struct columns (and other unsupported nested types). * Switch Parquet projection mask construction from root indices to **leaf indices** (`ProjectionMask::leaves`) so nested list filters project the correct leaf columns for decoding-time evaluation. * Expand root column indices to leaf indices for nested columns using the Parquet `SchemaDescriptor`. * Add unit tests verifying: * List columns are accepted for pushdown when used by supported predicates. * Struct columns (and mixed struct+primitive predicates) prevent pushdown. * `array_has`, `array_has_all`, `array_has_any` actually filter rows during decoding using a temp Parquet file. * Add sqllogictest coverage proving both correctness and plan behavior: * Queries return expected results. * `EXPLAIN` shows predicates pushed into `DataSourceExec` for Parquet. ## Are these changes tested? Yes. * Rust unit tests in `datafusion/datasource-parquet/src/row_filter.rs`: * Validate pushdown eligibility for list vs struct predicates. * Create a temp Parquet file and confirm list predicates prune/match the expected rows via Parquet decoding row filtering. * SQL logic tests in `datafusion/sqllogictest/test_files/parquet_filter_pushdown.slt`: * Add end-to-end coverage for `array_has`, `array_has_all`, `array_has_any` and combinations (OR / AND with other predicates). * Confirm pushdown appears in the physical plan (`DataSourceExec ... predicate=...`). ## Are there any user-facing changes? Yes. * Parquet filter pushdown now supports list columns for the following predicates: * `array_has`, `array_has_all`, `array_has_any` * `IS NULL`, `IS NOT NULL` This can improve query performance for workloads that filter on array/list columns. No breaking changes are introduced; unsupported nested types (for example structs) continue to be evaluated after decoding. ## LLM-generated code disclosure This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested. --------- Co-authored-by: Andrew Lamb <[email protected]> Co-authored-by: Qi Zhu <[email protected]>

…001` (apache#19666) ## Which issue does this PR close? - Part of apache#19656 ## Rationale for this change rust_decimal is a one person crate and is released somewhat infrequently * https://github.com/paupino/rust-decimal * https://crates.io/crates/rust_decimal It also uses a non trivial number of dependencies, including https://crates.io/crates/rkyv, some sort of zero copy deserialization framework that was recently subject to a RUSTSEC security advisory, see apache#19656 / apache#19657 Since `rust_decimal` is only used for sqllogictests to parse the results from postgres, we can pretty easily remove the dependency on `rust_decimal` and inline the very small amount functionality we need for sqllogictests This will both decrease the build time and our dependency trail. ## What changes are included in this PR? Removes the `rust_decimal` dependency from DataFusion and inlines the smallest required subset of decimal functionality we need for sqllogictests (which turns out to be pretty small) ## Are these changes tested? Yes by CI ## Are there any user-facing changes? No, this is all internal testing infrastructure

) (apache#19319) ## Which issue does this PR close?  - Closes #apache#19141. ## Rationale for this change  ## What changes are included in this PR?  ## Are these changes tested?  ## Are there any user-facing changes?   --------- Co-authored-by: Sergey Zhukov <[email protected]> Co-authored-by: Andrew Lamb <[email protected]>

… SimplifyInfo (apache#19505) In trying to fix apache#19418 I kept getting turned around about what was needed where. The `SimplifyInfo` trait made it extra hard to understand. I ended up realizing that the main reason for the trait to exist was tests. Removing the trait and adding a builder style API to `SimplifyContext` made it IMO more ergonomic for tests and other call sites, easier to track the code (no trait opaqueness) and clearer what simplification capabilities are available in each site. This got rid of e.g. some places where we were calling `ExecutionProps::new()` just to pass that into `SimplifyContext` which in turn would hand out references to the default query time, a default `ContigOptions`, etc; or in `datafusion/core/src/execution/session_state.rs` where we did `let dummy_schema = DFSchema::empty()`. This let me solve several problems: - Can store optimized logical plans for prepared statements - Users can optionally run an optimizer pass on logical plans without evaluating time functions Compared to apache#19426 this avoids adding a config option and is actually less lines of code (negative diff). Fixes apache#19418, closes apache#19426 (replaces it). --------- Co-authored-by: Claude Opus 4.5 <[email protected]>

## Which issue does this PR close?  - Part of apache#19025. ## Rationale for this change  ## What changes are included in this PR? - Added Time64/Time32 signatures to date_trunc - Added time truncation logic (hour, minute, second, millisecond, microsecond) - Error for invalid granularities (day, week, month, quarter, year)  ## Are these changes tested? Yes  ## Are there any user-facing changes?   --------- Co-authored-by: Jeffrey Vo <[email protected]>

## Which issue does this PR close?  - Closes #. ## Rationale for this change  `ParquetOpener::open()` is a critical function for parquet planning, it's the entry point for many major steps like row-group/file pruning. It has almost 400 lines of code now, this PR adds some markers to the code blocks/important steps, to make this function easier to navigate. (though I may have overlooked some critical steps) Ideally, we should break these blocks into utilities. I tried extracting some of them with AI, but the resulting utilities still have unclear semantics, with many input arguments and output items. Overall, the complexity doesn’t seem reduced after the change. I think it’s possible to factor them into helper functions with clear semantics, but that likely requires someone who understands the implementation details very well. ## What changes are included in this PR?  ## Are these changes tested?  ## Are there any user-facing changes?

## Which issue does this PR close? Closes apache#19347 ## Rationale for this change Native decimal log() (added in apache#17023) only supports integer bases. Non-integer bases like log(2.5, x) error out, which is a regression from the previous float-based implementation. ## What changes are included in this PR? Changes : - Fallback to f64 computation when base is non-integer - Integer bases (2, 10, etc.) still use efficient ilog() algorithm Refactoring: - Unified log_decimal32, log_decimal64, log_decimal128 into single generic log_decimal<T> using num_traits::ToPrimitive. - Used ToPrimitive::to_f64() and to_u128() - Invalid bases (≤1, NaN, Inf) now return NaN instead of erroring - matches f64::log behavior Large Decimal256 values that don't fit in i128 now work via f64 fallback. ## Are these changes tested? Yes: - All existing log_decimal* unit tests pass - Updated test_log_decimal128_invalid_base - expects NaN instead of error - Updated test_log_decimal256_large - now succeeds via fallback ## Are there any user-facing changes? Yes: ```sql -- Previously errored, now works SELECT log(2.5, 100::decimal(38,0)); -- Invalid base now returns NaN instead of error (consistent with float behavior) SELECT log(-2, 64::decimal(38,0)); -- Returns NaN --------- Co-authored-by: Jeffrey Vo <[email protected]>

…che#19369) ## Which issue does this PR close? Closes apache#19348 ## Rationale for this change Previously, pow() on decimal types would error for negative exponents and non-integer exponents with messages like: - Arrow error: Arithmetic overflow: Unsupported exp value: -5 - Compute error: Cannot use non-integer exp - This was a regression from when decimals were cast to float before pow(). The efficient integer-based algorithm for computing power on scaled integers cannot handle these cases. ## What changes are included in this PR? - Modified pow_decimal_int to fallback to pow_decimal_float for negative exponents - Modified pow_decimal_float to use an efficient integer path for non-negative integer exponents, otherwise fallback to f64 computation Added pow_decimal_float_fallback function that: - Converts the decimal to f64 - Computes powf(exp) - Converts back to the original decimal type with proper scaling - Added decimal_from_i128 helper to convert i128 results back to generic decimal types (needed for Decimal256 support) - Updated sqllogictests to expect success for negative/non-integer exponents ## Are these changes tested? Yes: Unit tests for pow_decimal_float_fallback covering negative exponents, fractional exponents, cube roots Updated SQL logic tests in decimal.slt ## Are there any user-facing changes? Yes. The following queries now work instead of returning errors: ```sql -- Negative exponent SELECT power(4::decimal(38, 5), -1); -- Returns 0.25 -- Non-integer exponent SELECT power(2.5, 4.2); -- Returns 46.9 -- Square root via power SELECT power(4::decimal, 0.5); -- Returns 2

## Which issue does this PR close?  - Builds on apache#19388 - Closes apache#19573 ## Rationale for this change  This PR explores one way to make `ListFilesCache` table scoped. A session level cache is still used, but the cache key is made a "table-scoped" path, for which a new struct ``` pub struct TableScopedPath(pub Option<TableReference>, pub Path); ``` is defined. `TableReference` comes from `CreateExternalTable` passed to `ListingTableFactory::create`. Additionally, when a table is dropped, all entries related to a table is dropped by modifying `SessionContext::find_and_deregister` method. Testing (change on adding `list_files_cache()` for cli is included for easier testing). - Testing cache reuse on a single table. ``` > \object_store_profiling summary ObjectStore Profile mode set to Summary > create external table test stored as parquet location 's3://overturemaps-us-west-2/release/2025-12-17.0/theme=base/'; 0 row(s) fetched. Elapsed 14.878 seconds. Object Store Profiling Instrumented Object Store: instrument_mode: Summary, inner: AmazonS3(overturemaps-us-west-2) Summaries: +-----------+----------+-----------+-----------+-------------+-------------+-------+ | Operation | Metric | min | max | avg | sum | count | +-----------+----------+-----------+-----------+-------------+-------------+-------+ | Get | duration | 0.030597s | 0.209235s | 0.082396s | 36.254189s | 440 | | Get | size | 204782 B | 857230 B | 497304.88 B | 218814144 B | 440 | | List | duration | 0.192037s | 0.192037s | 0.192037s | 0.192037s | 1 | | List | size | | | | | 1 | +-----------+----------+-----------+-----------+-------------+-------------+-------+ > select table, path, unnest(metadata_list) from list_files_cache() limit 1; +-------+---------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | table | path | UNNEST(list_files_cache().metadata_list) | +-------+---------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | test | release/2025-12-17.0/theme=base | {file_path: release/2025-12-17.0/theme=base/type=bathymetry/part-00000-dd0f2f50-b436-4710-996f-f1b06181a3a1-c000.zstd.parquet, file_modified: 2025-12-17T22:32:50, file_size_bytes: 40280159, e_tag: "15090401f8f936c3f83bb498cb99a41d-3", version: NULL} | +-------+---------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row(s) fetched. Elapsed 0.058 seconds. Object Store Profiling > select count(*) from test where type = 'infrastructure'; +-----------+ | count(*) | +-----------+ | 142969564 | +-----------+ 1 row(s) fetched. Elapsed 0.028 seconds. Object Store Profiling ``` - Test separate cache entries are created for two tables with same path ``` > create external table test2 stored as parquet location 's3://overturemaps-us-west-2/release/2025-12-17.0/theme=base/'; 0 row(s) fetched. Elapsed 14.798 seconds. Object Store Profiling Instrumented Object Store: instrument_mode: Summary, inner: AmazonS3(overturemaps-us-west-2) Summaries: +-----------+----------+-----------+-----------+-------------+-------------+-------+ | Operation | Metric | min | max | avg | sum | count | +-----------+----------+-----------+-----------+-------------+-------------+-------+ | Get | duration | 0.030238s | 0.350465s | 0.073670s | 32.414654s | 440 | | Get | size | 204782 B | 857230 B | 497304.88 B | 218814144 B | 440 | | List | duration | 0.133334s | 0.133334s | 0.133334s | 0.133334s | 1 | | List | size | | | | | 1 | +-----------+----------+-----------+-----------+-------------+-------------+-------+ > select count(*) from test2 where type = 'bathymetry'; +----------+ | count(*) | +----------+ | 59963 | +----------+ 1 row(s) fetched. Elapsed 0.009 seconds. Object Store Profiling > select table, path from list_files_cache(); +-------+---------------------------------+ | table | path | +-------+---------------------------------+ | test | release/2025-12-17.0/theme=base | | test2 | release/2025-12-17.0/theme=base | +-------+---------------------------------+ 2 row(s) fetched. Elapsed 0.004 seconds. ``` - Test cache associated with a table is dropped when table is dropped, and the other table with same path is unaffected. ``` > drop table test; 0 row(s) fetched. Elapsed 0.015 seconds. Object Store Profiling > select table, path from list_files_cache(); +-------+---------------------------------+ | table | path | +-------+---------------------------------+ | test2 | release/2025-12-17.0/theme=base | +-------+---------------------------------+ 1 row(s) fetched. Elapsed 0.005 seconds. Object Store Profiling > select count(*) from list_files_cache() where table = 'test'; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row(s) fetched. Elapsed 0.014 seconds. > select count(*) from test2 where type = 'infrastructure'; +-----------+ | count(*) | +-----------+ | 142969564 | +-----------+ 1 row(s) fetched. Elapsed 0.013 seconds. Object Store Profiling ``` - Test that dropping a view does not remove cache ``` > create view test2_view as (select * from test2 where type = 'infrastructure'); 0 row(s) fetched. Elapsed 0.103 seconds. Object Store Profiling > select count(*) from test2_view; +-----------+ | count(*) | +-----------+ | 142969564 | +-----------+ 1 row(s) fetched. Elapsed 0.094 seconds. Object Store Profiling > drop view test2_view; 0 row(s) fetched. Elapsed 0.002 seconds. Object Store Profiling > select table, path from list_files_cache(); +-------+---------------------------------+ | table | path | +-------+---------------------------------+ | test2 | release/2025-12-17.0/theme=base | +-------+---------------------------------+ 1 row(s) fetched. Elapsed 0.007 seconds. ``` ## What changes are included in this PR?  ## Are these changes tested?  ## Are there any user-facing changes?

alamb · 2026-01-07T20:10:28Z

datafusion-cli/src/functions.rs


            for (path, entry) in list_files_cache.list_entries() {
-                table_arr.push(path.table.map_or("NULL".to_string(), |t| t.to_string()));
+                table_arr.push(path.table.map(|t| t.to_string()));


it might be nicer to us actual nulls rather than a "NULL" string here

alamb · 2026-01-07T20:10:47Z

datafusion/core/src/datasource/listing_table_factory.rs

-                    table_path = table_path
-                        .with_glob(glob.as_ref())?
-                        .with_table_ref(cmd.name.clone());
+                    table_path = table_path.with_glob(glob.as_ref())?;


I updated with_glob to preserve the other fields too

alamb · 2026-01-07T20:11:18Z

datafusion/datasource/src/url.rs

-        Self::try_new(self.url, Some(glob))
+    pub fn with_glob(mut self, glob: &str) -> Result<Self> {
+        self.glob =
+            Some(Pattern::new(glob).map_err(|e| DataFusionError::External(Box::new(e)))?);


self::try_new doesn't do anything to the url, so there is no need to reparse the url

alamb · 2026-01-08T14:47:19Z

Superceded by apache#19703

feniljain and others added 7 commits January 6, 2026 20:55

github-actions bot added execution datasource core labels Jan 7, 2026

alamb mentioned this pull request Jan 7, 2026

Make default ListingFilesCache table scoped apache/datafusion#19616

Merged

cj-zhukov and others added 7 commits January 8, 2026 00:23

alamb commented Jan 8, 2026

View reviewed changes

alamb added 4 commits January 8, 2026 09:35

Add doc comments to TableScopedPath

aba7456

more comments

7e1ca8f

Use nullable strings rather than NULL

8ac2922

Improve ListingTableUrl

8a9bf43

alamb force-pushed the alamb/suggestions branch from c5d173e to 8a9bf43 Compare January 8, 2026 14:35

github-actions bot added documentation Improvements or additions to documentation development-process logical-expr physical-plan catalog common labels Jan 8, 2026

github-actions bot added functions optimizer proto spark sqllogictest labels Jan 8, 2026

alamb closed this Jan 8, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Alamb/suggestions #1

Alamb/suggestions #1

Uh oh!

alamb commented Jan 7, 2026

Uh oh!

alamb Jan 7, 2026

Uh oh!

alamb Jan 7, 2026

Uh oh!

alamb Jan 7, 2026

Uh oh!

alamb commented Jan 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

11 participants

Alamb/suggestions #1

Alamb/suggestions #1

Uh oh!

Conversation

alamb commented Jan 7, 2026

Uh oh!

alamb Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

alamb Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

alamb Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

alamb commented Jan 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

11 participants