fix: Resolve null literal in array_split_into_chunks via generic registration #16923

Open
allenshen13 wants to merge 2 commits into facebookincubator:main from allenshen13:fix-array-split-into-chunks-null-handling

Conversation

@allenshen13
Collaborator

@allenshen13 allenshen13 commented Mar 25, 2026

Fixes: prestodb/presto#27429

Summary

Sidecar overload resolution fails on array_split_into_chunks(null, 2):

Could not choose a best candidate operator. Explicit type casts must be added.
Candidates are:
  * native.default.array_split_into_chunks(array(double),integer):...
  * native.default.array_split_into_chunks(array(real),integer):...
  ... (12 total)

  • Collapse the 12 per-type registrations into a single
    registerArraySplitIntoChunksFunctions<Generic<T1>>(prefix). A generic
    registration lets the null literal resolve unambiguously and covers
    all element types.
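
The ambiguity can be illustrated with a small toy model. This is a sketch only, not the sidecar's actual resolver: `UNKNOWN`, `candidates`, `resolve`, and the matching rule are hypothetical, and the only assumption is that an untyped null literal is compatible with every array parameter type.

```python
# Toy model of overload resolution; NOT Presto's or Velox's real algorithm.
UNKNOWN = "unknown"  # assumed type of an untyped NULL literal

# First-parameter types of the 12 per-type registrations.
CONCRETE = [f"array({t})" for t in (
    "tinyint", "smallint", "integer", "bigint", "hugeint", "real", "double",
    "boolean", "timestamp", "date", "varchar", "varbinary")]
# First-parameter type of the single generic registration.
GENERIC = ["array(T1)"]


def candidates(first_params, arg_type):
    """Return every registered first-parameter type that accepts arg_type."""
    def accepts(param):
        if arg_type == UNKNOWN:
            return True  # assumption: NULL coerces to any parameter type
        if param == "array(T1)":
            return arg_type.startswith("array(")  # T1 binds to any element type
        return param == arg_type
    return [p for p in first_params if accepts(p)]


def resolve(first_params, arg_type):
    """Return the single matching signature, or fail when the choice is ambiguous."""
    matches = candidates(first_params, arg_type)
    if len(matches) != 1:
        raise LookupError(
            f"Could not choose a best candidate operator ({len(matches)} candidates)")
    return matches[0]
```

In this model, `resolve(CONCRETE, UNKNOWN)` raises because all twelve registrations match, while `resolve(GENERIC, UNKNOWN)` returns the one `array(T1)` candidate.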

Expected CI failures (not blockers)

  • Signature Changes: flags the 12 removed per-type signatures.
    This is the intended change — Generic<T1> replaces them. Please
    acknowledge on the PR.

@meta-cla meta-cla Bot added the CLA Signed label Mar 25, 2026
@netlify

netlify Bot commented Mar 25, 2026

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit cbf0f5a
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/69f128f00213220008609cf8

@allenshen13 allenshen13 changed the title from "Fix: handling literal null value to match Java" to "fix: handling literal null value to match Java" Mar 25, 2026
@allenshen13 allenshen13 changed the title from "fix: handling literal null value to match Java" to "fix: Handling literal null value to match Java" Mar 25, 2026
@allenshen13 allenshen13 force-pushed the fix-array-split-into-chunks-null-handling branch from 2f6ccf0 to 2c4f8d5 March 25, 2026 18:19
@allenshen13 allenshen13 requested a review from czentgr March 25, 2026 19:48
@allenshen13
Collaborator Author

Currently there are 3 CI failures:

  1. Signature Changes: the removal of the 12 type-specific signatures is flagged as a backwards-incompatible change, but the generic registration covers all of those types.

  2, 3. Biased Expression Fuzzer and Presto Bias Fuzzer: both fail for the same reason. The fuzzer generates nested array_split_into_chunks calls that produce intermediate empty arrays. Presto Java's implementation uses sequence(1, 0, sz) for empty arrays, which throws because stop < start with a positive step. Velox correctly returns an empty array. This is a Presto Java bug.
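
The empty-array failure described above can be reproduced with a toy model. This is a sketch under stated assumptions, not Presto Java's implementation: `sequence` and `split_into_chunks` are hypothetical Python stand-ins, and the chunk starts are assumed to come from `sequence(1, cardinality, size)`. The error text is the one reported in the CI log for this PR.

```python
def sequence(start, stop, step):
    """Toy stand-in for SQL sequence(): inclusive bounds, positive step only."""
    if step <= 0:
        raise ValueError("this sketch only models a positive step")
    if stop < start:
        # Message taken from the Presto error quoted in this PR's CI analysis.
        raise ValueError(
            "sequence stop value should be greater than or equal to start value")
    return list(range(start, stop + 1, step))


def split_into_chunks(values, size):
    """Chunk `values` by slicing at each 1-based start index from sequence()."""
    # Assumption: for an empty input this becomes sequence(1, 0, size).
    starts = sequence(1, len(values), size)
    return [values[s - 1:s - 1 + size] for s in starts]
```

For a non-empty input such as `[1, 2, 3, 4, 5]` with size 2, the starts are 1, 3, and 5. For an empty input the call to `sequence(1, 0, 2)` raises before any chunk is built.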

Collaborator

@jkhaliqi jkhaliqi left a comment

Thanks @allenshen13

@jkhaliqi jkhaliqi self-requested a review April 9, 2026 18:20
@jkhaliqi
Collaborator

jkhaliqi commented Apr 9, 2026

I see this closes prestodb/presto#27429, and there is a TODO in the code linked to that issue as well: https://github.com/prestodb/presto/blob/d488205f1fdedddee6a537e25b704799d3b5e5b8/presto-native-tests/src/test/java/com/facebook/presto/nativetests/TestSqlInvokedFunctions.java#L116. Let's complete the TODO and add e2e tests in Presto as well.

@github-actions

github-actions Bot commented Apr 16, 2026

CI Failure Analysis

Auto-generated by the CI Failure Analysis workflow. This comment is updated in place each time CI fails on a new commit, so it always reflects the latest run — re-pushing or re-running CI will refresh the analysis below. Last updated 2026-04-28 23:13:05 UTC from workflow run 25079066484.

❌ Signature Changes — SIGNATURE Failure View logs

Signature errors:

Incompatible changes in function signatures have been detected.

array_split_into_chunks has its function signature '(array(varbinary),integer) -> array(array(varbinary))' removed.
array_split_into_chunks has its function signature '(array(date),integer) -> array(array(date))' removed.
array_split_into_chunks has its function signature '(array(timestamp),integer) -> array(array(timestamp))' removed.
array_split_into_chunks has its function signature '(array(boolean),integer) -> array(array(boolean))' removed.
array_split_into_chunks has its function signature '(array(double),integer) -> array(array(double))' removed.
array_split_into_chunks has its function signature '(array(hugeint),integer) -> array(array(hugeint))' removed.
array_split_into_chunks has its function signature '(array(integer),integer) -> array(array(integer))' removed.
array_split_into_chunks has its function signature '(array(real),integer) -> array(array(real))' removed.
array_split_into_chunks has its function signature '(array(smallint),integer) -> array(array(smallint))' removed.
array_split_into_chunks has its function signature '(array(varchar),integer) -> array(array(varchar))' removed.
array_split_into_chunks has its function signature '(array(bigint),integer) -> array(array(bigint))' removed.
array_split_into_chunks has its function signature '(array(tinyint),integer) -> array(array(tinyint))' removed.

Changing or removing function signatures breaks backwards compatibility as some
users may rely on function signatures that no longer exist.

The PR replaces 12 type-specific registrations with a single Generic<T1> registration in ArraySplitIntoChunksRegistration.cpp. While this is functionally equivalent (the generic signature covers all types), the CI's signature-change checker flags this as a backwards-incompatible removal because the concrete type signatures no longer appear in the signature file.
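
Why a purely textual comparison reports removals even when a generic signature covers them can be sketched as follows. This is not the repository's actual signature checker; `signature` and `removed_signatures` are hypothetical helpers, and the signature strings follow the format shown in the log above.

```python
TYPES = ("varbinary", "date", "timestamp", "boolean", "double", "hugeint",
         "integer", "real", "smallint", "varchar", "bigint", "tinyint")


def signature(element_type):
    """Format a signature string the way the CI log prints it."""
    return f"(array({element_type}),integer) -> array(array({element_type}))"


def removed_signatures(baseline, current):
    """Return baseline signatures whose exact text is missing from current."""
    return sorted(set(baseline) - set(current))


# Baseline: the 12 per-type signatures. Current: only the generic one.
baseline = {signature(t) for t in TYPES}
current = {signature("T1")}
```

Here `removed_signatures(baseline, current)` lists all 12 concrete signatures, because a set difference on strings has no notion of a type variable standing in for them.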


❌ Expression Fuzzer with Presto SOT — FUZZER Failure View logs

Fuzzer: Presto Expression Fuzzer with Presto as source of truth
Failed instance: 3 (seed: 711308906)

Velox and reference DB results don't match

Expected 100, got 100
1 extra rows, 1 missing rows
1 of extra rows:
  [...,{"key":2139185135656720395,...}] | 69
1 of missing rows:
  [...,{"key":2139185150402320395,...}] | 69

File: velox/expression/tests/ExpressionVerifier.cpp:475
Function: verify

The results differ in a single map key value: Velox returned 2139185135656720395 while Presto returned 2139185150402320395. This is a data mismatch in a map(bigint, varbinary) column, suggesting a precision/value discrepancy in map key generation or comparison.


❌ Biased Expression Fuzzer — FUZZER Failure View logs

Fuzzer: Biased Expression Fuzzer (biased toward array_split_into_chunks)
Seed: 2488
Biased functions: array_split_into_chunks=20

ExpressionVerifier.cpp:121, Function:reduceToSelectedRows
Expression: cnt > 0 (0 vs. 0)

Reason: (0 vs. 0)
Function: reduceToSelectedRows
File: velox/expression/tests/ExpressionVerifier.cpp
Line: 121

The fuzzer, biased specifically toward array_split_into_chunks, crashed because reduceToSelectedRows found no valid rows after filtering (count = 0). This occurs after repeated Presto query failures with "sequence stop value should be greater than or equal to start value" errors, which are unrelated Presto-side sequence() function errors that reduce the available result rows to zero.


Correlation with PR changes:

  • Signature Changes: Directly caused by the PR. The PR removes 12 concrete type registrations of array_split_into_chunks (int8, int16, int32, int64, int128, float, double, bool, Timestamp, Date, Varchar, Varbinary) and replaces them with a single Generic<T1> registration in ArraySplitIntoChunksRegistration.cpp. The signature checker detects the removal of these concrete signatures as a backwards-incompatible change.
  • Biased Expression Fuzzer: Directly related to the PR. This job specifically fuzzes the array_split_into_chunks function that the PR modifies. The crash in reduceToSelectedRows (cnt > 0 assertion failure) likely occurs because the generic registration changes how the function handles certain type combinations during fuzzing.
  • Expression Fuzzer with Presto SOT: Likely unrelated to the PR. The mismatch is in map key values (a bigint precision issue), not in array_split_into_chunks. This appears to be a pre-existing flaky fuzzer issue.

Known issues:

  • No open issues track these specific fuzzer failures.
  • The array_split_into_chunks function was recently added (see issue #16483).
  • The most recent Fuzzer Jobs run on main (run 25038628763) passed successfully, suggesting the Expression Fuzzer with Presto SOT failure may be intermittent/flaky.

Reproduce locally:

For the Signature Changes failure, update the signature file to reflect the new generic signature after removing concrete type registrations.

For the Biased Expression Fuzzer:

./velox_expression_fuzzer_test \
    --seed 2488 \
    --only=array_split_into_chunks \
    --lazy_vector_generation_ratio 0.2 \
    --common_dictionary_wraps_generation_ratio=0.3 \
    --duration_sec 300 \
    --enable_variadic_signatures \
    --velox_fuzzer_enable_complex_types \
    --velox_fuzzer_enable_column_reuse \
    --velox_fuzzer_enable_expression_reuse \
    --max_expression_trees_per_step 2 \
    --retry_with_try \
    --batch_size=6 \
    --presto_url=http://127.0.0.1:8080

For the Expression Fuzzer with Presto SOT (instance 3):

./velox_expression_fuzzer_test \
    --seed 711308906 \
    --enable_variadic_signatures \
    --velox_fuzzer_enable_complex_types \
    --lazy_vector_generation_ratio 0.2 \
    --common_dictionary_wraps_generation_ratio=0.3 \
    --velox_fuzzer_enable_column_reuse \
    --velox_fuzzer_enable_expression_reuse \
    --enable_dereference \
    --duration_sec 300 \
    --special_forms="cast,coalesce,if" \
    --velox_fuzzer_max_level_of_nesting=1 \
    --presto_url=http://127.0.0.1:8080

Recommended fix:

  1. Signature Changes: Update the signature file to include the new generic signature (array(T1),integer) -> array(array(T1)) and remove the 12 concrete type signatures. This is typically done by running the signature generation tool and committing the updated signature file.
  2. Biased Expression Fuzzer: Investigate whether the Generic<T1> registration properly handles all edge cases that the concrete type registrations covered. The repeated "sequence stop value" Presto errors suggest the fuzzer generates inputs that Presto rejects, eventually exhausting all valid test rows.

@github-actions

github-actions Bot commented Apr 20, 2026

Build Impact Analysis

Selective Build Targets (building these covers all 299 affected)

cmake --build _build/release --target aggregate_companion_functions_test physical_size_aggregator_test presto_sql_test spark_aggregation_fuzzer_test spark_expression_fuzzer_test velox_abfs_test velox_aggregates_GeometryAggregateTest velox_aggregates_reduce_agg_bm velox_aggregates_simple_aggregates_bm velox_aggregates_string_keys_bm velox_aggregates_test_group0 velox_aggregates_test_group1 velox_aggregates_test_group2 velox_aggregates_test_group3 velox_aggregates_test_group4 velox_aggregation_fuzzer_test velox_aggregation_runner_test velox_benchmark_array_writer_no_nulls velox_benchmark_array_writer_with_nulls velox_benchmark_basic_comparison_conjunct velox_benchmark_basic_decoded_vector velox_benchmark_basic_preproc velox_benchmark_basic_selectivity_vector velox_benchmark_basic_simple_arithmetic velox_benchmark_basic_simple_cast velox_benchmark_basic_vector_compare velox_benchmark_basic_vector_fuzzer velox_benchmark_basic_vector_slice velox_benchmark_estimate_flat_size velox_benchmark_expr_flat_no_nulls velox_benchmark_feature_normalization velox_benchmark_map_writer_no_nulls velox_benchmark_map_writer_with_nulls velox_benchmark_nested_array_writer_no_nulls velox_benchmark_nested_array_writer_with_nulls velox_cache_fuzzer velox_cast_benchmark velox_common_test velox_constrained_input_generators_test velox_constrained_vector_generator_test velox_core_test velox_driver_test velox_duckdb_conversion_test velox_dwio_cache_test velox_dwio_common_test velox_dwio_dwrf_buffered_output_stream_test velox_dwio_dwrf_byte_rle_encoder_test velox_dwio_dwrf_byte_rle_test velox_dwio_dwrf_checksum_test velox_dwio_dwrf_column_reader_test velox_dwio_dwrf_column_statistics_test velox_dwio_dwrf_compression_test velox_dwio_dwrf_config_test velox_dwio_dwrf_data_buffer_holder_test velox_dwio_dwrf_decompression_test velox_dwio_dwrf_decryption_test velox_dwio_dwrf_dictionary_encoder_test velox_dwio_dwrf_dictionary_encoding_utils_test velox_dwio_dwrf_encoding_selector_test 
velox_dwio_dwrf_encryption_test velox_dwio_dwrf_flush_policy_test velox_dwio_dwrf_index_builder_test velox_dwio_dwrf_int_direct_test velox_dwio_dwrf_int_encoder_test velox_dwio_dwrf_layout_planner_test velox_dwio_dwrf_ratio_checker_test velox_dwio_dwrf_reader_base_test velox_dwio_dwrf_reader_test velox_dwio_dwrf_rle_test velox_dwio_dwrf_rlev1_encoder_test velox_dwio_dwrf_stream_labels_test velox_dwio_dwrf_stripe_dictionary_cache_test velox_dwio_dwrf_stripe_reader_base_test velox_dwio_dwrf_stripe_stream_test velox_dwio_dwrf_writer_context_test velox_dwio_dwrf_writer_encoding_manager_test velox_dwio_dwrf_writer_sink_test velox_dwio_dwrf_writer_test velox_dwio_iceberg_reader_benchmark velox_dwio_orc_column_statistics_test velox_dwio_orc_reader_filter_test velox_dwio_orc_reader_test velox_dwio_parquet_page_reader_test velox_dwio_parquet_reader_benchmark velox_dwio_parquet_reader_test velox_dwio_parquet_rlebp_decoder_test velox_dwio_parquet_structure_decoder_test velox_dwio_parquet_table_scan_test velox_dwio_parquet_tpch_test velox_dwrf_column_writer_index_test velox_dwrf_column_writer_stats_test velox_dwrf_column_writer_test velox_dwrf_e2e_filter_test velox_dwrf_e2e_reader_test velox_dwrf_e2e_writer_test velox_dwrf_statistics_builder_utils_test velox_dwrf_writer_extended_test velox_dwrf_writer_flush_test velox_example_operator_extensibility velox_example_scan_orc velox_exchange_benchmark velox_exchange_fuzzer velox_exec_SpatialJoinTest velox_exec_bm_duplicate_project velox_exec_infra_test velox_exec_prefixsort_test velox_exec_test_group0 velox_exec_test_group1 velox_exec_test_group2 velox_exec_test_group3 velox_exec_test_group4 velox_exec_test_group5 velox_exec_test_group6 velox_exec_test_group7 velox_exec_util_test_group0 velox_expression_fuzzer_test velox_expression_fuzzer_unit_test velox_expression_runner_test velox_expression_runner_unit_test velox_expression_test velox_expression_verifier_unit_test velox_filemetadata_test velox_filter_project_benchmark 
velox_format_datetime_benchmark velox_function_dynamic_link_test velox_function_registry_test velox_functions_aggregates_test velox_functions_benchmarks_compare velox_functions_benchmarks_row_writer_no_nulls velox_functions_benchmarks_simdjson_function_with_expr velox_functions_benchmarks_string_writer_no_nulls velox_functions_benchmarks_url velox_functions_iceberg_test velox_functions_lib_test velox_functions_prestosql_benchmarks_array_contains velox_functions_prestosql_benchmarks_array_min_max velox_functions_prestosql_benchmarks_array_position velox_functions_prestosql_benchmarks_array_sum velox_functions_prestosql_benchmarks_bitwise velox_functions_prestosql_benchmarks_cardinality velox_functions_prestosql_benchmarks_comparisons velox_functions_prestosql_benchmarks_concat velox_functions_prestosql_benchmarks_date_time velox_functions_prestosql_benchmarks_field_reference velox_functions_prestosql_benchmarks_generic velox_functions_prestosql_benchmarks_in velox_functions_prestosql_benchmarks_map_concat velox_functions_prestosql_benchmarks_map_except velox_functions_prestosql_benchmarks_map_input velox_functions_prestosql_benchmarks_map_intersect velox_functions_prestosql_benchmarks_map_subscript velox_functions_prestosql_benchmarks_map_zip_with velox_functions_prestosql_benchmarks_not velox_functions_prestosql_benchmarks_regexp_replace velox_functions_prestosql_benchmarks_row velox_functions_prestosql_benchmarks_string_ascii_utf_functions velox_functions_prestosql_benchmarks_uuid_cast velox_functions_prestosql_benchmarks_width_bucket velox_functions_prestosql_benchmarks_zip velox_functions_prestosql_benchmarks_zip_with velox_functions_spark_aggregates_test velox_functions_spark_test velox_functions_test velox_fuzzer_connector_test velox_gcs_file_test velox_gcs_insert_test velox_gcs_multiendpoints_test velox_hash_benchmark velox_hash_join_build_benchmark velox_hash_join_list_result_benchmark velox_hash_join_prepare_join_table_benchmark velox_hdfs_file_test 
velox_hdfs_insert_test velox_hive_connector_test velox_hive_iceberg_deletion_vector_test velox_hive_iceberg_deletion_vector_writer_test velox_hive_iceberg_dwrf_insert_test velox_hive_iceberg_equality_delete_test velox_hive_iceberg_insert_test velox_hive_iceberg_test velox_hive_partition_function_benchmark velox_in_10_min_demo velox_join_fuzzer velox_key_encoder_test velox_like_benchmark velox_like_tpch_benchmark velox_mark_distinct_fuzzer velox_mark_sorted_benchmark velox_memory_arbitration_fuzzer velox_memory_test velox_numeric_upcast_benchmark velox_orderby_benchmark velox_parquet_e2e_filter_test velox_parquet_writer_sink_test velox_parquet_writer_test velox_prefixsort_benchmark velox_presto_type_parser_test velox_presto_types_fuzzer_utils_test velox_presto_types_test velox_prestosql_coverage velox_query_replayer velox_re2_functions_benchmarks velox_row_number_fuzzer velox_row_serializer_benchmark velox_row_test velox_rpc_operator_test velox_s3file_test velox_s3insert_test velox_s3metrics_test velox_s3multiendpoints_test velox_s3read_test velox_s3registration_test velox_serializer_benchmark velox_serializer_test_group0 velox_simple_aggregate_test velox_sort_benchmark velox_spark_function_registry_test velox_spark_query_runner_test velox_spark_windows_test velox_sparksql_benchmarks_cast velox_sparksql_benchmarks_compare velox_sparksql_benchmarks_from_json velox_sparksql_benchmarks_get_funcs velox_sparksql_benchmarks_hash velox_sparksql_benchmarks_in velox_sparksql_benchmarks_simd_compare velox_sparksql_benchmarks_split velox_sparksql_coverage velox_spatial_join_benchmark velox_spatial_join_fuzzer velox_spiller_aggregate_benchmark velox_spiller_join_benchmark velox_streaming_aggregation_benchmark velox_table_evolution_fuzzer_test velox_text_reader_test velox_text_writer_test velox_tool_trace_test velox_topn_row_number_fuzzer velox_tpcds_benchmark velox_tpcds_connector_test velox_tpch_benchmark velox_tpch_connector_test velox_tpch_speed_test 
velox_unsafe_row_serialize_benchmark velox_vector_fuzzer_test velox_vector_test velox_wave_benchmark velox_wave_exec_test velox_window_fuzzer_test velox_window_prefixsort_benchmark velox_window_sub_partitioned_sort_benchmark velox_windows_agg_test velox_windows_rank_test velox_windows_value_test velox_writer_fuzzer_test

Total affected: 299/571 targets

Affected targets (299)

Directly changed (2)

Target Changed Files
velox_functions_prestosql ArraySplitIntoChunksRegistration.cpp
velox_functions_test ArraySplitIntoChunksTest.cpp

Transitively affected (297)

  • aggregate_companion_functions_test
  • physical_size_aggregator_test
  • presto_sql_test
  • spark_aggregation_fuzzer_test
  • spark_expression_fuzzer_test
  • velox_abfs_test
  • velox_aggregates
  • velox_aggregates_GeometryAggregateTest
  • velox_aggregates_reduce_agg_bm
  • velox_aggregates_simple_aggregates_bm
  • velox_aggregates_string_keys_bm
  • velox_aggregates_test_group0
  • velox_aggregates_test_group1
  • velox_aggregates_test_group2
  • velox_aggregates_test_group3
  • velox_aggregates_test_group4
  • velox_aggregation_fuzzer
  • velox_aggregation_fuzzer_base
  • velox_aggregation_fuzzer_test
  • velox_aggregation_result_verifier
  • velox_aggregation_runner_test
  • velox_benchmark_array_writer_no_nulls
  • velox_benchmark_array_writer_with_nulls
  • velox_benchmark_basic_comparison_conjunct
  • velox_benchmark_basic_decoded_vector
  • velox_benchmark_basic_preproc
  • velox_benchmark_basic_selectivity_vector
  • velox_benchmark_basic_simple_arithmetic
  • velox_benchmark_basic_simple_cast
  • velox_benchmark_basic_vector_compare
  • velox_benchmark_basic_vector_fuzzer
  • velox_benchmark_basic_vector_slice
  • velox_benchmark_builder
  • velox_benchmark_estimate_flat_size
  • velox_benchmark_expr_flat_no_nulls
  • velox_benchmark_feature_normalization
  • velox_benchmark_map_writer_no_nulls
  • velox_benchmark_map_writer_with_nulls
  • velox_benchmark_nested_array_writer_no_nulls
  • velox_benchmark_nested_array_writer_with_nulls
  • velox_cache_fuzzer
  • velox_cast_benchmark
  • velox_common_test
  • velox_constrained_input_generators
  • velox_constrained_input_generators_test
  • velox_constrained_vector_generator
  • velox_constrained_vector_generator_test
  • velox_core_test
  • velox_driver_test
  • velox_duckdb_conversion_test
  • velox_dwio_cache_test
  • velox_dwio_common_test
  • velox_dwio_common_test_utils
  • velox_dwio_dwrf_buffered_output_stream_test
  • velox_dwio_dwrf_byte_rle_encoder_test
  • velox_dwio_dwrf_byte_rle_test
  • velox_dwio_dwrf_checksum_test
  • velox_dwio_dwrf_column_reader_test
  • velox_dwio_dwrf_column_statistics_test
  • velox_dwio_dwrf_compression_test
  • velox_dwio_dwrf_config_test
  • velox_dwio_dwrf_data_buffer_holder_test
  • velox_dwio_dwrf_decompression_test
  • velox_dwio_dwrf_decryption_test
  • velox_dwio_dwrf_dictionary_encoder_test
  • velox_dwio_dwrf_dictionary_encoding_utils_test
  • velox_dwio_dwrf_encoding_selector_test
  • velox_dwio_dwrf_encryption_test
  • velox_dwio_dwrf_flush_policy_test
  • velox_dwio_dwrf_index_builder_test
  • velox_dwio_dwrf_int_direct_test
  • velox_dwio_dwrf_int_encoder_test
  • velox_dwio_dwrf_layout_planner_test
  • velox_dwio_dwrf_ratio_checker_test
  • velox_dwio_dwrf_reader_base_test
  • velox_dwio_dwrf_reader_test
  • velox_dwio_dwrf_rle_test
  • velox_dwio_dwrf_rlev1_encoder_test
  • velox_dwio_dwrf_stream_labels_test
  • velox_dwio_dwrf_stripe_dictionary_cache_test
  • velox_dwio_dwrf_stripe_reader_base_test
  • velox_dwio_dwrf_stripe_stream_test
  • velox_dwio_dwrf_writer_context_test
  • velox_dwio_dwrf_writer_encoding_manager_test
  • velox_dwio_dwrf_writer_sink_test
  • velox_dwio_dwrf_writer_test
  • velox_dwio_iceberg_reader_benchmark
  • velox_dwio_iceberg_reader_benchmark_lib
  • velox_dwio_orc_column_statistics_test
  • velox_dwio_orc_reader_filter_test
  • velox_dwio_orc_reader_test
  • velox_dwio_parquet_page_reader_test
  • velox_dwio_parquet_reader_benchmark
  • velox_dwio_parquet_reader_benchmark_lib
  • velox_dwio_parquet_reader_test
  • velox_dwio_parquet_rlebp_decoder_test
  • velox_dwio_parquet_structure_decoder_test
  • velox_dwio_parquet_table_scan_test
  • velox_dwio_parquet_tpch_test
  • velox_dwrf_column_writer_index_test
  • velox_dwrf_column_writer_stats_test
  • velox_dwrf_column_writer_test
  • velox_dwrf_e2e_filter_test
  • velox_dwrf_e2e_reader_test
  • velox_dwrf_e2e_writer_test
  • velox_dwrf_statistics_builder_utils_test
  • velox_dwrf_test_utils
  • velox_dwrf_writer_extended_test
  • velox_dwrf_writer_flush_test
  • velox_example_operator_extensibility
  • velox_example_scan_orc
  • velox_exchange_benchmark
  • velox_exchange_fuzzer
  • velox_exec_SpatialJoinTest
  • velox_exec_bm_duplicate_project
  • velox_exec_infra_test
  • velox_exec_prefixsort_test
  • velox_exec_test_group0
  • velox_exec_test_group1
  • velox_exec_test_group2
  • velox_exec_test_group3
  • velox_exec_test_group4
  • velox_exec_test_group5
  • velox_exec_test_group6
  • velox_exec_test_group7
  • velox_exec_test_lib
  • velox_exec_util_test_group0
  • velox_expression_fuzzer
  • velox_expression_fuzzer_test
  • velox_expression_fuzzer_unit_test
  • velox_expression_runner
  • velox_expression_runner_test
  • velox_expression_runner_unit_test
  • velox_expression_test
  • velox_expression_test_utility
  • velox_expression_verifier
  • velox_expression_verifier_unit_test
  • velox_filemetadata_test
  • velox_filter_project_benchmark
  • velox_format_datetime_benchmark
  • velox_function_dynamic_link_test
  • velox_function_registry_test
  • velox_functions_aggregates_test
  • velox_functions_aggregates_test_lib
  • velox_functions_benchmarks_compare
  • velox_functions_benchmarks_row_writer_no_nulls
  • velox_functions_benchmarks_simdjson_function_with_expr
  • velox_functions_benchmarks_string_writer_no_nulls
  • velox_functions_benchmarks_url
  • velox_functions_iceberg_test
  • velox_functions_lib_test
  • velox_functions_prestosql_benchmarks_array_contains
  • velox_functions_prestosql_benchmarks_array_min_max
  • velox_functions_prestosql_benchmarks_array_position
  • velox_functions_prestosql_benchmarks_array_sum
  • velox_functions_prestosql_benchmarks_bitwise
  • velox_functions_prestosql_benchmarks_cardinality
  • velox_functions_prestosql_benchmarks_comparisons
  • velox_functions_prestosql_benchmarks_concat
  • velox_functions_prestosql_benchmarks_date_time
  • velox_functions_prestosql_benchmarks_field_reference
  • velox_functions_prestosql_benchmarks_generic
  • velox_functions_prestosql_benchmarks_in
  • velox_functions_prestosql_benchmarks_map_concat
  • velox_functions_prestosql_benchmarks_map_except
  • velox_functions_prestosql_benchmarks_map_input
  • velox_functions_prestosql_benchmarks_map_intersect
  • velox_functions_prestosql_benchmarks_map_subscript
  • velox_functions_prestosql_benchmarks_map_zip_with
  • velox_functions_prestosql_benchmarks_not
  • velox_functions_prestosql_benchmarks_regexp_replace
  • velox_functions_prestosql_benchmarks_row
  • velox_functions_prestosql_benchmarks_string_ascii_utf_functions
  • velox_functions_prestosql_benchmarks_uuid_cast
  • velox_functions_prestosql_benchmarks_width_bucket
  • velox_functions_prestosql_benchmarks_zip
  • velox_functions_prestosql_benchmarks_zip_with
  • velox_functions_prestosql_impl
  • velox_functions_spark
  • velox_functions_spark_aggregates
  • velox_functions_spark_aggregates_test
  • velox_functions_spark_impl
  • velox_functions_spark_test
  • velox_functions_test_lib
  • velox_functions_window_test_lib
  • velox_fuzzer_connector
  • velox_fuzzer_connector_test
  • velox_fuzzer_util
  • velox_gcs_file_test
  • velox_gcs_insert_test
  • velox_gcs_multiendpoints_test
  • velox_hash_benchmark
  • velox_hash_join_build_benchmark
  • velox_hash_join_list_result_benchmark
  • velox_hash_join_prepare_join_table_benchmark
  • velox_hdfs_file_test
  • velox_hdfs_insert_test
  • velox_hive_connector_test
  • velox_hive_iceberg_deletion_vector_test
  • velox_hive_iceberg_deletion_vector_writer_test
  • velox_hive_iceberg_dwrf_insert_test
  • velox_hive_iceberg_equality_delete_test
  • velox_hive_iceberg_insert_test
  • velox_hive_iceberg_test
  • velox_hive_partition_function_benchmark
  • velox_in_10_min_demo
  • velox_join_fuzzer
  • velox_key_encoder_test
  • velox_like_benchmark
  • velox_like_tpch_benchmark
  • velox_mark_distinct_fuzzer
  • velox_mark_distinct_fuzzer_lib
  • velox_mark_sorted_benchmark
  • velox_memory_arbitration_fuzzer
  • velox_memory_test
  • velox_numeric_upcast_benchmark
  • velox_orderby_benchmark
  • velox_orderby_benchmark_util
  • velox_parquet_e2e_filter_test
  • velox_parquet_writer_sink_test
  • velox_parquet_writer_test
  • velox_prefixsort_benchmark
  • velox_presto_type_parser_test
  • velox_presto_types
  • velox_presto_types_fuzzer_utils
  • velox_presto_types_fuzzer_utils_test
  • velox_presto_types_test
  • velox_prestosql_coverage
  • velox_query_benchmark
  • velox_query_replayer
  • velox_query_trace_replayer_base
  • velox_re2_functions_benchmarks
  • velox_row_number_fuzzer
  • velox_row_number_fuzzer_lib
  • velox_row_serializer_benchmark
  • velox_row_test
  • velox_rpc_operator_test
  • velox_s3file_test
  • velox_s3insert_test
  • velox_s3metrics_test
  • velox_s3multiendpoints_test
  • velox_s3read_test
  • velox_s3registration_test
  • velox_serializer_benchmark
  • velox_serializer_test_group0
  • velox_simple_aggregate
  • velox_simple_aggregate_test
  • velox_sort_benchmark
  • velox_spark_function_registry_test
  • velox_spark_query_runner
  • velox_spark_query_runner_test
  • velox_spark_windows_test
  • velox_sparksql_benchmarks_cast
  • velox_sparksql_benchmarks_compare
  • velox_sparksql_benchmarks_from_json
  • velox_sparksql_benchmarks_get_funcs
  • velox_sparksql_benchmarks_hash
  • velox_sparksql_benchmarks_in
  • velox_sparksql_benchmarks_simd_compare
  • velox_sparksql_benchmarks_split
  • velox_sparksql_coverage
  • velox_spatial_join_benchmark
  • velox_spatial_join_fuzzer
  • velox_spill_fuzzer_base_lib
  • velox_spiller_aggregate_benchmark
  • velox_spiller_aggregate_benchmark_base
  • velox_spiller_join_benchmark
  • velox_spiller_join_benchmark_base
  • velox_streaming_aggregation_benchmark
  • velox_table_evolution_fuzzer_test
  • velox_text_reader_test
  • velox_text_writer_test
  • velox_tool_trace_test
  • velox_topn_row_number_fuzzer
  • velox_topn_row_number_fuzzer_lib
  • velox_tpcds_benchmark
  • velox_tpcds_benchmark_lib
  • velox_tpcds_connector_test
  • velox_tpch_benchmark
  • velox_tpch_benchmark_lib
  • velox_tpch_connector_test
  • velox_tpch_speed_test
  • velox_unsafe_row_serialize_benchmark
  • velox_vector_fuzzer
  • velox_vector_fuzzer_test
  • velox_vector_test
  • velox_wave_benchmark
  • velox_wave_exec_test
  • velox_window_fuzzer
  • velox_window_fuzzer_test
  • velox_window_prefixsort_benchmark
  • velox_window_sub_partitioned_sort_benchmark
  • velox_windows_agg_test
  • velox_windows_rank_test
  • velox_windows_value_test
  • velox_writer_fuzzer
  • velox_writer_fuzzer_test

Fast path • Graph from main@5335309b03218136f89df30080e9389a42f6f2c7
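
How two directly changed targets can fan out to hundreds of transitively affected ones can be sketched as a breadth-first walk over reverse dependencies. This is an illustration only, not the Build Impact Analysis workflow itself: `affected_targets` is a hypothetical helper, and the edges in `REVERSE_DEPS` are invented for the example (only the target names come from the list above).

```python
from collections import deque


def affected_targets(reverse_deps, changed):
    """Return the changed targets plus everything that transitively depends on them."""
    seen = set(changed)
    queue = deque(changed)
    while queue:
        target = queue.popleft()
        for dependent in reverse_deps.get(target, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical edges: target -> targets that depend on it.
REVERSE_DEPS = {
    "velox_functions_prestosql": ["velox_expression_fuzzer", "velox_functions_test"],
    "velox_expression_fuzzer": ["velox_expression_fuzzer_test"],
}
```

Calling `affected_targets(REVERSE_DEPS, ["velox_functions_prestosql"])` returns the changed library together with the three targets reachable from it in this made-up graph.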

@github-actions

CI Failure Analysis

❌ Signature Changes — SIGNATURE Failure View logs

Signature errors:

The PR removes 12 type-specific registrations of array_split_into_chunks in favor of a single Generic<T1> registration. The Signature Changes CI check detects this as a backwards-incompatible change because the following concrete function signatures no longer appear in the signature file:

array_split_into_chunks has its function signature '(array(varbinary),integer) -> array(array(varbinary))' removed.
array_split_into_chunks has its function signature '(array(date),integer) -> array(array(date))' removed.
array_split_into_chunks has its function signature '(array(timestamp),integer) -> array(array(timestamp))' removed.
array_split_into_chunks has its function signature '(array(boolean),integer) -> array(array(boolean))' removed.
array_split_into_chunks has its function signature '(array(double),integer) -> array(array(double))' removed.
array_split_into_chunks has its function signature '(array(hugeint),integer) -> array(array(hugeint))' removed.
array_split_into_chunks has its function signature '(array(integer),integer) -> array(array(integer))' removed.
array_split_into_chunks has its function signature '(array(real),integer) -> array(array(real))' removed.
array_split_into_chunks has its function signature '(array(smallint),integer) -> array(array(smallint))' removed.
array_split_into_chunks has its function signature '(array(varchar),integer) -> array(array(varchar))' removed.
array_split_into_chunks has its function signature '(array(bigint),integer) -> array(array(bigint))' removed.
array_split_into_chunks has its function signature '(array(tinyint),integer) -> array(array(tinyint))' removed.

Correlation with PR changes:

  • This failure is directly caused by the PR changes. The PR removes the 12 type-specific template instantiations in ArraySplitIntoChunksRegistration.cpp (lines 33–44 deleted) and keeps only the Generic<T1> registration. While the generic registration functionally covers all these types, the CI signature checker sees the concrete signatures disappearing from the exported signature list and flags it as a backwards-incompatible change.

Known issues:

  • No open issues track this failure. The Fuzzer Jobs workflow passes on main, confirming this is not a pre-existing failure.

Recommended fix:

  • The signature baseline file needs to be updated to reflect the new generic-only signatures. Typically this is done by updating the committed signature file (usually via a script or CI artifact). Check the repo's documentation or CI workflow for how to regenerate/approve signature changes (e.g., there may be a mechanism to accept intentional signature changes by updating the baseline).
  • Alternatively, if this is intentional and the generic signature (array(__user_T1),integer) -> array(array(__user_T1)) is a strict superset of the removed ones, the signature baseline should be updated and a note added to the PR explaining why the change is safe.
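
One way to check the superset claim is to confirm that each removed signature is the generic one with its type variable bound to a single concrete type. The sketch below is an assumption-laden illustration, not a tool from the repository: `bind_type_variable` is a hypothetical helper that matches the printed signature text with a regular expression.

```python
import re

# Pattern for "(array(X),integer) -> array(array(X))" where both X must be equal.
# It mirrors the generic signature with the type variable replaced by a capture
# group and a backreference.
_PATTERN = re.compile(r"\(array\((\w+)\),integer\) -> array\(array\(\1\)\)")


def bind_type_variable(concrete_signature):
    """Return the type the variable must take to produce this signature, else None."""
    match = _PATTERN.fullmatch(concrete_signature)
    return match.group(1) if match else None
```

A removed signature such as `(array(double),integer) -> array(array(double))` binds the variable to `double`. A signature whose input and output element types differ, or whose second argument is not `integer`, yields `None` and would not be covered.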

🤖 Generated with Claude Code

@allenshen13 allenshen13 changed the title from "fix: Handling literal null value to match Java" to "fix: Resolve null literal in array_split_into_chunks via generic registration" Apr 27, 2026
allenshen13 added a commit to prestodb/presto that referenced this pull request Apr 27, 2026
Narrow PR scope to enabling the native e2e test for array_split_into_chunks
null literal handling (now possible thanks to facebookincubator/velox#16923).

Revert:
- The IF(cardinality(input) = 0, array[], ...) guard in ArraySqlFunctions.
- The empty-array assertions in TestArraySqlFunctions.
- The empty-array assertions in AbstractTestSqlInvokedFunctions.

Keep the TestSqlInvokedFunctions TODO override removal so the parent test's
"array_split_into_chunks(null, 2)" assertion runs against the native engine.

Both Velox and Presto Java still throw on empty input today; aligning them
on array[] is a separate semantic decision out of scope for #27429.

Narrow PR scope to the null-literal fix (generic registration). Restore
the VELOX_USER_CHECK_GT on empty input and its throw-based test. Empty
arrays still throw on both Velox and Presto Java today, so the fuzzer
stays green under case "both paths threw." Whether to change this to
return array[] is a separate semantic decision, tracked independently.
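
The "both paths threw" case mentioned above can be sketched as a small comparison routine. This is not the Velox expression verifier; `run` and `verdict` are hypothetical helpers that only model the idea that a differential fuzzer accepts a row when the engine under test and the reference both fail.

```python
def run(fn, *args):
    """Run fn and capture either its value or the name of the exception it raised."""
    try:
        return ("ok", fn(*args))
    except Exception as exc:
        return ("error", type(exc).__name__)


def verdict(test_result, reference_result):
    """Compare two captured outcomes the way a differential fuzzer might."""
    test_status, test_value = test_result
    ref_status, ref_value = reference_result
    if test_status == "error" and ref_status == "error":
        return "pass: both paths threw"
    if test_status != ref_status:
        return "fail: only one path threw"
    if test_value == ref_value:
        return "pass: results match"
    return "fail: results differ"
```

Under this model, an engine that returns an empty array while the reference throws lands in "fail: only one path threw", which is the situation the restored check avoids.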
@allenshen13 allenshen13 force-pushed the fix-array-split-into-chunks-null-handling branch from 6a33522 to cbf0f5a April 28, 2026 21:38

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix null literal handling for array_split_into_chunks in sidecar

2 participants