-
Notifications
You must be signed in to change notification settings - Fork 1.2k
/
35.0.0.md
295 lines (281 loc) · 33.6 KB
/
35.0.0.md
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
<!---
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
## [35.0.0](https://github.com/apache/arrow-datafusion/tree/35.0.0) (2024-01-20)
[Full Changelog](https://github.com/apache/arrow-datafusion/compare/34.0.0...35.0.0)
**Breaking changes:**
- Minor: make SubqueryAlias::try_new take Arc<LogicalPlan> [#8542](https://github.com/apache/arrow-datafusion/pull/8542) (sadboy)
- Remove ListingTable and FileScanConfig Unbounded (#8540) [#8573](https://github.com/apache/arrow-datafusion/pull/8573) (tustvold)
- Rename `ParamValues::{LIST -> List,MAP -> Map}` [#8611](https://github.com/apache/arrow-datafusion/pull/8611) (kawadakk)
- Rename `expr::window_function::WindowFunction` to `WindowFunctionDefinition`, make structure consistent with ScalarFunction [#8382](https://github.com/apache/arrow-datafusion/pull/8382) (edmondop)
- Implement `ScalarUDF` in terms of `ScalarUDFImpl` trait [#8713](https://github.com/apache/arrow-datafusion/pull/8713) (alamb)
- Change `ScalarValue::{List, LargeList, FixedSizedList}` to take specific types rather than `ArrayRef` [#8562](https://github.com/apache/arrow-datafusion/pull/8562) (rspears74)
- Remove unused array_expression.rs and `SUPPORTED_ARRAY_TYPES` [#8807](https://github.com/apache/arrow-datafusion/pull/8807) (alamb)
- Simplify physical expression creation API (not require schema) [#8823](https://github.com/apache/arrow-datafusion/pull/8823) (comphead)
- Determine causal window frames to produce early results. [#8842](https://github.com/apache/arrow-datafusion/pull/8842) (mustafasrepo)
**Implemented enhancements:**
- feat: implement Unary Expr in substrait [#8534](https://github.com/apache/arrow-datafusion/pull/8534) (waynexia)
- feat: implement Repartition plan in substrait [#8526](https://github.com/apache/arrow-datafusion/pull/8526) (waynexia)
- feat: support largelist in array_slice [#8561](https://github.com/apache/arrow-datafusion/pull/8561) (Weijun-H)
- feat: support `LargeList` in `array_positions` [#8571](https://github.com/apache/arrow-datafusion/pull/8571) (Weijun-H)
- feat: support `LargeList` in `array_element` [#8570](https://github.com/apache/arrow-datafusion/pull/8570) (Weijun-H)
- feat: support `LargeList` in `array_dims` [#8592](https://github.com/apache/arrow-datafusion/pull/8592) (Weijun-H)
- feat: support `LargeList` in `array_remove` [#8595](https://github.com/apache/arrow-datafusion/pull/8595) (Weijun-H)
- feat: support inlist in LiteralGurantee for pruning [#8654](https://github.com/apache/arrow-datafusion/pull/8654) (my-vegetable-has-exploded)
- feat: support 'LargeList' in `array_pop_front` and `array_pop_back` [#8569](https://github.com/apache/arrow-datafusion/pull/8569) (Weijun-H)
- feat: support `LargeList` in `array_position` [#8714](https://github.com/apache/arrow-datafusion/pull/8714) (Weijun-H)
- feat: support `LargeList` in `array_ndims` [#8716](https://github.com/apache/arrow-datafusion/pull/8716) (Weijun-H)
- feat: remove filters with null constants [#8700](https://github.com/apache/arrow-datafusion/pull/8700) (asimsedhain)
- feat: support LargeList in array_repeat [#8725](https://github.com/apache/arrow-datafusion/pull/8725) (Weijun-H)
- feat: native types in `DistinctCountAccumulator` for primitive types [#8721](https://github.com/apache/arrow-datafusion/pull/8721) (korowa)
- feat: support `LargeList` in `cardinality` [#8726](https://github.com/apache/arrow-datafusion/pull/8726) (Weijun-H)
- feat: support `largelist` in `array_to_string` [#8729](https://github.com/apache/arrow-datafusion/pull/8729) (Weijun-H)
- feat: Add bloom filter metric to ParquetExec [#8772](https://github.com/apache/arrow-datafusion/pull/8772) (my-vegetable-has-exploded)
- feat: support `array_resize` [#8744](https://github.com/apache/arrow-datafusion/pull/8744) (Weijun-H)
- feat: add more components to the wasm-pack compatible list [#8843](https://github.com/apache/arrow-datafusion/pull/8843) (waynexia)
**Fixed bugs:**
- fix: make sure CASE WHEN pick first true branch when WHEN clause is true [#8477](https://github.com/apache/arrow-datafusion/pull/8477) (haohuaijin)
- fix: `Antarctica/Vostok` tz offset changed in chrono-tz 0.8.5 [#8677](https://github.com/apache/arrow-datafusion/pull/8677) (korowa)
- fix: struct field don't push down to TableScan [#8774](https://github.com/apache/arrow-datafusion/pull/8774) (haohuaijin)
- fix: failed to create ValuesExec with non-nullable schema [#8776](https://github.com/apache/arrow-datafusion/pull/8776) (jonahgao)
- fix: fix markdown table in docs [#8812](https://github.com/apache/arrow-datafusion/pull/8812) (tshauck)
- fix: don't extract common sub expr in `CASE WHEN` clause [#8833](https://github.com/apache/arrow-datafusion/pull/8833) (haohuaijin)
**Documentation updates:**
- docs: update udf docs for udtf [#8546](https://github.com/apache/arrow-datafusion/pull/8546) (tshauck)
- Doc: Clarify When Limit is Pushed Down to TableProvider::Scan [#8686](https://github.com/apache/arrow-datafusion/pull/8686) (devinjdangelo)
- Minor: Improve `PruningPredicate` docstrings [#8748](https://github.com/apache/arrow-datafusion/pull/8748) (alamb)
- Minor: Add documentation about stream cancellation [#8747](https://github.com/apache/arrow-datafusion/pull/8747) (alamb)
- docs: add sudo for install commands [#8804](https://github.com/apache/arrow-datafusion/pull/8804) (caicancai)
- docs: document SessionConfig [#8771](https://github.com/apache/arrow-datafusion/pull/8771) (wjones127)
- Upgrade to object_store `0.9.0` and arrow `50.0.0` [#8758](https://github.com/apache/arrow-datafusion/pull/8758) (tustvold)
- docs: fix wrong pushdown name & a typo [#8875](https://github.com/apache/arrow-datafusion/pull/8875) (SteveLauC)
- docs: Update contributor guide with installation instructions [#8876](https://github.com/apache/arrow-datafusion/pull/8876) (caicancai)
- docs: fix wrong name in sub-crates' README [#8889](https://github.com/apache/arrow-datafusion/pull/8889) (SteveLauC)
- docs: add an example for RecordBatchReceiverStreamBuilder [#8888](https://github.com/apache/arrow-datafusion/pull/8888) (SteveLauC)
**Merged pull requests:**
- Remove order_bys from AggregateExec state [#8537](https://github.com/apache/arrow-datafusion/pull/8537) (mustafasrepo)
- Fix count(null) and count(distinct null) [#8511](https://github.com/apache/arrow-datafusion/pull/8511) (joroKr21)
- Minor: reduce code duplication in `date_bin_impl` [#8528](https://github.com/apache/arrow-datafusion/pull/8528) (Weijun-H)
- Add metrics for UnnestExec [#8482](https://github.com/apache/arrow-datafusion/pull/8482) (simonvandel)
- Prepare 34.0.0-rc3 [#8549](https://github.com/apache/arrow-datafusion/pull/8549) (andygrove)
- fix: make sure CASE WHEN pick first true branch when WHEN clause is true [#8477](https://github.com/apache/arrow-datafusion/pull/8477) (haohuaijin)
- Minor: make SubqueryAlias::try_new take Arc<LogicalPlan> [#8542](https://github.com/apache/arrow-datafusion/pull/8542) (sadboy)
- Fallback on null empty value in ExprBoundaries::try_from_column [#8501](https://github.com/apache/arrow-datafusion/pull/8501) (razeghi71)
- Add test for DataFrame::write_table [#8531](https://github.com/apache/arrow-datafusion/pull/8531) (devinjdangelo)
- [MINOR]: Generate empty column at placeholder exec [#8553](https://github.com/apache/arrow-datafusion/pull/8553) (mustafasrepo)
- Minor: Remove now dead `SUPPORTED_STRUCT_TYPES` [#8480](https://github.com/apache/arrow-datafusion/pull/8480) (alamb)
- [MINOR]: Add getter methods to first and last value [#8555](https://github.com/apache/arrow-datafusion/pull/8555) (mustafasrepo)
- [MINOR]: Some code changes and a new empty batch guard for SHJ [#8557](https://github.com/apache/arrow-datafusion/pull/8557) (metesynnada)
- docs: update udf docs for udtf [#8546](https://github.com/apache/arrow-datafusion/pull/8546) (tshauck)
- feat: implement Unary Expr in substrait [#8534](https://github.com/apache/arrow-datafusion/pull/8534) (waynexia)
- Fix `compute_record_batch_statistics` wrong with `projection` [#8489](https://github.com/apache/arrow-datafusion/pull/8489) (Asura7969)
- Minor: Cleanup warning in scalar.rs test [#8563](https://github.com/apache/arrow-datafusion/pull/8563) (jayzhan211)
- Minor: move some invariants out of the loop [#8564](https://github.com/apache/arrow-datafusion/pull/8564) (haohuaijin)
- feat: implement Repartition plan in substrait [#8526](https://github.com/apache/arrow-datafusion/pull/8526) (waynexia)
- Fix sort order aware file group parallelization [#8517](https://github.com/apache/arrow-datafusion/pull/8517) (alamb)
- feat: support largelist in array_slice [#8561](https://github.com/apache/arrow-datafusion/pull/8561) (Weijun-H)
- minor: fix to support scalars [#8559](https://github.com/apache/arrow-datafusion/pull/8559) (comphead)
- refactor: `HashJoinStream` state machine [#8538](https://github.com/apache/arrow-datafusion/pull/8538) (korowa)
- Remove ListingTable and FileScanConfig Unbounded (#8540) [#8573](https://github.com/apache/arrow-datafusion/pull/8573) (tustvold)
- Update substrait requirement from 0.20.0 to 0.21.0 [#8574](https://github.com/apache/arrow-datafusion/pull/8574) (dependabot[bot])
- [minor]: Fix rank calculation bug when empty order by is seen [#8567](https://github.com/apache/arrow-datafusion/pull/8567) (mustafasrepo)
- Add `LiteralGuarantee` on columns to extract conditions required for `PhysicalExpr` expressions to evaluate to true [#8437](https://github.com/apache/arrow-datafusion/pull/8437) (alamb)
- [MINOR]: Parametrize sort-preservation tests to exercise all situations (unbounded/bounded sources and flag behavior) [#8575](https://github.com/apache/arrow-datafusion/pull/8575) (mustafasrepo)
- Minor: Add some comments to scalar_udf example [#8576](https://github.com/apache/arrow-datafusion/pull/8576) (alamb)
- Move Coercion for MakeArray to `coerce_arguments_for_signature` and introduce another one for ArrayAppend [#8317](https://github.com/apache/arrow-datafusion/pull/8317) (jayzhan211)
- feat: support `LargeList` in `array_positions` [#8571](https://github.com/apache/arrow-datafusion/pull/8571) (Weijun-H)
- feat: support `LargeList` in `array_element` [#8570](https://github.com/apache/arrow-datafusion/pull/8570) (Weijun-H)
- Increase test coverage for unbounded and bounded cases [#8581](https://github.com/apache/arrow-datafusion/pull/8581) (mustafasrepo)
- Port tests in `parquet.rs` to sqllogictest [#8560](https://github.com/apache/arrow-datafusion/pull/8560) (hiltontj)
- Minor: avoid a copy in Expr::unalias [#8588](https://github.com/apache/arrow-datafusion/pull/8588) (alamb)
- Minor: support complex expr as the arg in the ApproxPercentileCont function [#8580](https://github.com/apache/arrow-datafusion/pull/8580) (liukun4515)
- Bugfix: Add functional dependency check and aggregate try_new schema [#8584](https://github.com/apache/arrow-datafusion/pull/8584) (mustafasrepo)
- Remove GroupByOrderMode [#8593](https://github.com/apache/arrow-datafusion/pull/8593) (ozankabak)
- Minor: replace` not-impl-err` in `array_expression` [#8589](https://github.com/apache/arrow-datafusion/pull/8589) (Weijun-H)
- Substrait insubquery [#8363](https://github.com/apache/arrow-datafusion/pull/8363) (tgujar)
- Minor: port last test from parquet.rs [#8587](https://github.com/apache/arrow-datafusion/pull/8587) (alamb)
- Minor: consolidate map sqllogictest tests [#8550](https://github.com/apache/arrow-datafusion/pull/8550) (alamb)
- feat: support `LargeList` in `array_dims` [#8592](https://github.com/apache/arrow-datafusion/pull/8592) (Weijun-H)
- Fix regression in regenerating protobuf source [#8603](https://github.com/apache/arrow-datafusion/pull/8603) (andygrove)
- Remove unbounded_input from FileSinkOptions [#8605](https://github.com/apache/arrow-datafusion/pull/8605) (devinjdangelo)
- Add `arrow_err!` macros, optional backtrace to ArrowError [#8586](https://github.com/apache/arrow-datafusion/pull/8586) (comphead)
- Add examples of DataFrame::write\* methods without S3 dependency [#8606](https://github.com/apache/arrow-datafusion/pull/8606) (devinjdangelo)
- Implement logical plan serde for CopyTo [#8618](https://github.com/apache/arrow-datafusion/pull/8618) (andygrove)
- Fix InListExpr to return the correct number of rows [#8601](https://github.com/apache/arrow-datafusion/pull/8601) (alamb)
- Remove ListingTable single_file option [#8604](https://github.com/apache/arrow-datafusion/pull/8604) (devinjdangelo)
- feat: support `LargeList` in `array_remove` [#8595](https://github.com/apache/arrow-datafusion/pull/8595) (Weijun-H)
- Rename `ParamValues::{LIST -> List,MAP -> Map}` [#8611](https://github.com/apache/arrow-datafusion/pull/8611) (kawadakk)
- Support binary temporal coercion for Date64 and Timestamp types [#8616](https://github.com/apache/arrow-datafusion/pull/8616) (Asura7969)
- Add new configuration item `listing_table_ignore_subdirectory` [#8565](https://github.com/apache/arrow-datafusion/pull/8565) (Asura7969)
- Optimize the parameter types of `ParamValues`'s methods [#8613](https://github.com/apache/arrow-datafusion/pull/8613) (kawadakk)
- Do not panic on zero placeholders in `ParamValues::get_placeholders_with_values` [#8615](https://github.com/apache/arrow-datafusion/pull/8615) (kawadakk)
- Fix #8507: Non-null sub-field on nullable struct-field has wrong nullity [#8623](https://github.com/apache/arrow-datafusion/pull/8623) (marvinlanhenke)
- Implement `contained` API in PruningPredicate [#8440](https://github.com/apache/arrow-datafusion/pull/8440) (alamb)
- Add partial serde support for ParquetWriterOptions [#8627](https://github.com/apache/arrow-datafusion/pull/8627) (andygrove)
- Minor: add arguments length check in `array_expressions` [#8622](https://github.com/apache/arrow-datafusion/pull/8622) (Weijun-H)
- Minor: improve dataframe functional dependency tests [#8630](https://github.com/apache/arrow-datafusion/pull/8630) (alamb)
- Improve regexp_match performance by avoiding cloning Regex [#8631](https://github.com/apache/arrow-datafusion/pull/8631) (viirya)
- Minor: improve `listing_table_ignore_subdirectory` config documentation [#8634](https://github.com/apache/arrow-datafusion/pull/8634) (alamb)
- Support Writing Arrow files [#8608](https://github.com/apache/arrow-datafusion/pull/8608) (devinjdangelo)
- Filter pushdown into cross join [#8626](https://github.com/apache/arrow-datafusion/pull/8626) (mustafasrepo)
- [MINOR] Remove duplicate test utility and move one utility function for better organization [#8652](https://github.com/apache/arrow-datafusion/pull/8652) (metesynnada)
- [MINOR]: Add new test for filter pushdown into cross join [#8648](https://github.com/apache/arrow-datafusion/pull/8648) (mustafasrepo)
- Rewrite bloom filters to use contains API [#8442](https://github.com/apache/arrow-datafusion/pull/8442) (alamb)
- Split equivalence code into smaller modules. [#8649](https://github.com/apache/arrow-datafusion/pull/8649) (tushushu)
- Move parquet_schema.rs from sql to parquet tests [#8644](https://github.com/apache/arrow-datafusion/pull/8644) (alamb)
- Fix group by aliased expression in LogicalPLanBuilder::aggregate [#8629](https://github.com/apache/arrow-datafusion/pull/8629) (alamb)
- Refactor `array_union` and `array_intersect` functions to one general function [#8516](https://github.com/apache/arrow-datafusion/pull/8516) (Weijun-H)
- Minor: avoid extra clone in datafusion-proto::physical_plan [#8650](https://github.com/apache/arrow-datafusion/pull/8650) (ongchi)
- Minor: name some constant values in arrow writer, parquet writer [#8642](https://github.com/apache/arrow-datafusion/pull/8642) (alamb)
- TreeNode Refactor Part 2 [#8653](https://github.com/apache/arrow-datafusion/pull/8653) (berkaysynnada)
- feat: support inlist in LiteralGurantee for pruning [#8654](https://github.com/apache/arrow-datafusion/pull/8654) (my-vegetable-has-exploded)
- Streaming CLI support [#8651](https://github.com/apache/arrow-datafusion/pull/8651) (berkaysynnada)
- Add serde support for CSV FileTypeWriterOptions [#8641](https://github.com/apache/arrow-datafusion/pull/8641) (andygrove)
- Add trait based ScalarUDF API [#8578](https://github.com/apache/arrow-datafusion/pull/8578) (alamb)
- Handle ordering of first last aggregation inside aggregator [#8662](https://github.com/apache/arrow-datafusion/pull/8662) (mustafasrepo)
- feat: support 'LargeList' in `array_pop_front` and `array_pop_back` [#8569](https://github.com/apache/arrow-datafusion/pull/8569) (Weijun-H)
- chore: rename ceresdb to apache horaedb [#8674](https://github.com/apache/arrow-datafusion/pull/8674) (tanruixiang)
- Minor: clean up code [#8671](https://github.com/apache/arrow-datafusion/pull/8671) (Weijun-H)
- fix: `Antarctica/Vostok` tz offset changed in chrono-tz 0.8.5 [#8677](https://github.com/apache/arrow-datafusion/pull/8677) (korowa)
- Make the BatchSerializer behind Arc to avoid unnecessary struct creation [#8666](https://github.com/apache/arrow-datafusion/pull/8666) (metesynnada)
- Implement serde for CSV and Parquet FileSinkExec [#8646](https://github.com/apache/arrow-datafusion/pull/8646) (andygrove)
- [pruning] Add shortcut when all units have been pruned [#8675](https://github.com/apache/arrow-datafusion/pull/8675) (Ted-Jiang)
- Change first/last implementation to prevent redundant comparisons when data is already sorted [#8678](https://github.com/apache/arrow-datafusion/pull/8678) (mustafasrepo)
- minor: remove useless conversion [#8684](https://github.com/apache/arrow-datafusion/pull/8684) (comphead)
- refactor: modified `JoinHashMap` build order for `HashJoinStream` [#8658](https://github.com/apache/arrow-datafusion/pull/8658) (korowa)
- Start setting up tpch planning benchmarks [#8665](https://github.com/apache/arrow-datafusion/pull/8665) (matthewmturner)
- Doc: Clarify When Limit is Pushed Down to TableProvider::Scan [#8686](https://github.com/apache/arrow-datafusion/pull/8686) (devinjdangelo)
- Closes #8502: Parallel NDJSON file reading [#8659](https://github.com/apache/arrow-datafusion/pull/8659) (marvinlanhenke)
- Improve `array_prepend` signature for null and empty array [#8625](https://github.com/apache/arrow-datafusion/pull/8625) (jayzhan211)
- Cleanup TreeNode implementations [#8672](https://github.com/apache/arrow-datafusion/pull/8672) (viirya)
- Update sqlparser requirement from 0.40.0 to 0.41.0 [#8647](https://github.com/apache/arrow-datafusion/pull/8647) (dependabot[bot])
- Update scalar functions doc for extract/datepart [#8682](https://github.com/apache/arrow-datafusion/pull/8682) (Jefffrey)
- Remove DescribeTableStmt in parser in favour of existing functionality from sqlparser-rs [#8703](https://github.com/apache/arrow-datafusion/pull/8703) (Jefffrey)
- Simplify `NULL [NOT] IN (..)` expressions [#8691](https://github.com/apache/arrow-datafusion/pull/8691) (asimsedhain)
- Rename `expr::window_function::WindowFunction` to `WindowFunctionDefinition`, make structure consistent with ScalarFunction [#8382](https://github.com/apache/arrow-datafusion/pull/8382) (edmondop)
- Deprecate duplicate function `LogicalPlan::with_new_inputs` [#8707](https://github.com/apache/arrow-datafusion/pull/8707) (viirya)
- Minor: refactor bloom filter tests to reduce duplication [#8435](https://github.com/apache/arrow-datafusion/pull/8435) (alamb)
- Minor: clean up code based on `Clippy` [#8715](https://github.com/apache/arrow-datafusion/pull/8715) (Weijun-H)
- Minor: Unbounded Output of AnalyzeExec [#8717](https://github.com/apache/arrow-datafusion/pull/8717) (berkaysynnada)
- feat: support `LargeList` in `array_position` [#8714](https://github.com/apache/arrow-datafusion/pull/8714) (Weijun-H)
- feat: support `LargeList` in `array_ndims` [#8716](https://github.com/apache/arrow-datafusion/pull/8716) (Weijun-H)
- feat: remove filters with null constants [#8700](https://github.com/apache/arrow-datafusion/pull/8700) (asimsedhain)
- support `LargeList` in `array_prepend` and `array_append` [#8679](https://github.com/apache/arrow-datafusion/pull/8679) (Weijun-H)
- Support for `extract(epoch from date)` for Date32 and Date64 [#8695](https://github.com/apache/arrow-datafusion/pull/8695) (Jefffrey)
- Implement trait based API for defining WindowUDF [#8719](https://github.com/apache/arrow-datafusion/pull/8719) (guojidan)
- Minor: Introduce utils::hash for StructArray [#8552](https://github.com/apache/arrow-datafusion/pull/8552) (jayzhan211)
- [CI] Improve windows machine CI test time [#8730](https://github.com/apache/arrow-datafusion/pull/8730) (comphead)
- fix guarantees in allways_true of PruningPredicate [#8732](https://github.com/apache/arrow-datafusion/pull/8732) (my-vegetable-has-exploded)
- Minor: Avoid memory copy in construct window exprs [#8718](https://github.com/apache/arrow-datafusion/pull/8718) (Ted-Jiang)
- feat: support LargeList in array_repeat [#8725](https://github.com/apache/arrow-datafusion/pull/8725) (Weijun-H)
- Minor: Ctrl+C Termination in CLI [#8739](https://github.com/apache/arrow-datafusion/pull/8739) (berkaysynnada)
- Add support for functional dependency for ROW_NUMBER window function. [#8737](https://github.com/apache/arrow-datafusion/pull/8737) (mustafasrepo)
- Minor: reduce code duplication in PruningPredicate test [#8441](https://github.com/apache/arrow-datafusion/pull/8441) (alamb)
- feat: native types in `DistinctCountAccumulator` for primitive types [#8721](https://github.com/apache/arrow-datafusion/pull/8721) (korowa)
- [MINOR]: Add a test case for when target partition is 1, no hash repartition is added to the plan. [#8757](https://github.com/apache/arrow-datafusion/pull/8757) (mustafasrepo)
- Minor: Improve `PruningPredicate` docstrings [#8748](https://github.com/apache/arrow-datafusion/pull/8748) (alamb)
- feat: support `LargeList` in `cardinality` [#8726](https://github.com/apache/arrow-datafusion/pull/8726) (Weijun-H)
- Add reproducer for #8738 [#8750](https://github.com/apache/arrow-datafusion/pull/8750) (alamb)
- Minor: Use faster check for column name in schema merge [#8765](https://github.com/apache/arrow-datafusion/pull/8765) (matthewmturner)
- Minor: Add documentation about stream cancellation [#8747](https://github.com/apache/arrow-datafusion/pull/8747) (alamb)
- Move `repartition_file_scans` out of `enable_round_robin` check in `EnforceDistribution` rule [#8731](https://github.com/apache/arrow-datafusion/pull/8731) (viirya)
- Clean internal implementation of WindowUDF [#8746](https://github.com/apache/arrow-datafusion/pull/8746) (guojidan)
- feat: support `largelist` in `array_to_string` [#8729](https://github.com/apache/arrow-datafusion/pull/8729) (Weijun-H)
- [MINOR] CLI error handling on streaming use cases [#8761](https://github.com/apache/arrow-datafusion/pull/8761) (metesynnada)
- Convert Binary Operator `StringConcat` to Function for `array_concat`, `array_append` and `array_prepend` [#8636](https://github.com/apache/arrow-datafusion/pull/8636) (jayzhan211)
- Minor: Fix incorrect indices for hashing struct [#8775](https://github.com/apache/arrow-datafusion/pull/8775) (jayzhan211)
- Minor: Improve library docs to mention TreeNode, ExprSimplifier, PruningPredicate and cp_solver [#8749](https://github.com/apache/arrow-datafusion/pull/8749) (alamb)
- [MINOR] Add logo source files [#8762](https://github.com/apache/arrow-datafusion/pull/8762) (andygrove)
- Add Apache attribution to site footer [#8760](https://github.com/apache/arrow-datafusion/pull/8760) (alamb)
- ci: speed up win64 test [#8728](https://github.com/apache/arrow-datafusion/pull/8728) (Jefffrey)
- Add `schema_err!` error macros with optional backtrace [#8620](https://github.com/apache/arrow-datafusion/pull/8620) (comphead)
- Fix regression by reverting Materialize dictionaries in group keys [#8740](https://github.com/apache/arrow-datafusion/pull/8740) (alamb)
- fix: struct field don't push down to TableScan [#8774](https://github.com/apache/arrow-datafusion/pull/8774) (haohuaijin)
- Implement `ScalarUDF` in terms of `ScalarUDFImpl` trait [#8713](https://github.com/apache/arrow-datafusion/pull/8713) (alamb)
- Minor: Fix error messages in array expressions [#8781](https://github.com/apache/arrow-datafusion/pull/8781) (Weijun-H)
- Move tests from `expr.rs` to sqllogictests. Part1 [#8773](https://github.com/apache/arrow-datafusion/pull/8773) (comphead)
- Permit running `sqllogictest` as a rust test in IDEs (+ use clap for sqllogicttest parsing, accept (and ignore) rust test harness arguments) [#8288](https://github.com/apache/arrow-datafusion/pull/8288) (alamb)
- Minor: Use standard tree walk in Projection Pushdown [#8787](https://github.com/apache/arrow-datafusion/pull/8787) (alamb)
- Implement trait based API for define AggregateUDF [#8733](https://github.com/apache/arrow-datafusion/pull/8733) (guojidan)
- Minor: Improve `DataFusionError` documentation [#8792](https://github.com/apache/arrow-datafusion/pull/8792) (alamb)
- fix: failed to create ValuesExec with non-nullable schema [#8776](https://github.com/apache/arrow-datafusion/pull/8776) (jonahgao)
- Update substrait requirement from 0.21.0 to 0.22.1 [#8796](https://github.com/apache/arrow-datafusion/pull/8796) (dependabot[bot])
- Bump follow-redirects from 1.15.3 to 1.15.4 in /datafusion/wasmtest/datafusion-wasm-app [#8798](https://github.com/apache/arrow-datafusion/pull/8798) (dependabot[bot])
- Minor: array_pop_first should be array_pop_front in documentation [#8797](https://github.com/apache/arrow-datafusion/pull/8797) (ongchi)
- feat: Add bloom filter metric to ParquetExec [#8772](https://github.com/apache/arrow-datafusion/pull/8772) (my-vegetable-has-exploded)
- Add note on using larger row group size [#8745](https://github.com/apache/arrow-datafusion/pull/8745) (twitu)
- Change `ScalarValue::{List, LargeList, FixedSizedList}` to take specific types rather than `ArrayRef` [#8562](https://github.com/apache/arrow-datafusion/pull/8562) (rspears74)
- fix: fix markdown table in docs [#8812](https://github.com/apache/arrow-datafusion/pull/8812) (tshauck)
- docs: add sudo for install commands [#8804](https://github.com/apache/arrow-datafusion/pull/8804) (caicancai)
- Standardize `CompressionTypeVariant` encoding in protobuf [#8785](https://github.com/apache/arrow-datafusion/pull/8785) (tushushu)
- Make benefits_from_input_partitioning Default in SHJ [#8801](https://github.com/apache/arrow-datafusion/pull/8801) (metesynnada)
- refactor: standardize exec_from funcs arg order [#8809](https://github.com/apache/arrow-datafusion/pull/8809) (tshauck)
- [Minor] extract const and add doc and more tests for in_list pruning [#8815](https://github.com/apache/arrow-datafusion/pull/8815) (Ted-Jiang)
- [MINOR]: Add size check for aggregate [#8813](https://github.com/apache/arrow-datafusion/pull/8813) (mustafasrepo)
- Minor: chores: Update clippy in pre-commit.sh [#8810](https://github.com/apache/arrow-datafusion/pull/8810) (my-vegetable-has-exploded)
- Cleanup the usage of round-robin repartitioning [#8794](https://github.com/apache/arrow-datafusion/pull/8794) (viirya)
- Implement monotonicity for ScalarUDF [#8799](https://github.com/apache/arrow-datafusion/pull/8799) (guojidan)
- Remove unused array_expression.rs and `SUPPORTED_ARRAY_TYPES` [#8807](https://github.com/apache/arrow-datafusion/pull/8807) (alamb)
- feat: support `array_resize` [#8744](https://github.com/apache/arrow-datafusion/pull/8744) (Weijun-H)
- Minor: typo in `arrays.slt` [#8831](https://github.com/apache/arrow-datafusion/pull/8831) (Weijun-H)
- docs: document SessionConfig [#8771](https://github.com/apache/arrow-datafusion/pull/8771) (wjones127)
- Minor: Improve `datafusion-proto` documentation [#8822](https://github.com/apache/arrow-datafusion/pull/8822) (alamb)
- [CI] Refactor CI builders [#8826](https://github.com/apache/arrow-datafusion/pull/8826) (comphead)
- Serialize function signature simplifications [#8802](https://github.com/apache/arrow-datafusion/pull/8802) (metesynnada)
- Port tests in `group_by.rs` to sqllogictest [#8834](https://github.com/apache/arrow-datafusion/pull/8834) (hiltontj)
- Simplify physical expression creation API (not require schema) [#8823](https://github.com/apache/arrow-datafusion/pull/8823) (comphead)
- feat: add more components to the wasm-pack compatible list [#8843](https://github.com/apache/arrow-datafusion/pull/8843) (waynexia)
- Port tests in timestamp.rs to sqllogictest. Part 1 [#8818](https://github.com/apache/arrow-datafusion/pull/8818) (caicancai)
- Upgrade to object_store `0.9.0` and arrow `50.0.0` [#8758](https://github.com/apache/arrow-datafusion/pull/8758) (tustvold)
- Fix ApproxPercentileCont signature [#8825](https://github.com/apache/arrow-datafusion/pull/8825) (joroKr21)
- Minor: Update `with_column_rename` method doc [#8858](https://github.com/apache/arrow-datafusion/pull/8858) (comphead)
- Minor: Document `parquet_metadata` function [#8852](https://github.com/apache/arrow-datafusion/pull/8852) (alamb)
- Speedup new_with_metadata by removing sort [#8855](https://github.com/apache/arrow-datafusion/pull/8855) (simonvandel)
- Minor: fix wrong function call [#8847](https://github.com/apache/arrow-datafusion/pull/8847) (Weijun-H)
- Add options of parquet bloom filter and page index in Session config [#8869](https://github.com/apache/arrow-datafusion/pull/8869) (Ted-Jiang)
- Port tests in timestamp.rs to sqllogictest [#8859](https://github.com/apache/arrow-datafusion/pull/8859) (caicancai)
- test: Port `order.rs` tests to sqllogictest [#8857](https://github.com/apache/arrow-datafusion/pull/8857) (simicd)
- Determine causal window frames to produce early results. [#8842](https://github.com/apache/arrow-datafusion/pull/8842) (mustafasrepo)
- docs: fix wrong pushdown name & a typo [#8875](https://github.com/apache/arrow-datafusion/pull/8875) (SteveLauC)
- fix: don't extract common sub expr in `CASE WHEN` clause [#8833](https://github.com/apache/arrow-datafusion/pull/8833) (haohuaijin)
- Add "Extended" clickbench queries [#8861](https://github.com/apache/arrow-datafusion/pull/8861) (alamb)
- Change cli to propagate error to exit code [#8856](https://github.com/apache/arrow-datafusion/pull/8856) (tshauck)
- test: Port tests in `predicates.rs` to sqllogictest [#8879](https://github.com/apache/arrow-datafusion/pull/8879) (simicd)
- docs: Update contributor guide with installation instructions [#8876](https://github.com/apache/arrow-datafusion/pull/8876) (caicancai)
- Minor: add tests for casts between nested `List` and `LargeList` [#8882](https://github.com/apache/arrow-datafusion/pull/8882) (Weijun-H)
- Disable Parallel Parquet Writer by Default, Improve Writing Test Coverage [#8854](https://github.com/apache/arrow-datafusion/pull/8854) (devinjdangelo)
- Support for order sensitive `NTH_VALUE` aggregation, make reverse `ARRAY_AGG` more efficient [#8841](https://github.com/apache/arrow-datafusion/pull/8841) (mustafasrepo)
- test: Port tests in `csv_files.rs` to sqllogictest [#8885](https://github.com/apache/arrow-datafusion/pull/8885) (simicd)
- test: Port tests in `references.rs` to sqllogictest [#8877](https://github.com/apache/arrow-datafusion/pull/8877) (simicd)
- fix bug with `to_timestamp` and `InitCap` logical serialization, add roundtrip test between expression and proto, [#8868](https://github.com/apache/arrow-datafusion/pull/8868) (Weijun-H)
- Support `LargeListArray` scalar values and `align_array_dimensions` [#8881](https://github.com/apache/arrow-datafusion/pull/8881) (Weijun-H)
- refactor: rename FileStream.file_reader to file_opener & update doc [#8883](https://github.com/apache/arrow-datafusion/pull/8883) (SteveLauC)
- docs: fix wrong name in sub-crates' README [#8889](https://github.com/apache/arrow-datafusion/pull/8889) (SteveLauC)
- Recursive CTEs: Stage 1 - add config flag [#8828](https://github.com/apache/arrow-datafusion/pull/8828) (matthewgapp)
- Support array literal with scalar function [#8884](https://github.com/apache/arrow-datafusion/pull/8884) (jayzhan211)
- Bump actions/cache from 3 to 4 [#8903](https://github.com/apache/arrow-datafusion/pull/8903) (dependabot[bot])
- Fix `datafusion-cli` print output [#8895](https://github.com/apache/arrow-datafusion/pull/8895) (alamb)
- docs: add an example for RecordBatchReceiverStreamBuilder [#8888](https://github.com/apache/arrow-datafusion/pull/8888) (SteveLauC)
- Fix "Projection references non-aggregate values" by updating `rebase_expr` to use `transform_down` [#8890](https://github.com/apache/arrow-datafusion/pull/8890) (wizardxz)
- Add serde support for Arrow FileTypeWriterOptions [#8850](https://github.com/apache/arrow-datafusion/pull/8850) (tushushu)
- Improve `datafusion-cli` print format tests [#8896](https://github.com/apache/arrow-datafusion/pull/8896) (alamb)
- Recursive CTEs: Stage 2 - add support for sql -> logical plan generation [#8839](https://github.com/apache/arrow-datafusion/pull/8839) (matthewgapp)
- Minor: remove null in `array-append` and `array-prepend` [#8901](https://github.com/apache/arrow-datafusion/pull/8901) (Weijun-H)
- Add support for FixedSizeList type in `arrow_cast`, hashing [#8344](https://github.com/apache/arrow-datafusion/pull/8344) (Weijun-H)
- aggregate_statistics should only optimize MIN/MAX when relation is not empty [#8914](https://github.com/apache/arrow-datafusion/pull/8914) (viirya)
- support to_timestamp with optional chrono formats [#8886](https://github.com/apache/arrow-datafusion/pull/8886) (Omega359)
- Minor: Document third argument of `date_bin` as optional and default value [#8912](https://github.com/apache/arrow-datafusion/pull/8912) (alamb)
- Minor: distinguish parquet row group pruning type in unit test [#8921](https://github.com/apache/arrow-datafusion/pull/8921) (Ted-Jiang)