Fix InsertRelation on attached database #155
Merged: evertlammerts merged 2 commits into duckdb:v1.4-andium from evertlammerts:insert_rel_fix on Nov 10, 2025
Conversation
Tishj reviewed on Nov 5, 2025
Tishj reviewed on Nov 5, 2025
Tishj requested changes on Nov 5, 2025
Tishj (Collaborator) left a comment:
I have some questions, but I'm also missing a test
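For context, the kind of regression test being asked for might look like the minimal pytest-style sketch below. It assumes pytest's tmp_path fixture; the database and table names are illustrative, not the test actually added to this PR.

```python
import duckdb

def test_insert_relation_on_attached_database(tmp_path):
    db_file = tmp_path / "attached.db"
    con = duckdb.connect()  # in-memory default database
    # Attach a second, on-disk database and create a table inside it.
    con.execute(f"ATTACH '{db_file}' AS other")
    con.execute("CREATE TABLE other.tbl (i INTEGER)")
    # Insert a relation into a table qualified with the attached catalog;
    # before the fix this failed to resolve the table (duckdb/duckdb#18396).
    con.sql("SELECT 42 AS i").insert_into("other.tbl")
    assert con.sql("SELECT i FROM other.tbl").fetchall() == [(42,)]
```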
Force-pushed from 57d7a05 to c26e02d
Tishj previously approved these changes on Nov 6, 2025
Tishj (Collaborator) left a comment:
LGTM!
lnkuiper added a commit to duckdb/duckdb that referenced this pull request on Nov 7, 2025:
Fixes #18396 Related PR in duckdb-python: duckdb/duckdb-python#155
Force-pushed from c26e02d to 97fd6c6, then from 97fd6c6 to 20bfd52
mach-kernel added a commit to spiceai/duckdb that referenced this pull request on Nov 14, 2025:
Squashed commit of the following (backports on duckdb:v1.4-andium, listed by merged PR; this squash carries Evert Lammerts' two commits from this PR, 094a54b "Fix InsertRelation on attached database" and aea8434 "review feedback"):
- Fix minor crypto issues (#19716)
- Logs to be case-insensitive also at enable_logging callsite (#19734)
- Add explicit Initialize(HTTPParams&) method to HTTPClient (#19723)
- Bump the Postgres scanner extension (#19730)
- [Dev] Fix assertion failure for empty ColumnData serialization (#19713)
- Bump httpfs and resume testing on Windows (#19714)
- Fix #19700: correctly sort output selection vector in nested selection operations (#19718)
- Fix #19355: correctly resolve subquery in MERGE INTO action condition (#19720)
- Bump: delta, ducklake, httpfs (#19715)
- Bump: aws, ducklake, httpfs, iceberg (#19654)
- Add missing query location to blob cast (#19689)
- Add request timing to HTTP log (#19691)
- Fix InsertRelation on attached database (#19583), the core-side counterpart of this PR
- Log total probe matches in hash join (#19683)
- Fixup linking for LLVM (#19668)
- Categorize ParseLogMessage as CAN_THROW_RUNTIME_ERROR (#19672)
- duckdb_logs_parsed to do case-insensitive matching (#19669)
- Always remember extra_metadata_blocks when checkpointing (#19639)
- Enable running all extensions tests as part of the build step (#19631)
- Bump MySQL scanner (#19643)
- Remove FlushAll from DETACH (#19644)
- Fix #19455: correctly extract root table in MERGE INTO when running a join that contains single-sided predicates that are transformed into filters (#19637)
- Detect invalid MERGE INTO action and throw exception (#19636)
- Bump: spatial (#19620)
- Try to prevent overshooting of FILE_SIZE_BYTES by pre-emptively increasing bytes written in the Parquet writer (#19622)
- Increase cast-cost of old-style implicit cast to string (#19621)
- Improve error message around compression type deprecation/availability checks (#19619)
- Bump iceberg (#19618)
- [DevEx] Improve error message when FROM clause is omitted (#18995)
- Avoid eagerly resolving the next on-disk pointer in the MetadataReader, as that pointer might not always be valid (#19588)
- Add vortex external extension (#19580)
- WAL index deletes (#19477)
- Skip compiling remote optimizer test when TSAN is enabled (#19590)
- Fix edge case in uncompressed validity scan with offset and fix off-by-one in ArrayColumnData::Select (#19567)
- [v1.4-andium] Add Profiler output to logger interface (#19572)
- Fix inconsistent behavior in remote read_file/blob, and prevent union_by_name from crashing (#19531)
- Release relevant tests to still be run on all builds (#19559)
- Fix race condition between Append and Scan (#19571)
- Disable jemalloc on BSD (#19560)
- [ported from main] Fix bug initializing std::vector for column names (#19555)
- Bugfixes: Parquet JSON+DELTA_LENGTH_BYTE_ARRAY and sorting iterator (#19556)
- Follow up to staging move (#19551)
- [Dev] Disable the use of ZSTD if the block_manager is the InMemoryBlockManager (#19543)
- Creating separate OSX cli binaries for each arch (#19538)
- Moving staging to cf and uploading to install bucket (#19539)
- Add test that either 'latest' or 'vX.Y.Z' are supported STORAGE_VERSIONs (#19527)
- Add upcoming patch release to internal versions (#19525)
- Bump multiple extensions (#19522)
- Bump: inet (#19526)
- Support non-standard NULL in Parquet again (#19523)
- Make `DatabaseInstance::…
Fixes duckdb/duckdb#18396
Related PR in core: duckdb/duckdb#19583
The checks of this PR can only run after duckdb/duckdb#19583 lands.
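For background, the scenario this PR fixes looks roughly like the sketch below, using the duckdb Python relational API; the database file and table names are hypothetical.

```python
import duckdb

con = duckdb.connect()                     # in-memory default catalog
con.execute("ATTACH 'other.db' AS other")  # attach a second database
con.execute("CREATE TABLE other.people (name VARCHAR)")

rel = con.sql("SELECT 'Ada' AS name")
# insert_into() builds an InsertRelation; with the bug, inserting into a
# table on an attached database failed (duckdb/duckdb#18396).
rel.insert_into("other.people")
print(con.sql("SELECT name FROM other.people").fetchall())  # [('Ada',)]
```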