Fix "No bucket node map" failure when inserting into Iceberg table #14003
ebyhr wants to merge 1 commit into trinodb:master from
Conversation
Looking at the CI failure.
    @Test
    public void testInsertIntoBucketedColumnWhenTaskWriterCountIsGreaterThanOrEqualToNodeCount()
You're testing the greater than case (or equal case), not both
    int taskWriterCount = 4;
    assertThat(taskWriterCount).isGreaterThanOrEqualTo(getQueryRunner().getNodeCount());
Be explicit which situation you're testing (equal, or greater than)
    int taskWriterCount = 4;
    assertThat(taskWriterCount).isGreaterThanOrEqualTo(getQueryRunner().getNodeCount());
    Session session = Session.builder(getSession())
            .setSystemProperty("task_writer_count", String.valueOf(taskWriterCount))
TASK_WRITER_COUNT is a public constant, you can use it here
            .setSystemProperty("task_writer_count", String.valueOf(taskWriterCount))
            .build();
    String tableName = "test_inserting_into_bucketed_column_when_task_writer_count_is_greater_than_or_equal_to_node_count_" + randomTableSuffix();
    assertUpdate("CREATE TABLE " + tableName + " (bucketed_col INT) WITH (partitioning = ARRAY['bucket(bucketed_col, 10)'])");
    assertUpdate(session, "INSERT INTO " + tableName + " VALUES (1)", 1);
Use `INSERT INTO ... SELECT nationkey FROM tpch.tiny.nation` instead;
otherwise the planner could realize it's inserting exactly one row, and could limit the writer count to 1 without talking to the connector.
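The suggested reshaping of the test could look like this (a sketch; `test_bucketed_table` stands in for the randomized table name created above, and 25 is the row count of `tpch.tiny.nation`):

```sql
-- Inserting from a source table prevents the planner from proving a
-- single-row insert and collapsing the writer count to 1.
-- test_bucketed_table is a hypothetical stand-in for the test's table name.
INSERT INTO test_bucketed_table
SELECT nationkey FROM tpch.tiny.nation;
```

In the Java test this would be `assertUpdate(session, "INSERT INTO " + tableName + " SELECT nationkey FROM tpch.tiny.nation", 25)`.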
Also worth adding cases for CTAS, UPDATE, DELETE, and MERGE.
        return Optional.empty();
    }

        return Optional.of(createBucketNodeMap(nodeManager.getRequiredWorkerNodes().size()));
I don't think I understand the change.
Is ConnectorNodePartitioningProvider.getBucketNodeMapping mandatory to implement?
@electrum's 3207925 (part of #7933) suggests it should be optional to implement this method.
If it's optional, do we have a bug in the engine that manifests only when this method is not implemented?
If so, shouldn't the fix be in the engine?
This should be a bug in the engine. I believe this implementation will break MERGE.
Implementing the method in this way causes MERGE to fail with:

    Insert and update layout have mismatched BucketNodeMap

which is why we made it optional to implement this method. We need to track down why `task_writer_count` causes the query to fail; none of the existing integration tests caught this case.
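To make the contract being debated concrete, here is a minimal, self-contained analog. This is not Trino's actual SPI: the names `PartitioningProvider` and `getBucketToNode` are hypothetical. The point is that an optional-to-implement method defaults to `Optional.empty()`, so the caller (the engine, in Trino's case) must supply its own fallback rather than failing with "No bucket node map".

```java
import java.util.Optional;

// Hypothetical stand-in for an SPI method that is optional to implement.
// The default returns Optional.empty(); callers must handle that case.
interface PartitioningProvider
{
    default Optional<int[]> getBucketToNode(int bucketCount)
    {
        return Optional.empty(); // connector defers the mapping to the engine
    }
}

public class OptionalSpiDemo
{
    public static void main(String[] args)
    {
        // A connector that does not override the optional method:
        PartitioningProvider connector = new PartitioningProvider() {};

        // The engine-side caller supplies a fallback instead of failing:
        int[] mapping = connector.getBucketToNode(10)
                .orElseGet(() -> new int[10]); // engine-chosen default mapping
        System.out.println(mapping.length); // prints 10
    }
}
```

Under this reading, a connector returning `Optional.empty()` is valid, and a failure on the empty case is an engine bug, which matches the direction of the discussion above.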
Sure, I can look at this. Thanks for writing the test, it's helpful.
    assertQuery("SELECT * FROM " + tableName, "VALUES 1");

    assertUpdate("DROP TABLE " + tableName);
    }
BTW the problem doesn't look Iceberg-specific. Should this be part of BCT (BaseConnectorTest)?
Do we have a way of creating bucketed tables in BCT? I think only Hive and Iceberg would support this.
Is the problem about bucketed tables only, or are "plainly partitioned" tables also affected?
Anyway, I hear you on the test-setup challenge making this hard to test in BCT.
Description
Fixes #13960
Documentation
(x) No documentation is needed.
Release notes
(x) Release notes entries required with the following suggested text: