Avoid direct usage of DictionaryBlock constructor #17842
raunaqmorarka wants to merge 1 commit into trinodb:master
This constructor is used in multiple places inside DictionaryBlock, e.g. copyPositions, which can cause the dictionary to be RLE.
If we really want to protect ourselves from this case we need a precondition inside this constructor
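A minimal sketch of what such a precondition, combined with an unwrapping factory, could look like. The types below (`Block`, `LongArrayBlock`, `RleBlock`, `DictionaryBlock`, `create`) are simplified stand-ins written for this illustration and are assumptions, not Trino's actual classes or signatures.

```java
// Simplified stand-ins for illustration only; these are NOT Trino's real classes.
final class DictionaryFactorySketch {
    interface Block {
        int getPositionCount();
        long getLong(int position);
    }

    // A plain block that stores its values directly.
    static final class LongArrayBlock implements Block {
        private final long[] values;
        LongArrayBlock(long... values) { this.values = values.clone(); }
        public int getPositionCount() { return values.length; }
        public long getLong(int position) { return values[position]; }
    }

    // A run-length encoded block: one value repeated positionCount times.
    static final class RleBlock implements Block {
        final Block value;
        private final int positionCount;
        RleBlock(Block value, int positionCount) {
            if (value.getPositionCount() != 1) {
                throw new IllegalArgumentException("RLE value must have exactly one position");
            }
            this.value = value;
            this.positionCount = positionCount;
        }
        public int getPositionCount() { return positionCount; }
        public long getLong(int position) { return value.getLong(0); }
    }

    static final class DictionaryBlock implements Block {
        final Block dictionary;
        final int[] ids;
        // The precondition: the constructor itself rejects wrapped dictionaries,
        // so no internal code path can build one by accident.
        DictionaryBlock(Block dictionary, int[] ids) {
            if (dictionary instanceof RleBlock || dictionary instanceof DictionaryBlock) {
                throw new IllegalArgumentException("dictionary must not be an RLE or dictionary block");
            }
            this.dictionary = dictionary;
            this.ids = ids.clone();
        }
        public int getPositionCount() { return ids.length; }
        public long getLong(int position) { return dictionary.getLong(ids[position]); }
    }

    // Hypothetical factory that unwraps before constructing.
    static Block create(Block dictionary, int[] ids) {
        if (dictionary instanceof RleBlock) {
            // Every id points at the same value, so the result is just a new RLE.
            return new RleBlock(((RleBlock) dictionary).value, ids.length);
        }
        if (dictionary instanceof DictionaryBlock) {
            // Compose the two id mappings and point at the inner dictionary.
            DictionaryBlock inner = (DictionaryBlock) dictionary;
            int[] remapped = new int[ids.length];
            for (int i = 0; i < ids.length; i++) {
                remapped[i] = inner.ids[ids[i]];
            }
            return create(inner.dictionary, remapped);
        }
        return new DictionaryBlock(dictionary, ids);
    }
}
```

With this shape, callers that go through `create` never see a nested dictionary, and any caller that bypasses it fails fast in the constructor.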
ah right, I missed the usages within this class, we need some more changes
After looking closer at the other usages, the only way we found to make RLE dictionary out of non-RLE dictionary through get/copyPositions is to use a BlockBuilder as the dictionary. Although it's technically allowed, I'm not sure we need to worry about that. It would be odd to create a DictionaryBlock with a BlockBuilder as the dictionary rather than a pre-built Block.
This ensures that dictionaries with RLE or another dictionary block inside them are unwrapped in all code paths
Force-pushed from 3ebbc4c to c4f9f32
lukasz-stec
left a comment
It would be better for all DictionaryBlock construction to go through the createInternal method, but this is a step in that direction.
    Page output = lookupJoinPageBuilder.build(probe);
    assertEquals(output.getChannelCount(), 2);
    assertTrue(output.getBlock(0) instanceof DictionaryBlock);
    assertTrue(output.getBlock(0) instanceof LongArrayBlock);
The changes look ok to me, although the test failures seem related and suggest that something subtle is going on. However, the commit message isn't accurate because |
sopel39
left a comment
lgtm % there are test failures
    Block block = DictionaryBlock.create(1, 2, dictionary, new int[] {1, 3, 6, 8});
    assertThat(block).isInstanceOf(DictionaryBlock.class);
    assertThat(block.getPositionCount()).isEqualTo(2);
    assertThat(block.getSlice(0, 0, block.getSliceLength(0))).isEqualTo(expectedValues[3]);
    JoinProbe probe = joinProbeFactory.createJoinProbe(page);
    Page output = lookupJoinPageBuilder.build(probe);
    assertEquals(output.getChannelCount(), 2);
    assertTrue(output.getBlock(0) instanceof DictionaryBlock);
    ColumnarRow columnarRow = toColumnarRow(rowIds);
    checkArgument(!columnarRow.mayHaveNull(), "The rowIdsRowBlock may not have null rows");
    int positionCount = rowIds.getPositionCount();
    if (positionCount == 0) {
Move this check before toColumnarRow(rowIds).
    if (positionCount == 0) {
        return EMPTY_PAGE;
    }
    checkArgument(!columnarRow.mayHaveNull(), "The rowIdsRowBlock may not have null rows");
dain
left a comment
I have a full rewrite of blocks like these queued up behind the group by hash PR. In the rewrite I introduce the concept of a value block: all blocks other than dictionary, RLE, and lazy. Then I update dictionary and RLE so that they can only contain value blocks, and then I change most of the code base to unwrap non-value blocks before invoking functions (on the outside of the loops).
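As a rough illustration of the value-block idea described above, the sketch below restricts dictionary and RLE wrappers to a plain value block through their field types and unwraps once, outside the loop. All names here (`ValueBlockSketch`, `LongArrayBlock`, `RleBlock`, `DictionaryBlock`, `sum`) are hypothetical stand-ins and not the API of the rewrite branch; lazy blocks are left out for brevity.

```java
// Simplified stand-ins for illustration only; these are NOT Trino's real classes.
final class ValueBlockSketch {
    interface Block {
        int getPositionCount();
    }

    // The only "value block" in this sketch: it holds its data directly.
    static final class LongArrayBlock implements Block {
        final long[] values;
        LongArrayBlock(long... values) { this.values = values.clone(); }
        public int getPositionCount() { return values.length; }
    }

    // RLE may only wrap a value block; the field type enforces that.
    static final class RleBlock implements Block {
        final LongArrayBlock value;
        final int positionCount;
        RleBlock(LongArrayBlock value, int positionCount) {
            if (value.getPositionCount() != 1) {
                throw new IllegalArgumentException("RLE value must have exactly one position");
            }
            this.value = value;
            this.positionCount = positionCount;
        }
        public int getPositionCount() { return positionCount; }
    }

    // A dictionary may only wrap a value block as well.
    static final class DictionaryBlock implements Block {
        final LongArrayBlock dictionary;
        final int[] ids;
        DictionaryBlock(LongArrayBlock dictionary, int[] ids) {
            this.dictionary = dictionary;
            this.ids = ids.clone();
        }
        public int getPositionCount() { return ids.length; }
    }

    // Unwrap the wrapper once, then run a tight loop over the raw values.
    static long sum(Block block) {
        if (block instanceof RleBlock) {
            RleBlock rle = (RleBlock) block;
            return rle.value.values[0] * rle.positionCount;
        }
        if (block instanceof DictionaryBlock) {
            DictionaryBlock dictionaryBlock = (DictionaryBlock) block;
            long[] values = dictionaryBlock.dictionary.values;
            long total = 0;
            for (int id : dictionaryBlock.ids) {
                total += values[id];
            }
            return total;
        }
        long total = 0;
        for (long value : ((LongArrayBlock) block).values) {
            total += value;
        }
        return total;
    }
}
```

Because a wrapper can hold at most one level of indirection in this model, the unwrapping step is a single type check per call rather than a per-position dispatch.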
I suggest we don't merge these changes, as they will just conflict and get undone by the changes in my branch.
Sure, I had put this aside anyway. There were some unexplained test failures which I didn't have the bandwidth to look into, and the original problem we wanted to solve was tackled by #17844.
Description
This ensures that dictionaries with RLE or another dictionary block inside them are unwrapped in all code paths
Additional context and related issues
Release notes
(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: