GH-34078: [C++][Parquet] Minor API improvements for BloomFilter #33995
@pitrou @wjones127 PTAL
cpp/src/parquet/bloom_filter.h
Outdated
  // Maximum Bloom filter size, it sets to HDFS default block size 128MB
  // This value will be reconsidered when implementing Bloom filter producer.
- static constexpr uint32_t kMaximumBloomFilterBytes = 128 * 1024 * 1024;
+ static constexpr uint64_t kMaximumBloomFilterBytes = 128 * 1024 * 1024;
Should the return type of OptimalNumOfBits be changed to uint64_t as well?
Keeping it as uint32_t is fine for now, since it can represent up to 4G. As a bit count, that is 512 MiB, which is 4 times kMaximumBloomFilterBytes.
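The headroom argument can be checked at compile time. The following is only a sketch: `kMaximumBloomFilterBytes` is copied from the diff above, while `MaxBytesForUint32BitCount` is a hypothetical helper introduced here for illustration and is not part of `parquet/bloom_filter.h`.

```cpp
#include <cstdint>
#include <limits>

// Value copied from the diff under review (128 MiB).
constexpr uint64_t kMaximumBloomFilterBytes = 128 * 1024 * 1024;

// Hypothetical helper (an assumption, not an Arrow API): the number of bytes
// covered by the 2^32 distinct bit positions a uint32_t can index.
constexpr uint64_t MaxBytesForUint32BitCount() {
  return (static_cast<uint64_t>(std::numeric_limits<uint32_t>::max()) + 1) / 8;
}

// 2^32 bits is 2^29 bytes, i.e. 512 MiB.
static_assert(MaxBytesForUint32BitCount() == 512ull * 1024 * 1024,
              "uint32_t bit count covers 512 MiB");

// That is 4 times the 128 MiB cap.
static_assert(MaxBytesForUint32BitCount() / kMaximumBloomFilterBytes == 4,
              "headroom over the cap is 4x");

// The cap itself, expressed in bits (2^30), fits in a uint32_t.
static_assert(kMaximumBloomFilterBytes * 8 <=
                  std::numeric_limits<uint32_t>::max(),
              "maximum filter size in bits fits in uint32_t");
```

So a bit count for any filter up to the 128 MiB cap fits in uint32_t, which is why the return type of OptimalNumOfBits does not need to change in this PR.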
Could you open a new issue for this?
@mapleFU Can you take a look at the CI failures? Thanks!
That's my fault. I didn't add |
c6144bc to 52f1281
No idea why these two tests still failed... The error messages are too confusing for me...
Oh, it seems #34038 causes the error... Let's wait for it to be merged...
@pitrou It seems all tests have passed; mind taking a look?
I've taken the issue now
Benchmark runs are scheduled for baseline = 9cb6fd6 and contender = d512dd2. d512dd2 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Rationale for this change
OptimalNumOfBytes, because OptimalNumOfBits is confusing...
MemoryPool as input argument
What changes are included in this PR?
Are these changes tested?
They're already tested...
Are there any user-facing changes?
No. (But users may have misused BloomFilter::Init previously.)