PARQUET-1508: [C++] Read ByteArray data directly into arrow::BinaryBuilder and BinaryDictionaryBuilder. Refactor encoders/decoders to use cleaner virtual interfaces

This patch ended up being a bit of a bloodbath, but it sorted out a number of technical debt problems. Summary:

* Add type-specific virtual encoder interfaces such as `ByteArrayEncoder` and `ByteArrayDecoder` -- this enables adding new encoder or decoder methods without conflicting with the other types. This was very hard to do before because all types shared a common template such as `PlainDecoder<ByteArrayType>`.
* Encoder and decoder implementations are now all in an `encoding.cc` compilation unit; performance should be unchanged (I will check to make sure).
* Add BYTE_ARRAY decoder methods that write into `ChunkedBinaryBuilder` or `BinaryDictionaryBuilder`. This unblocks the long-desired direct-to-categorical Parquet reads.
* Altered RecordReader to decode BYTE_ARRAY values directly into `ChunkedBinaryBuilder`. More work will be required to expose DictionaryArray reads in a sane way.

Along the way I've decided I want to eradicate all instances of `extern template class` from the codebase. It's insanely brittle, with different visibility rules in MSVC, gcc, AND clang (no kidding, gcc and clang do different things). I'll refactor the other parts of the codebase that use them later.

Author: Wes McKinney <[email protected]>
Author: Uwe L. Korn <[email protected]>

Closes #3492 from wesm/PARQUET-1508 and squashes the following commits:

df1bfc0 <Wes McKinney> lint
f3fadcb <Uwe L. Korn> Update cpp/src/parquet/arrow/record_reader.cc
4bafc55 <Wes McKinney> Fix public compile definition on windows
c4fcf74 <Wes McKinney> verbose makefile
06e2c23 <Wes McKinney> lint
daac8a6 <Wes McKinney> Delete a couple commented-out methods, add code comments about unimplemented DecodeArrowNonNull method for DictionaryBuilder
453ecbd <Wes McKinney> Refactor encoder and decoder classes to facilitate type-level extensibility
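
The first summary item, replacing one shared decoder template with type-specific virtual interfaces, can be illustrated with a minimal sketch. Everything below is a simplified assumption for illustration and is not the code in this patch: the class shapes, the `DecodeInto` method, the `EncodePlain` helper, and the use of `std::vector<std::string>` as a stand-in for an Arrow builder are all hypothetical. The only detail taken from the Parquet format is that PLAIN-encoded BYTE_ARRAY values are stored as a 4-byte little-endian length followed by the bytes.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified value type (not the Parquet C++ definition).
struct ByteArray {
  uint32_t len = 0;
  const uint8_t* ptr = nullptr;
};

// Base interface shared by all decoders.
class Decoder {
 public:
  virtual ~Decoder() = default;
  virtual void SetData(const uint8_t* data, int64_t size) = 0;
};

// Type-specific interface for INT32: only the methods INT32 needs.
class Int32Decoder : public Decoder {
 public:
  virtual int Decode(int32_t* out, int max_values) = 0;
};

// Type-specific interface for BYTE_ARRAY. A method such as DecodeInto can be
// added here without forcing every other physical type to implement it.
class ByteArrayDecoder : public Decoder {
 public:
  virtual int Decode(ByteArray* out, int max_values) = 0;
  // Hypothetical method: append decoded values straight into a sink
  // (a std::vector stands in for a binary builder).
  virtual int DecodeInto(std::vector<std::string>* sink, int max_values) = 0;
};

// PLAIN decoder: each value is a 4-byte little-endian length plus the bytes.
class PlainByteArrayDecoder : public ByteArrayDecoder {
 public:
  void SetData(const uint8_t* data, int64_t size) override {
    data_ = data;
    remaining_ = size;
  }

  int Decode(ByteArray* out, int max_values) override {
    int n = 0;
    while (n < max_values && Next(&out[n])) ++n;
    return n;
  }

  int DecodeInto(std::vector<std::string>* sink, int max_values) override {
    int n = 0;
    ByteArray value;
    while (n < max_values && Next(&value)) {
      sink->emplace_back(reinterpret_cast<const char*>(value.ptr), value.len);
      ++n;
    }
    return n;
  }

 private:
  // Reads one value; returns false when the buffer is exhausted or truncated.
  bool Next(ByteArray* value) {
    if (remaining_ < 4) return false;
    const uint32_t len = static_cast<uint32_t>(data_[0]) |
                         (static_cast<uint32_t>(data_[1]) << 8) |
                         (static_cast<uint32_t>(data_[2]) << 16) |
                         (static_cast<uint32_t>(data_[3]) << 24);
    if (remaining_ - 4 < static_cast<int64_t>(len)) return false;
    value->len = len;
    value->ptr = data_ + 4;
    data_ += 4 + len;
    remaining_ -= 4 + static_cast<int64_t>(len);
    return true;
  }

  const uint8_t* data_ = nullptr;
  int64_t remaining_ = 0;
};

// Test helper (hypothetical): PLAIN-encode strings into a byte buffer.
std::vector<uint8_t> EncodePlain(const std::vector<std::string>& values) {
  std::vector<uint8_t> out;
  for (const std::string& v : values) {
    const uint32_t len = static_cast<uint32_t>(v.size());
    for (int shift = 0; shift < 32; shift += 8) {
      out.push_back(static_cast<uint8_t>(len >> shift));
    }
    out.insert(out.end(), v.begin(), v.end());
  }
  return out;
}
```

With a single class template shared by every physical type, a method like `DecodeInto` would have to exist, or be specialized away, for all instantiations. In this sketch it lives only on `ByteArrayDecoder`, so callers that hold a `ByteArrayDecoder*` can use it while `Int32Decoder` stays untouched.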
Showing 42 changed files with 1,918 additions and 1,189 deletions.