-
Notifications
You must be signed in to change notification settings - Fork 1.5k
refactor: Make register readers and writers file format agnostic #14090
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| /* | ||
| * Copyright (c) Facebook, Inc. and its affiliates. | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| #include "velox/dwio/RegisterReaders.h" | ||
|
|
||
| #include "velox/dwio/dwrf/RegisterDwrfReader.h" | ||
| #include "velox/dwio/orc/reader/OrcReader.h" | ||
| #include "velox/dwio/text/RegisterTextReader.h" | ||
| #ifdef VELOX_ENABLE_PARQUET | ||
| #include "velox/dwio/parquet/RegisterParquetReader.h" | ||
| #endif | ||
|
|
||
| namespace facebook::velox::dwio { | ||
|
|
||
| void registerReaderFactories() { | ||
| dwrf::registerDwrfReaderFactory(); | ||
| orc::registerOrcReaderFactory(); | ||
| text::registerTextReaderFactory(); | ||
| #ifdef VELOX_ENABLE_PARQUET | ||
| parquet::registerParquetReaderFactory(); | ||
| #endif | ||
| } | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would you mind sharing the specific issue you're experiencing with the separate registrations? I assume Spark SQL typically does not register the DWRF readers and writers by default, and prior to this change, it was possible to register only the necessary components as needed.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @rui-mo Thanks for reviewing! There are a few motivations:
That way we don't have to introduce new tests specifically for Parquet or new file format, but just turning the tests to TEST_P tests that can automatically iterate through the file format collection.
I added a paragraph in the design doc and the PR message to explain why this is a good thing: #14514
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @rui-mo Also, the single file format registry is still supported. So there is no backward compatibility issue for Gluten.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for explaining the benefits. Regarding the first point, I find |
||
|
|
||
| void unregisterReaderFactories() { | ||
| dwrf::unregisterDwrfReaderFactory(); | ||
| orc::unregisterOrcReaderFactory(); | ||
| text::unregisterTextReaderFactory(); | ||
| #ifdef VELOX_ENABLE_PARQUET | ||
| parquet::unregisterParquetReaderFactory(); | ||
| #endif | ||
| } | ||
|
|
||
| } // namespace facebook::velox::dwio | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,25 @@ | ||
| /* | ||
| * Copyright (c) Facebook, Inc. and its affiliates. | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| #pragma once | ||
|
|
||
| namespace facebook::velox::dwio { | ||
|
|
||
| void registerReaderFactories(); | ||
|
|
||
| void unregisterReaderFactories(); | ||
|
|
||
| } // namespace facebook::velox::dwio |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you force it to be static?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@assignUser Thanks for reviewing! I recall that it forces CMake to build libvelox_dwio_registered_readers.a and help Linux system to build correctly. It is not an umbrella target (purely transitive deps, no actual objects) where "INTERFACE" is usually for. It actually does have code in RegisterReaders.cpp, so it would be either STATIC, OBJECT or SHARED. Since we want outside users to link with it so it cannot be "OBJECT" here. It is manual registration, and it is never supposed to be loaded dynamically on its own, so it won't be "SHARED" either. I believe we are not doing dynamic loading anyways, so by default it is still "STATIC", I just explicitly specified it out to make it clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, not sure how I missed this before but this should be part of the main library as it's not a utility for tests or fuzzers, so please use our wrapper functions:
velox_add_librarywithout any library type keywordvelox_link_librariesinstead oftarget_link_libraries.Sorry, this isn't well documented yet (#14941).