Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions velox/connectors/Connector.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,19 @@ bool registerConnectorFactory(std::shared_ptr<ConnectorFactory> factory) {
return true;
}

bool registerConnectorFactoryAlias(
const std::string& alias,
const std::string& connectorId) {
auto factory = getConnectorFactory(connectorId);
bool ok = connectorFactories().insert({alias, factory}).second;
VELOX_CHECK(
ok,
"Alias '{}' is already registered for factory '{}'",
alias,
connectorId);
return true;
}

std::shared_ptr<ConnectorFactory> getConnectorFactory(
const std::string& connectorName) {
auto it = connectorFactories().find(connectorName);
Expand Down
7 changes: 7 additions & 0 deletions velox/connectors/Connector.h
Original file line number Diff line number Diff line change
Expand Up @@ -421,6 +421,13 @@ class ConnectorFactory {
/// FB_ANONYMOUS_VARIABLE.
bool registerConnectorFactory(std::shared_ptr<ConnectorFactory> factory);

/// Adds an alias for an existing factory. Throws if factory with the
/// connectorName is not already present. Throws if the alias for the
/// connectorName is already present.
bool registerConnectorFactoryAlias(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it would be better to change ConnectorFactory API to allow for aliases. Perhaps, add

const std::vector<std::string>& aliases() const;

method.

Or, modify registerConnectorFactory API to add an optional 'aliases' parameter. That would be similar to aliases allowed when registering scalar functions.

Copy link
Copy Markdown
Collaborator Author

@majetideepak majetideepak Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

| I'm wondering if it would be better to change ConnectorFactory API to allow for aliases.

@mbasmanova This was my initial implementation. But we register the Velox connectors in the Velox library.
So an application such as Prestissimo would have to manually change the Velox code to include additional aliases. For example, to add "iceberg", we have to manually modify the alias list.
I then added this API so that we can add aliases in Prestissimo without having to change Velox code.

Copy link
Copy Markdown
Collaborator Author

@majetideepak majetideepak Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the commit that is similar to your approach a30aa7f

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like Velox auto-registers connector factories and this is problematic. Let's change that to require applications to explicitly register what they need.

const std::string& alias,
const std::string& connectorName);

/// Returns a factory for creating connectors with the specified name. Throws if
/// factory doesn't exist.
std::shared_ptr<ConnectorFactory> getConnectorFactory(
Expand Down
3 changes: 1 addition & 2 deletions velox/connectors/hive/HiveConnector.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -220,6 +220,5 @@ void registerHivePartitionFunctionSerDe() {
}

VELOX_REGISTER_CONNECTOR_FACTORY(std::make_shared<HiveConnectorFactory>())
VELOX_REGISTER_CONNECTOR_FACTORY(
std::make_shared<HiveHadoop2ConnectorFactory>())

} // namespace facebook::velox::connector::hive
11 changes: 0 additions & 11 deletions velox/connectors/hive/HiveConnector.h
Original file line number Diff line number Diff line change
Expand Up @@ -83,14 +83,9 @@ class HiveConnector : public Connector {
class HiveConnectorFactory : public ConnectorFactory {
public:
static constexpr const char* FOLLY_NONNULL kHiveConnectorName = "hive";
static constexpr const char* FOLLY_NONNULL kHiveHadoop2ConnectorName =
"hive-hadoop2";

HiveConnectorFactory() : ConnectorFactory(kHiveConnectorName) {}

explicit HiveConnectorFactory(const char* FOLLY_NONNULL connectorName)
: ConnectorFactory(connectorName) {}

/// Register HiveConnector components such as Dwrf, Parquet readers and
/// writers and FileSystems.
void initialize() override;
Expand All @@ -103,12 +98,6 @@ class HiveConnectorFactory : public ConnectorFactory {
}
};

class HiveHadoop2ConnectorFactory : public HiveConnectorFactory {
public:
HiveHadoop2ConnectorFactory()
: HiveConnectorFactory(kHiveHadoop2ConnectorName) {}
};

class HivePartitionFunctionSpec : public core::PartitionFunctionSpec {
public:
HivePartitionFunctionSpec(
Expand Down
1 change: 1 addition & 0 deletions velox/connectors/hive/tests/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ add_executable(
HivePartitionFunctionTest.cpp
FileHandleTest.cpp
HivePartitionUtilTest.cpp
HiveConnectorFactoryTest.cpp
HiveConnectorTest.cpp
HiveConnectorUtilTest.cpp
HiveConnectorSerDeTest.cpp
Expand Down
48 changes: 48 additions & 0 deletions velox/connectors/hive/tests/HiveConnectorFactoryTest.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include <gtest/gtest.h>

#include "velox/common/base/tests/GTestUtils.h"
#include "velox/connectors/Connector.h"

namespace facebook::velox::connector::hive {
namespace {

TEST(HiveConnectorFactoryTest, aliases) {
auto factory = connector::getConnectorFactory("hive");
ASSERT_EQ(factory->connectorName(), "hive");

registerConnectorFactoryAlias("hive-hadoop2", factory->connectorName());
registerConnectorFactoryAlias("iceberg", factory->connectorName());

ASSERT_EQ(
connector::getConnectorFactory("hive-hadoop2")->connectorName(),
factory->connectorName());
ASSERT_EQ(
connector::getConnectorFactory("iceberg")->connectorName(),
factory->connectorName());

VELOX_ASSERT_THROW(
registerConnectorFactoryAlias("iceberg", factory->connectorName()),
"Alias 'iceberg' is already registered for factory 'hive'");
VELOX_ASSERT_THROW(
registerConnectorFactoryAlias("foo", "hivefoo"),
"ConnectorFactory with name 'hivefoo' not registered");
}

} // namespace
} // namespace facebook::velox::connector::hive