misc(native): Replace HiveConnectorFactory with IcebergConnectorFactory in iceberg connector#26661
Conversation
Reviewer's guide (collapsed on small PRs)Reviewer's GuideRegisters the dedicated Iceberg connector factory in native execution and updates native query runner test catalog configuration to use the Iceberg connector name for the iceberg catalog instead of always using the Hive connector name. Sequence diagram for iceberg connector usage in native executionsequenceDiagram
actor User
participant Coordinator
participant NativeWorker
participant ConnectorRegistry
participant IcebergFactory
participant IcebergConnector
Note over NativeWorker,ConnectorRegistry: Startup phase
NativeWorker->>ConnectorRegistry: registerConnectorFactories
NativeWorker->>ConnectorRegistry: register IcebergConnectorFactory
ConnectorRegistry->>IcebergFactory: store factory for connector_name iceberg
Note over User,IcebergConnector: Query execution
User->>Coordinator: submit query using catalog iceberg
Coordinator->>NativeWorker: schedule native task for catalog iceberg
NativeWorker->>ConnectorRegistry: getFactory(connector_name iceberg)
ConnectorRegistry-->>NativeWorker: IcebergConnectorFactory
NativeWorker->>IcebergFactory: create(connector_id, config)
IcebergFactory-->>NativeWorker: IcebergConnector
NativeWorker->>IcebergConnector: plan and execute reads/writes
IcebergConnector-->>NativeWorker: data pages
NativeWorker-->>Coordinator: query results
Coordinator-->>User: final result set
Class diagram for updated iceberg connector factory registrationclassDiagram
class ConnectorFactory {
<<interface>>
+string getName()
+Connector create(string connectorId, map configuration)
}
class HiveConnectorFactory {
+string name
+HiveConnectorFactory()
+string getName()
+Connector create(string connectorId, map configuration)
}
class IcebergConnectorFactory {
+string name
+IcebergConnectorFactory()
+string getName()
+Connector create(string connectorId, map configuration)
}
class IcebergConnector {
+string connectorId
+IcebergConnector(string connectorId, map configuration)
+SplitManager getSplitManager()
+PageSourceProvider getPageSourceProvider()
}
class Registration {
+void registerConnectorFactories()
}
class ConnectorRegistry {
+void registerFactory(ConnectorFactory factory)
+ConnectorFactory getFactory(string connectorName)
}
ConnectorFactory <|.. HiveConnectorFactory
ConnectorFactory <|.. IcebergConnectorFactory
IcebergConnectorFactory --> IcebergConnector
Registration --> ConnectorRegistry : calls
Registration --> IcebergConnectorFactory : creates and registers
Registration --> HiveConnectorFactory : creates and registers
ConnectorRegistry --> ConnectorFactory : stores
ConnectorRegistry --> IcebergConnectorFactory : returns for iceberg
ConnectorRegistry --> HiveConnectorFactory : returns for hive
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
cb839da to
c202d2b
Compare
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- In
Registration.cpp, theIcebergConnectorFactoryis now registered without explicitly usingkIcebergConnectorName; consider ensuring that the factory’s internal name matches this constant (or wiring the constant through if supported) so that existing catalog configurations relying on that name remain consistent. - In
PrestoNativeQueryRunnerUtils, the connector name for Iceberg is hard-coded as the string "iceberg"; consider referencing a shared constant (e.g., the same value askIcebergConnectorName) to avoid subtle drift between test setup and production connector registration.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `Registration.cpp`, the `IcebergConnectorFactory` is now registered without explicitly using `kIcebergConnectorName`; consider ensuring that the factory’s internal name matches this constant (or wiring the constant through if supported) so that existing catalog configurations relying on that name remain consistent.
- In `PrestoNativeQueryRunnerUtils`, the connector name for Iceberg is hard-coded as the string "iceberg"; consider referencing a shared constant (e.g., the same value as `kIcebergConnectorName`) to avoid subtle drift between test setup and production connector registration.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
c202d2b to
cdb113e
Compare
|
@PingLiuPing : Thanks for this code. The code looks good, but you will need to fix all the errors from Advancing Velox. PTAL. |
Thanks, I checked the failure pipeline. We should wait facebookincubator/velox#15658 to get merged first. Then the CI should pass. |
BryanCutler
left a comment
There was a problem hiding this comment.
Looks good pending tests, just one suggestion about the testing worker launcher
|
|
||
| Path catalogDirectoryPath = tempDirectoryPath.resolve("catalog"); | ||
| Files.createDirectory(catalogDirectoryPath); | ||
| String connectorName = catalogName.equals("iceberg") ? "iceberg" : "hive"; |
There was a problem hiding this comment.
I think it would be clearer to add another argument for connectorName when calling getExternalWorkerLauncher
There was a problem hiding this comment.
Thank you @BryanCutler , by checking the source further I think we can just use catalogName as the connector name.
There was a problem hiding this comment.
Yes, I could see catalogName is the same in this case, I just meant it's not very clear until you go through the source and see what values are used for the test catalogs.
There was a problem hiding this comment.
@BryanCutler Thanks, I added connectorName. Would you have a look?
e7794d3 to
ac7417e
Compare
hantangwangd
left a comment
There was a problem hiding this comment.
@PingLiuPing Thanks for this change. As I see, writing data to an Iceberg table would fail without this change if we update to the latest velox version, is that right? Just a bit curious why the TestUnpartitionedWrite isn't failing in the CI tests in other PRs. Do you have any insight on this?
Thanks, let me have a look. |
|
@hantangwangd I just verified that without this PR, |
ac7417e to
05fad56
Compare
|
@PingLiuPing Thanks for the explanation. It's a little curious to me that the CI tests didn't catch a failure in Should we add these test classes to our CI flow? Or do you have any relevant context on this? |
|
@hantangwangd Thanks. Should I add those tests to CI in this PR or do you prefer add them to CI in a separate PR? |
|
@PingLiuPing If you don't mind, I think it would be better to add them directly in this PR. |
@hantangwangd I have added those tests to CI. |
|
The error relates to disk space. Seems #26876 can resolve the issue. |
938b581 to
1fb63db
Compare
hantangwangd
left a comment
There was a problem hiding this comment.
@PingLiuPing thanks for the fix, lgtm!
Description
Use iceberg connector factory for iceberg connector.
Change Iceberg NativeQueryRunner catalog property.
Motivation and Context
Impact
Test Plan
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.