Add Kafka Transport support in Trino Open Lineage Plugin #22998
alprusty wants to merge 1 commit into trinodb:master
import java.util.Properties;

public class OpenLineageKafkaTransport
There was a problem hiding this comment.
Could we add test like TestOpenLineageEventListenerMarquezIntegration.java but with Kafka?
See also #22888 for similar test with Kafka cc @marton-bod
There was a problem hiding this comment.
+1 for integration tests with kafka
<failOnWarning>false</failOnWarning>
<ignoredUnusedDeclaredDependencies>
    <!-- kafka-clients is needed on class path for KafkaTransport to work in openlineage-java lib -->
    <ignoredUnusedDeclaredDependency>kafka-clients:::</ignoredUnusedDeclaredDependency>
There was a problem hiding this comment.
Can we declare it explicitly as a runtime dependency?
private final String acks = "acks";
private final String acksDefault = "all";
private final String keySerializer = "key.serializer";
private final String keySerializerDefault = "org.apache.kafka.common.serialization.StringSerializer";
private final String valueSerializer = "value.serializer";
private final String valueSerializerDefault = "org.apache.kafka.common.serialization.StringSerializer";
There was a problem hiding this comment.
How about making them static final variables?
There was a problem hiding this comment.
Yup. Either static final variables, or configurable via OpenLineageClientKafkaTransportConfig if some of them need to be changed (probably not).
There was a problem hiding this comment.
I think all of these producer configuration options should be configurable via OpenLineageClientKafkaTransportConfig, with the said defaults. These are also often required:
security.protocol
ssl.keystore.location
ssl.keystore.password
ssl.truststore.location
ssl.truststore.password
There was a problem hiding this comment.
For the config keys we could use something like:
kafkaProperties.put(BOOTSTRAP_SERVERS_CONFIG, brokerEndpoints);
kafkaProperties.put(ACKS_CONFIG, acksDefault);
kafkaProperties.put(KEY_SERIALIZER_CLASS_CONFIG, keySerializerDefault);
kafkaProperties.put(VALUE_SERIALIZER_CLASS_CONFIG, valueSerializerDefault);
There was a problem hiding this comment.
In case of a custom protocol, we might need to capture them as a separate config.
There was a problem hiding this comment.
Like KafkaSslConfig in Trino
There was a problem hiding this comment.
Yeah, but I would really avoid keeping defaults hardcoded without a way for a user to change them; KafkaTransportConfig would be more flexible.
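As a rough sketch of how the suggestions in this thread could fit together: keys and defaults become static final constants, and user-supplied producer properties (for example security.protocol or the ssl.* settings) override the defaults. The class name KafkaProducerProperties, the create method, and the overrides map are assumptions made for illustration and are not part of the PR; the string keys are the standard Kafka producer property names.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not taken from the PR: builds Kafka producer properties
// from constant defaults plus user-supplied overrides.
final class KafkaProducerProperties
{
    // Standard Kafka producer configuration keys
    static final String BOOTSTRAP_SERVERS = "bootstrap.servers";
    static final String ACKS = "acks";
    static final String KEY_SERIALIZER = "key.serializer";
    static final String VALUE_SERIALIZER = "value.serializer";

    // Default values as they appear in the diff under review
    static final String ACKS_DEFAULT = "all";
    static final String STRING_SERIALIZER = "org.apache.kafka.common.serialization.StringSerializer";

    private KafkaProducerProperties() {}

    static Properties create(String brokerEndpoints, Map<String, String> overrides)
    {
        Properties properties = new Properties();
        // Start from the defaults
        properties.put(ACKS, ACKS_DEFAULT);
        properties.put(KEY_SERIALIZER, STRING_SERIALIZER);
        properties.put(VALUE_SERIALIZER, STRING_SERIALIZER);
        // User-supplied settings take precedence over the defaults
        properties.putAll(overrides);
        // Broker endpoints come from their own config property, so they are applied last
        properties.put(BOOTSTRAP_SERVERS, brokerEndpoints);
        return properties;
    }
}
```

With this shape, a call such as KafkaProducerProperties.create("broker:9092", Map.of("security.protocol", "SSL")) keeps the default acks and serializers while adding the security setting. Whether overrides should be a free-form map or dedicated, validated config properties (as with KafkaSslConfig) is the design question raised above.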
    return this;
}

public String getMessageKey()
There was a problem hiding this comment.
Shouldn't we have a default value for this, given that there is no @NotNull annotation?
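To illustrate the question, one option is to model the message key as an optional value so that an unset key is explicit instead of a bare null. The sketch below is a simplified, hypothetical stand-in for the config class: the name KafkaTransportConfigSketch is invented, the airlift annotations are omitted so that the example compiles on its own, and it is not the code from the PR.

```java
import java.util.Optional;

// Hypothetical, simplified stand-in for the transport config class.
// The @Config and @ConfigDescription annotations are left out on purpose.
final class KafkaTransportConfigSketch
{
    // An absent message key is represented explicitly rather than as null
    private Optional<String> messageKey = Optional.empty();

    public Optional<String> getMessageKey()
    {
        return messageKey;
    }

    // The setter accepts a plain (possibly null) string, as a config binder would supply it
    public KafkaTransportConfigSketch setMessageKey(String messageKey)
    {
        this.messageKey = Optional.ofNullable(messageKey);
        return this;
    }
}
```

Callers can then decide what an empty key means, for example sending records without a key, instead of guarding against null.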
import io.airlift.configuration.ConfigDescription;
import jakarta.validation.constraints.NotNull;

public class OpenLineageClientKafkaTransportConfig
There was a problem hiding this comment.
Can you add a corresponding TestOpenLineageClientKafkaTransportConfig to the tests?
@Config("openlineage-event-listener.transport.kafka.topic")
@ConfigDescription("String specifying the topic to which events will be sent")
public OpenLineageClientKafkaTransportConfig setTopicName(String topicName)
There was a problem hiding this comment.
If the config is openlineage-event-listener.transport.kafka.topic, can we unify this and use topic here instead of topicName?
    <artifactId>jakarta.validation-api</artifactId>
</dependency>

<dependency>
There was a problem hiding this comment.
Add Kafka as one of the OL transport types
Please avoid using abbreviations.
}

@NotNull
public String getBrokerEndpoints()
There was a problem hiding this comment.
Please swap the order of the getters/setters for messageKey and brokerEndpoints so that it is consistent with the field order.
    return topicName;
}

@Config("openlineage-event-listener.transport.kafka.topic")
There was a problem hiding this comment.
I would recommend renaming transport.kafka to kafka-transport. Same for others.
@@ -0,0 +1,66 @@
/*
There was a problem hiding this comment.
Please add a new test to TestOpenLineagePlugin with Kafka transport.
CONSOLE,
HTTP,
/**/
KAFKA
There was a problem hiding this comment.
Suggested change:
- KAFKA
+ KAFKA,
+ /**/
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package io.trino.plugin.openlineage.config.kafka;
There was a problem hiding this comment.
Please use io.trino.plugin.openlineage.transport.kafka package instead. We usually don't add a new package just for config classes.
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

Thanks @mosabua for following up here.

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.

I am reopening and adding the stale-ignore label under the assumption that you will continue this work, @alprusty.

This pull request has gone a while without any activity. Tagging for triage help: @mosabua

Closing as a stale PR. Please feel free to reopen if you continue the work.
Description
Currently, the Trino OpenLineage plugin supports the Console and HTTP transports. This change adds Kafka transport support.
Additional context and related issues
Fixes #21599
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: