Support Delta connector #538

Open
ruslanguns opened this issue Feb 12, 2024 · 2 comments

Comments

@ruslanguns

Docs: https://trino.io/docs/current/connector/delta-lake.html

I was able to set up a Delta connector with the following workaround 👇🏻:

  1. Established a TrinoCatalog using Hive as the connector to add S3 connection capabilities, necessary for accessing Delta tables stored in S3.
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  namespace: infrastructure
  labels:
    trino: trino
spec:
  connector:
    hive:
      metastore:
        configMap: hive
      s3:
        reference: spark-s3-connection
  configOverrides:
    hive.metastore.username: delta

This gives the Trino pods their S3 connection settings and credentials, which the Delta configuration below reuses.

  2. Configured a ConfigMap named trino-delta-properties with settings for the Delta connector. This configuration mirrors the Hive setup but excludes hive.security=allow-all due to compatibility issues. It also introduces a Delta-specific setting to enable table registration from S3 storage.
apiVersion: v1
kind: ConfigMap
metadata:
  name: trino-delta-properties
  namespace: infrastructure
data:
  delta.properties: |
    connector.name=delta_lake
    hive.metastore.uri=thrift://hive.default.svc.cluster.local:9083
    hive.metastore.username=delta
    hive.s3.aws-access-key=${ENV:CATALOG_HIVE_HIVE_S3_AWS_ACCESS_KEY}
    hive.s3.aws-secret-key=${ENV:CATALOG_HIVE_HIVE_S3_AWS_SECRET_KEY}
    hive.s3.endpoint=http://s3
    hive.s3.path-style-access=true
    hive.s3.ssl.enabled=false
    delta.register-table-procedure.enabled=true

This configuration adds the parameter delta.register-table-procedure.enabled=true, which is required to register Delta tables that already exist in the S3 object storage.

  3. Adjusted podOverrides for both worker and coordinator to mount the ConfigMap. However, this mount replaces the entire /stackable/config/catalog directory, which removes the hive.properties file generated by the TrinoCatalog. That is fine if Delta is the only catalog you need, but it does not work when other catalogs must stay available.

Alternatively, use initContainers to inject the Delta connector properties into the existing configuration directory without overwriting the files already there.

podOverrides:
  spec:
    containers:
      - name: trino
        volumeMounts:
          # Mounting here replaces the whole catalog directory
          - name: delta-properties
            mountPath: /stackable/config/catalog
    volumes:
      # Volume backed by the trino-delta-properties ConfigMap from step 2
      - name: delta-properties
        configMap:
          name: trino-delta-properties
roleGroups:
  default:
    replicas: 1
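
Another way to avoid replacing the directory, sketched here as an untested assumption rather than something from this setup, is the standard Kubernetes subPath mount. It places only delta.properties into the catalog directory and leaves the other files in place. Whether this works with how the operator populates /stackable/config/catalog is not verified. Note also that subPath mounts do not pick up later changes to the ConfigMap until the pod restarts.

```yaml
# Sketch only: untested against the Stackable pod layout.
podOverrides:
  spec:
    containers:
      - name: trino
        volumeMounts:
          - name: delta-properties
            # Mount a single file instead of the whole directory
            mountPath: /stackable/config/catalog/delta.properties
            # Key inside the trino-delta-properties ConfigMap
            subPath: delta.properties
    volumes:
      - name: delta-properties
        configMap:
          name: trino-delta-properties
```
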
@sbernauer
Member

sbernauer commented Feb 14, 2024

Hi @ruslanguns! Thanks for posting your workaround. There might have been an easier way to achieve this, but I have good news: we implemented #352 two weeks ago.
So Delta is supported by our current nightly version and will be in the upcoming 24.3 release :)
Please feel free to give it a try and give feedback if it works as desired!
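
For anyone trying the native support, a catalog definition might look roughly like the sketch below. It assumes the Delta connector is exposed under a `deltaLake` key that takes the same `metastore` and `s3` fields as the `hive` connector above. The catalog name `delta` is hypothetical, and the field names should be checked against the operator documentation for your version.

```yaml
# Sketch only: the `deltaLake` key and its fields are assumed to mirror `hive`.
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: delta            # hypothetical catalog name
  namespace: infrastructure
  labels:
    trino: trino
spec:
  connector:
    deltaLake:
      metastore:
        configMap: hive
      s3:
        reference: spark-s3-connection
  configOverrides:
    hive.metastore.username: delta
    # Needed to register Delta tables that already exist in S3
    delta.register-table-procedure.enabled: "true"
```

With a catalog like this, the separate ConfigMap and the podOverrides mount from the workaround should no longer be needed.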

@ruslanguns
Author

I will definitely do it. I'm scheduling it for next week. Thanks for the response. We could close the ticket if you want.
