Investigate dynamic JDBC driver provisioning #391
Comments
@razvan Also, a custom image with the provided postgresql jar didn't help...
And I see this in the Spark UI, but I am still getting the error.

@razvan Maybe it's because my curl runs on top of the entrypoint layer? I mean, maybe it should be at the same level as here: https://github.com/stackabletech/docker-images/blob/main/spark-k8s/Dockerfile#L67?
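For reference, a minimal sketch of what that same-layer approach could look like. The base image tag and driver version here are assumptions, not taken from this thread; the idea is to drop the jar into Spark's default jars directory during the build:

```dockerfile
# Sketch only: extend the Stackable Spark image with the PostgreSQL JDBC driver.
# Base image tag and driver version are assumptions, adjust to your setup.
FROM docker.stackable.tech/stackable/spark-k8s:3.5.1-stackable24.3.0

# Placing the driver directly in /stackable/spark/jars puts it on the default
# classpath for both driver and executors, no userClassPathFirst needed.
RUN curl -fsSL -o /stackable/spark/jars/postgresql-42.6.0.jar \
    https://repo1.maven.org/maven2/org/postgresql/postgresql/42.6.0/postgresql-42.6.0.jar
```

If the `RUN` is placed after any `ENTRYPOINT`/`USER` switch in the base image, make sure the target directory is still writable by the build user.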
Is the spec:

```yaml
sparkImage:
  custom: <your image name>
  productVersion: 3.5.1
  pullPolicy: IfNotPresent
```
Of course.
Update: I had some leftover deps in the spec. Removed them and it still works.

This works for me with an image I built from your …:

```yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: write-to-postgresql
data:
  write-to-postgresql.py: |
    from pyspark.sql import SparkSession
    from pyspark.sql.types import *

    spark = SparkSession.builder.appName("write-to-postgresql").getOrCreate()

    df = spark.createDataFrame([1, 2, 3], IntegerType())

    # Specifying create table column data types on write
    df.write \
        .option("createTableColumnTypes", "value INTEGER") \
        .jdbc("jdbc:postgresql://spark-postgresql/spark", "sparktest",
              properties={"user": "spark", "password": "spark"})
---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
  name: spark-postgresql
spec:
  version: "1.0"
  sparkImage:
    imagePullPolicy: IfNotPresent
    custom: "docker.stackable.tech/sandbox/spark-k8s:3.5.1-stackable24.3.0-postgresql"
    productVersion: "3.5.1"
  mode: cluster
  mainApplicationFile: "local:///stackable/spark/jobs/write-to-postgresql.py"
  job:
    logging:
      enableVectorAgent: False
    config:
      volumeMounts:
        - name: script
          mountPath: /stackable/spark/jobs
  driver:
    logging:
      enableVectorAgent: False
    config:
      volumeMounts:
        - name: script
          mountPath: /stackable/spark/jobs
  executor:
    logging:
      enableVectorAgent: False
    config:
      volumeMounts:
        - name: script
          mountPath: /stackable/spark/jobs
  volumes:
    - name: script
      configMap:
        name: write-to-postgresql
```
Yeah, it was my fault. My connection string was just …
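For anyone hitting the same thing: the PostgreSQL JDBC driver expects the `jdbc:postgresql://…` scheme, not a bare `postgresql://…` URI. A tiny illustration with a hypothetical helper (not from this thread):

```python
def is_valid_pg_jdbc_url(url: str) -> bool:
    """Return True if the URL uses the scheme the PostgreSQL JDBC driver accepts."""
    return url.startswith("jdbc:postgresql://")

# The form used in the working example above:
print(is_valid_pg_jdbc_url("jdbc:postgresql://spark-postgresql/spark"))  # True
# A bare libpq-style URI, which the JDBC driver will not accept:
print(is_valid_pg_jdbc_url("postgresql://spark-postgresql/spark"))       # False
```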
@maltesander I think it's still not completed, because providing `- org.postgresql:postgresql:42.6.0` in packages is still not working with the corrected JDBC scheme.
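For context, the packages-based provisioning being discussed would sit in the SparkApplication's `deps` section, roughly like this (a sketch of the approach the reporter says is not working, not a confirmed fix):

```yaml
spec:
  deps:
    packages:
      - org.postgresql:postgresql:42.6.0
```

With this route the jar is resolved at submit time via Maven, rather than baked into the image as in the custom-image approach above.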
Sorry, reopening then :)
And I am trying to use the JDBC catalog for Iceberg; I need the PostgreSQL driver for that.
What I did first:
After that, in the Environment tab of the Spark UI, I can see the postgresql driver:
spark://pyspark-pi-testtesttest-10f9ee8ef5283865-driver-svc.spark.svc:7078/files/org.postgresql_postgresql-42.6.0.jar Added By User
But I got this exception:
Also some of my settings:

```yaml
spark.driver.userClassPathFirst: 'false'
spark.executor.userClassPathFirst: 'false'
```

Without those, other packages do not work and trigger strange errors.
Also, providing this jar under a PVC does not help. I can see it in the Spark UI on the classpath, but I still get this error.
Originally posted by @supsupsap in #245 (comment)
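For readers following the Iceberg angle: an Iceberg JDBC catalog backed by PostgreSQL is typically wired up through Spark catalog properties like the following sketch (catalog name, URI, and warehouse path here are hypothetical, and the PostgreSQL driver must already be on the classpath, which is exactly what this issue is about):

```yaml
sparkConf:
  spark.sql.catalog.my_catalog: org.apache.iceberg.spark.SparkCatalog
  spark.sql.catalog.my_catalog.catalog-impl: org.apache.iceberg.jdbc.JdbcCatalog
  spark.sql.catalog.my_catalog.uri: jdbc:postgresql://spark-postgresql/spark
  spark.sql.catalog.my_catalog.warehouse: s3a://my-bucket/warehouse
```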