
Support DataFrameWriter & DataFrameReader? #229

Closed
melin opened this issue Apr 19, 2023 · 10 comments

Comments

@melin

melin commented Apr 19, 2023

The examples provided all register a catalog and read/write ClickHouse through SQL. Is reading and writing data via the DataFrameWriter & DataFrameReader API supported? @pan3793

@pan3793
Collaborator

pan3793 commented Apr 19, 2023

No, because this connector does not implement TableProvider

@melin
Author

melin commented Apr 20, 2023

Is there a plan to support it?

My project, https://github.com/melin/datatunnel, would like to integrate via the DataFrame API.

@pan3793
Collaborator

pan3793 commented Apr 20, 2023

It's low priority.

@durgeksh

Sorry, this question is off topic. What is the purpose of this connector? I want to connect ClickHouse with Spark 3.4 in Python. Do I need to use this connector, and if so, what is the driver class?
Thank you for the wonderful work!

@pan3793
Collaborator

pan3793 commented May 25, 2023

@durgeksh there are quick start demos for spark-sql and spark-shell, pyspark should be similar.
https://housepower.github.io/spark-clickhouse-connector/quick_start/03_play_with_spark_shell/

@durgeksh

durgeksh commented May 25, 2023

Thanks for the reply, but I see there is no documentation for using this connector in PySpark. When I checked how to use "clickhouse-jdbc-0.4.6-shaded.jar", they use the driver attribute with the driver class as given below. How can I use this connector in PySpark?
driver = "com.clickhouse.jdbc.ClickHouseDriver"

@pan3793
Collaborator

pan3793 commented May 25, 2023

If you run and compare the outputs of pyspark --help and spark-shell --help, they are similar; it should be easy to translate the spark-shell demo into a pyspark one.

I'm not a Python developer, so the documentation I wrote does not include a PySpark demo.

This is a connector (a kind of plugin) for Spark, and a plugin is not responsible for teaching how to use the main framework. Please read the Spark docs, which provide snippets in several languages for each case.

@durgeksh

You are right, but I am using PySpark in a Python program, not in the shell. How shall I use this connector in the Python program? Thank you.

@camper42
Contributor

camper42 commented Aug 8, 2023

@durgeksh just add a clickhouse catalog, and use spark.sql()

example spark-defaults.conf in our cluster

spark.sql.catalog.ck-1 xenon.clickhouse.ClickHouseCatalog
spark.sql.catalog.ck-1.host HOST
spark.sql.catalog.ck-1.protocol http

read & write

df = spark.sql("select * from `ck-1`.db.table")
df.writeTo("`ck-1`.db.table2").append()

@durgeksh
Thank you.
