71 changes: 71 additions & 0 deletions python/docs/source/development/testing.rst
@@ -55,3 +55,74 @@ Running tests using GitHub Actions
You can run the full PySpark tests by using GitHub Actions in your own forked GitHub
repository with a few clicks. Please refer to
`Running tests in your forked repository using GitHub Actions <https://spark.apache.org/developer-tools.html>`_ for more details.


===================================
Testing Spark Connect Python Client
===================================

**Spark Connect is a strictly experimental feature and is under heavy development.
All APIs should be considered volatile and should not be used in production.**

This module contains the implementation of Spark Connect, a logical plan
facade for the implementation in Spark. Spark Connect is directly integrated into the
Spark build. To enable it, you only need to activate the driver plugin for Spark Connect.
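
For illustration only, the plugin can also be enabled programmatically when a local
session is created from Python; this is a minimal sketch equivalent to passing
``--conf spark.plugins=...`` on the command line as shown in the sections below:

.. code-block:: python

    # Minimal sketch: enable the Spark Connect driver plugin on a locally
    # created SparkSession instead of passing --conf on the command line.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .config("spark.plugins", "org.apache.spark.sql.connect.SparkConnectPlugin")
        .getOrCreate()
    )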


Build
-----

.. code-block:: bash

./build/mvn -Phive clean package

or

.. code-block:: bash

./build/sbt -Phive clean package


Run Spark Shell
---------------

To run Spark Connect from the jars you built locally:

.. code-block:: bash

# Scala shell
./bin/spark-shell \
--jars `ls connector/connect/target/**/spark-connect*SNAPSHOT.jar | paste -sd ',' -` \
--conf spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin

.. code-block:: bash

# PySpark shell
./bin/pyspark \
--jars `ls connector/connect/target/**/spark-connect*SNAPSHOT.jar | paste -sd ',' -` \
--conf spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin

To use the release version of Spark Connect:

.. code-block:: bash

./bin/spark-shell \
--packages org.apache.spark:spark-connect_2.12:3.4.0 \
--conf spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin
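
Once a shell with the Spark Connect plugin is running, a Python client can connect to
it over the network. The snippet below is a sketch, assuming the ``builder.remote()``
API is available in your build and the server listens on the default port 15002:

.. code-block:: python

    # Sketch of connecting the Python client to a running Spark Connect server.
    # Assumes builder.remote() is available and the default port 15002 is used.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.remote("sc://localhost:15002").getOrCreate()
    spark.range(5).show()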


Run Tests
---------

.. code-block:: bash

./python/run-tests --testnames 'pyspark.sql.tests.connect.test_connect_basic'
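
These tests exercise the Python client end to end. As a rough, hypothetical
illustration of the shape of such a test (not the actual test code), a standalone
smoke test could look like this:

.. code-block:: python

    # Hypothetical standalone smoke test, assuming a Spark Connect server is
    # already running locally and builder.remote() is available.
    import unittest

    from pyspark.sql import SparkSession


    class ConnectSmokeTest(unittest.TestCase):
        def test_range_count(self):
            spark = SparkSession.builder.remote("sc://localhost:15002").getOrCreate()
            self.assertEqual(spark.range(10).count(), 10)


    if __name__ == "__main__":
        unittest.main()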


Generate protobuf files for the Python client
---------------------------------------------

1. Install ``buf`` version 1.8.0: https://docs.buf.build/installation
2. Run ``pip install grpcio==1.48.1 protobuf==4.21.6 mypy-protobuf==3.3.0``
3. Run ``./connector/connect/dev/generate_protos.sh``
4. Optionally, check the result with ``./dev/check-codegen-python.py`` (a quick import check is also sketched below).
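
After regenerating, a quick informal way to sanity-check the result is to import the
generated modules; this assumes the generated code lives under
``pyspark.sql.connect.proto``:

.. code-block:: python

    # Informal sanity check: the regenerated protobuf modules should import
    # cleanly. Assumes they are generated under pyspark.sql.connect.proto.
    from pyspark.sql.connect import proto

    # Listing a few message names confirms the package loaded correctly.
    print([name for name in dir(proto) if name.endswith("Relation")][:5])
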
56 changes: 0 additions & 56 deletions python/pyspark/sql/connect/README.md

This file was deleted.