Support for ODBC driver connection type #116
Conversation
This is working for me locally, which is very exciting. Tiny comment for now around the nomenclature of the new connection endpoint.
dbt/adapters/spark/connections.py (outdated)

```python
class SparkClusterType(StrEnum):
    ALL_PURPOSE = "all-purpose"
    VIRTUAL = "virtual"
```
Databricks isn't using the name "virtual clusters" anymore; I believe they're just calling them "endpoints."

Rather than a combination of `cluster` and `cluster_type`, I think we should make the distinction between `cluster` (old style, all-purpose/interactive) and `endpoint` (new style). Users should specify either a `cluster` or an `endpoint` when connecting via `odbc`.
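For illustration, a minimal `profiles.yml` sketch of the two mutually exclusive targets. Only `method`, `driver`, `cluster`, and `endpoint` come from this PR; the other fields and all values are assumptions:

```yaml
# Hypothetical profile; set exactly one of `cluster` or `endpoint` per target.
my_profile:
  target: odbc_cluster
  outputs:
    odbc_cluster:
      type: spark
      method: odbc
      driver: /opt/simba/spark/lib/64/libsparkodbc_sb64.so  # assumed driver path
      host: example.cloud.databricks.com                    # assumed
      cluster: 1234-567890-abc123                           # old-style all-purpose cluster
    odbc_endpoint:
      type: spark
      method: odbc
      driver: /opt/simba/spark/lib/64/libsparkodbc_sb64.so
      host: example.cloud.databricks.com
      endpoint: 0123456789abcdef                            # new-style SQL endpoint
```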
```python
f"{self.method} connection method requires "
"additional dependencies. \n"
"Install the additional required dependencies with "
"`pip install dbt-spark[ODBC]`"
```
I landed on `pip install dbt-spark[ODBC]` because `pyodbc` is the only Python dep that I imagine will require OS dependencies. Also, I think this should line up with connection methods (`thrift`, `http`, `odbc`) rather than connection locations (Databricks, etc.)?
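That naming convention amounts to keying the pip extras by connection method rather than by platform. A hypothetical layout (illustrative, not dbt-spark's actual `setup.py`):

```python
# Hypothetical extras keyed by connection method rather than platform.
EXTRAS_REQUIRE = {
    "ODBC": ["pyodbc"],  # only needed for the odbc connection method
    # PyHive[hive] could later move here as an extra for http/thrift
}
```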
I buy it!
Eventually, I think we might want to try moving `PyHive[hive]` to an extra instead of a principal requirement, since it's only necessary for the `http` connection method.

Justification: we see some installation errors (e.g. #114) resulting from less-maintained dependencies exclusive to PyHive.
Not something we need to do right now!
This looks great! Thanks for the excellent, wide-reaching work to get this running with automated tests.
A few tiny notes after rechecking the readme.

FYI, removing myself from review, but this looks …
Co-authored-by: Jeremy Cohen <[email protected]>
Resolves #104
Description

Add support for ODBC driver connections to Databricks via a cluster-specific path and a SQL endpoint path.

- Add `pyodbc` for using the ODBC driver for the connection
- Add `driver` and `endpoint` to the profile config (`cluster` and `endpoint` are mutually exclusive; they determine how to connect to Databricks)

Note
At the time of writing, the new SQL Endpoint does not support `create temporary view`. Also, `extended` is prohibited by the ODBC driver for some operations (dbt-labs/dbt-adapter-tests#8).
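The mutual exclusivity of `cluster` and `endpoint` described above could be enforced with a check like this (a sketch under assumed names, not the PR's actual validation code):

```python
def validate_databricks_connection(cluster=None, endpoint=None):
    """Require exactly one of `cluster` or `endpoint` for the odbc method."""
    if (cluster is None) == (endpoint is None):
        raise ValueError(
            "When connecting via odbc, specify exactly one of "
            "`cluster` or `endpoint`"
        )
    return cluster if cluster is not None else endpoint
```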