This is a modified version of our public tutorial intended for users of dbt on Databricks.
Any questions? [email protected]
Create the Databricks tables `jaffle_shop.orders`, `jaffle_shop.customers`, and `stripe.payments` from these CSV files, which are located in a public S3 bucket (docs):

- `s3://dbt-tutorial-public/jaffle_shop_orders.csv`
- `s3://dbt-tutorial-public/jaffle_shop_customers.csv`
- `s3://dbt-tutorial-public/stripe_payments.csv`
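The exact DDL depends on your workspace, but registering these CSVs as tables in a Databricks notebook or SQL cell might look like the sketch below. The `header` and `inferSchema` options are assumptions about the CSV layout, and the `CREATE DATABASE` statements assume the schemas don't exist yet:

```sql
-- Minimal sketch: register the tutorial CSVs as tables.
-- Assumes the cluster can read the public bucket.
CREATE DATABASE IF NOT EXISTS jaffle_shop;
CREATE DATABASE IF NOT EXISTS stripe;

CREATE TABLE jaffle_shop.orders
USING csv
OPTIONS (path 's3://dbt-tutorial-public/jaffle_shop_orders.csv', header 'true', inferSchema 'true');

CREATE TABLE jaffle_shop.customers
USING csv
OPTIONS (path 's3://dbt-tutorial-public/jaffle_shop_customers.csv', header 'true', inferSchema 'true');

CREATE TABLE stripe.payments
USING csv
OPTIONS (path 's3://dbt-tutorial-public/stripe_payments.csv', header 'true', inferSchema 'true');
```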
The instructions below assume you are running dbt on macOS. Linux and Windows users should adjust the bash commands accordingly.
- Clone this GitHub repo
- Install dbt-spark:
$ pip install dbt-spark
- Copy the example profile to your ~/.dbt folder (created when installing dbt):
$ cp ./sample.profiles.yml ~/.dbt/profiles.yml
- Populate ~/.dbt/profiles.yml with your Databricks host, API token, cluster ID, and schema name:
$ open ~/.dbt
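For reference, a dbt-spark profile for Databricks over HTTP typically looks like the sketch below. The profile name and all placeholder values are assumptions; the profile name must match the `profile` set in this repo's dbt_project.yml:

```yaml
# ~/.dbt/profiles.yml — illustrative values only
databricks_demo:          # assumed profile name; must match dbt_project.yml
  target: dev
  outputs:
    dev:
      type: spark
      method: http
      host: yourorg.cloud.databricks.com   # Databricks host, without https://
      port: 443
      token: dapiXXXXXXXXXXXXXXXX          # personal access token
      cluster: 1234-567890-abcde123        # cluster ID
      schema: your_schema                  # schema dbt will build into
```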
- Verify that you can connect to Databricks
$ dbt debug
- Verify that you can run dbt
$ dbt run
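`dbt run` compiles the SQL models in this repo and materializes them in your schema. For orientation, a minimal dbt model over the tables created earlier might look like this; the file name and column names are illustrative, not this repo's actual models:

```sql
-- models/customer_orders.sql (hypothetical example)
-- Counts orders per customer from the seeded tables.
select
    c.id as customer_id,
    count(o.id) as order_count
from jaffle_shop.customers as c
left join jaffle_shop.orders as o
    on o.user_id = c.id
group by c.id
```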
- Learn more about dbt in the docs
- Check out Discourse for commonly asked questions and answers
- Join the chat on Slack for live discussions and support
- Find dbt events near you
- Check out the blog for the latest news on dbt's development and best practices
- Watch our Office Hours on dbt + Spark