Rename integrations to resources on the SDK and example notebooks #1276
Merged · 9 commits · May 3, 2023
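The change is mechanical throughout: every `client.integration(...)` call becomes `client.resource(...)`, and the surrounding prose, constants, and UI page names follow suit. A minimal sketch of the rename (assumes a running Aqueduct server with the demo database connected):

```python
import aqueduct as aq

client = aq.Client()

# Before this PR, notebooks connected to data systems with:
#   warehouse = client.integration(name="aqueduct_demo")
# After this PR, the same connection is made with:
warehouse = client.resource(name="aqueduct_demo")
```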
14 changes: 6 additions & 8 deletions examples/churn_prediction/Customer Churn Prediction.ipynb
@@ -1,7 +1,6 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "3637c5d6-6d94-4db5-ba31-5cea5f80b5b3",
"metadata": {},
@@ -779,7 +778,7 @@
"source": [
"### Connecting to Your Data\n",
"\n",
"Workflows access and publish to data integrations that are configured on the Integrations Page. In this demo we will connect to the demo data integration which is a Postgres database containing several standard [sample datasets](https://docs.aqueducthq.com/example-workflows/demo-data-warehouse) including some synthetic customer churn data. Each kind of data integration may offer different functionality. Here we are using a relational integration which support general SQL expressions.\n"
"Workflows access and publish to data resource that are configured on the Resources Page. In this demo we will connect to the demo data resource which is a Postgres database containing several standard [sample datasets](https://docs.aqueducthq.com/example-workflows/demo-data-warehouse) including some synthetic customer churn data. Each kind of data resource may offer different functionality. Here we are using a relational resource which support general SQL expressions.\n"
]
},
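For context, here is the renamed call end to end — a minimal sketch assuming the default `aqueduct_demo` resource name, with the `customers` table name taken from the notebook's variable names:

```python
import aqueduct as aq

client = aq.Client()

# Look the resource up by the name it was registered under on the
# Resources page; "aqueduct_demo" is the notebook's default.
warehouse = client.resource(name="aqueduct_demo")

# Relational resources accept general SQL and return a TableArtifact,
# a lazy wrapper around a pandas DataFrame.
customers_table = warehouse.sql("SELECT * FROM customers;")
print(customers_table.get().head())
```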
{
@@ -797,7 +796,7 @@
}
],
"source": [
"warehouse = client.integration(name=\"aqueduct_demo\")\n",
"warehouse = client.resource(name=\"aqueduct_demo\")\n",
"\n",
"# customers_table is an Aqueduct TableArtifact, which is a wrapper around\n",
"# a Pandas DataFrame. A TableArtifact can be used as argument to any operator\n",
@@ -1505,9 +1504,9 @@
"id": "1d92df34-608b-401d-aff7-f7cb8a9a4bc7",
"metadata": {},
"source": [
"### Saving Tables Artifacts to Integrations\n",
"### Saving Tables Artifacts to Data Resources\n",
"\n",
"So far we have defined a workflow to build the `churn_table` containing our churn predictions. We now want to publish this table where others can use it. We do this by `saving` the table to various integrations. \n",
"So far we have defined a workflow to build the `churn_table` containing our churn predictions. We now want to publish this table where others can use it. We do this by `saving` the table to various resources.\n",
"\n",
"First we save the table back to the data warehouse that contains the original customer data. "
]
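A sketch of the save step this cell describes, assuming the `warehouse` and `churn_table` objects defined earlier in the notebook; the `save()` signature (`table_name`, `update_mode`) follows the SDK docs of this era and should be verified against your installed version:

```python
# Publish the predictions table back to the warehouse that holds the
# original customer data. Nothing is written until the flow runs.
warehouse.save(churn_table, table_name="pred_churn", update_mode="replace")
```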
@@ -1528,7 +1527,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "e20c2fbe-e201-4a92-a50f-8d82aa771c19",
"metadata": {
@@ -1582,7 +1580,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -1596,7 +1594,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.2"
"version": "3.8.13"
},
"vscode": {
"interpreter": {
6 changes: 3 additions & 3 deletions examples/diabetes-classifier/Classifying Diabetes Risk.ipynb
@@ -1169,7 +1169,7 @@
}
],
"source": [
"demodb = client.integration(\"aqueduct_demo\")\n",
"demodb = client.resource(\"aqueduct_demo\")\n",
"\n",
"# mpg_data is an Aqueduct TableArtifact, which is a wrapper around\n",
"# a Pandas DataFrame. A TableArtifact can be used as argument to any operator\n",
@@ -1327,7 +1327,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -1341,7 +1341,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.2"
"version": "3.8.13"
},
"vscode": {
"interpreter": {
8 changes: 4 additions & 4 deletions examples/finetune-resnet-50/FineTune ResNet-50.ipynb
@@ -26,13 +26,13 @@
"import aqueduct as aq\n",
"from aqueduct import op, metric\n",
"\n",
"K8S_INTEGRATION_NAME = 'eks-us-east-2' # REPLACE ME!\n",
"K8S_RESOURCE_NAME = 'eks-us-east-2' # REPLACE ME!\n",
"\n",
"client = aq.Client()\n",
"\n",
"# This line sets Aqueduct to run in lazy mode, since some of our compute can be expensive,\n",
"# and it sets all functions to run on the EKS cluster we've connected to.\n",
"aq.global_config({\"lazy\": True, \"engine\": K8S_INTEGRATION_NAME})"
"aq.global_config({\"lazy\": True, \"engine\": K8S_RESOURCE_NAME})"
]
},
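The comment above explains the pattern: `global_config` sets defaults for every operator in the session. A sketch of the equivalent per-operator form, under the same renamed constant (the `engine=` parameter on `@op` is described later in this PR's RoBERTa notebook):

```python
import aqueduct as aq
from aqueduct import op

K8S_RESOURCE_NAME = "eks-us-east-2"  # REPLACE ME, as in the notebook

# Defaults applied to every operator: run lazily, on the EKS resource.
aq.global_config({"lazy": True, "engine": K8S_RESOURCE_NAME})

# An operator may also name its engine explicitly; this is equivalent
# to inheriting the global default set above.
@op(engine=K8S_RESOURCE_NAME)
def normalize(df):
    return (df - df.mean()) / df.std()
```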
{
@@ -53,7 +53,7 @@
"DATASET_BUCKET_NAME = 'datasets' # REPLACE ME!\n",
"DATASET_PATH = 'resnet-data/resnet.zip' # REPLACE ME!\n",
"\n",
"datasets = client.integration(DATASET_BUCKET_NAME)\n",
"datasets = client.resource(DATASET_BUCKET_NAME)\n",
"\n",
"# Due to the way S3 works, it's more efficient for us to load a large zipped file \n",
"# rather than many small files. As a result, we load a zipfile.\n",
@@ -397,7 +397,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.8.13"
}
},
"nbformat": 4,
9 changes: 4 additions & 5 deletions examples/house-price-prediction/House Price Prediction.ipynb
@@ -1,7 +1,6 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "a047703f",
"metadata": {},
@@ -918,7 +917,7 @@
}
],
"source": [
"demo_db = client.integration(\"aqueduct_demo\")\n",
"demo_db = client.resource(\"aqueduct_demo\")\n",
"raw_data = demo_db.sql(\"select * from house_prices;\")\n",
"\n",
"filled_data = fill_missing_data(raw_data)\n",
@@ -991,7 +990,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -1005,7 +1004,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.2"
"version": "3.8.13"
},
"vscode": {
"interpreter": {
@@ -1015,7 +1014,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
8 changes: 4 additions & 4 deletions examples/mpg-regressor/Predicting MPG.ipynb
@@ -52,7 +52,7 @@
"id": "3ec5e482",
"metadata": {},
"source": [
"Once we have our client, the first thing we'll do is load our data. Aqueduct has the ability to most common databases and storage systems (check out the Integrations page on the Aqueduct UI). Here, we'll load a connection to the default `aqueduct_demo` database, which comes preloaded with a number of [common datasets](https://docs.aqueducthq.com/example-workflows/demo-data-warehouse). \n",
"Once we have our client, the first thing we'll do is load our data. Aqueduct has the ability to most common databases and storage systems (check out the Resources page on the Aqueduct UI). Here, we'll load a connection to the default `aqueduct_demo` database, which comes preloaded with a number of [common datasets](https://docs.aqueducthq.com/example-workflows/demo-data-warehouse). \n",
"\n",
"Once we have a connection to the demo DB, we can run a SQL query to retrieve our base data."
]
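A sketch of the load-then-transform flow this cell describes; the table name in the query is illustrative, since the notebook's actual SQL is elided from this diff:

```python
import aqueduct as aq
from aqueduct import op

client = aq.Client()
demodb = client.resource("aqueduct_demo")

# Illustrative query; the notebook's real SQL is not shown in the diff.
mpg_data = demodb.sql("SELECT * FROM mpg;")

@op
def drop_missing(df):
    # Operators receive the TableArtifact's contents as a pandas DataFrame.
    return df.dropna()

clean_data = drop_missing(mpg_data)
```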
@@ -73,7 +73,7 @@
}
],
"source": [
"demodb = client.integration(\"aqueduct_demo\")\n",
"demodb = client.resource(\"aqueduct_demo\")\n",
"\n",
"# mpg_data is an Aqueduct TableArtifact, which is a wrapper around\n",
"# a Pandas DataFrame. A TableArtifact can be used as argument to any operator\n",
@@ -1976,7 +1976,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -1990,7 +1990,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.2"
"version": "3.8.13"
},
"vscode": {
"interpreter": {
@@ -33,7 +33,7 @@
"metadata": {},
"outputs": [],
"source": [
"warehouse = client.integration(\"postgres_integration\")"
"warehouse = client.resource(\"postgres_integration\")"
]
},
{
8 changes: 4 additions & 4 deletions examples/roberta-sentiment-analysis/RoBERTa Sentiment.ipynb
@@ -32,7 +32,7 @@
"\n",
"client = aq.Client()\n",
"\n",
"# This config tells Aqueduct to run every operator on the integration named \"eks-us-east-2\".\n",
"# This config tells Aqueduct to run every operator on the resource named \"eks-us-east-2\".\n",
"# It also activates \"lazy\" mode, meaning that we will only trigger compute operations\n",
"# when data is requested since some of the functions below can be expensive.\n",
"aq.global_config({\"lazy\": True, 'engine': 'eks-us-east-2'})"
@@ -116,7 +116,7 @@
"source": [
"# Load the data from the S3 bucket and see a preview of the table.\n",
"# This is about ~100MB of data and takes about ~10s to load.\n",
"datasets_bucket = client.integration('datasets')\n",
"datasets_bucket = client.resource('datasets')\n",
"tweets = datasets_bucket.file('got_s8_tweets.csv', artifact_type=\"table\", format=\"csv\")\n",
"\n",
"tweets.get().head()"
@@ -134,7 +134,7 @@
"We first create batches of 1K tweets to tokenize at a time. To keep our tensors uniform, we pad them with 0s as necessary and then concatenate them. \n",
"\n",
"The `@op` decorator here has a few configuration parameters:\n",
"* First, the `engine` parameter tells us that we're going to be running on our EKS cluster in `us-east-2`; you can see the configuration for this integration on the Aqueduct UI. \n",
"* First, the `engine` parameter tells us that we're going to be running on our EKS cluster in `us-east-2`; you can see the configuration for this resource on the Aqueduct UI. \n",
"* Second, we specify the requirements needed to run this function (`torch` and `transformers`); if necessary, we could specify the required versions as well. \n",
"* Finally, we tell Aqueduct to give this container 15GB of RAM."
]
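A sketch of the decorator configuration the three bullets above describe; the parameter names (`requirements`, `resources`) follow the SDK docs of this era and should be treated as assumptions for your installed version:

```python
from aqueduct import op

@op(
    engine="eks-us-east-2",                  # the K8s compute resource
    requirements=["torch", "transformers"],  # installed into the op's environment
    resources={"memory": "15GB"},            # RAM to allocate to the container
)
def tokenize_tweets(df):
    # The real tokenization logic lives in the notebook cell below.
    return df
```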
@@ -1058,7 +1058,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.8.13"
}
},
"nbformat": 4,
2 changes: 1 addition & 1 deletion examples/sentiment-analysis/Sentiment Model.ipynb
@@ -78,7 +78,7 @@
"metadata": {},
"outputs": [],
"source": [
"warehouse = client.integration(\"aqueduct_demo\")\n",
"warehouse = client.resource(\"aqueduct_demo\")\n",
"\n",
"# reviews_table is an Aqueduct TableArtifact, which is a wrapper around\n",
"# a Pandas DataFrame. A TableArtifact can be used as argument to any operator\n",
2 changes: 1 addition & 1 deletion examples/tutorials/Parameters Tutorial.ipynb
@@ -58,7 +58,7 @@
"metadata": {},
"outputs": [],
"source": [
"db = client.integration(\"aqueduct_demo\")\n",
"db = client.resource(\"aqueduct_demo\")\n",
"\n",
"# reviews_table is an Aqueduct TableArtifact, which is a wrapper around\n",
"# a Pandas DataFrame. A TableArtifact can be used as argument to any operator\n",
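Since this notebook covers parameters, a hedged sketch of the parameter API alongside the renamed `resource` call; `create_param` and the `{{ ... }}` SQL templating follow the docs of this era and may differ in your SDK version:

```python
import aqueduct as aq

client = aq.Client()
db = client.resource("aqueduct_demo")

# Parameters get a name and a default value...
table_name = client.create_param("table_name", default="hotel_reviews")

# ...and can be spliced into SQL with {{ ... }} placeholders.
reviews_table = db.sql("SELECT * FROM {{ table_name }};")
```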
14 changes: 11 additions & 3 deletions examples/tutorials/Quickstart Tutorial.ipynb
@@ -222,7 +222,7 @@
}
],
"source": [
"demo_db = client.integration(\"aqueduct_demo\")\n",
"demo_db = client.resource(\"aqueduct_demo\")\n",
"reviews_table = demo_db.sql(\"select * from hotel_reviews;\")\n",
"\n",
"# You will see the type of `reviews_table` is an Aqueduct TableArtifact.\n",
@@ -529,7 +529,7 @@
"\n",
"---\n",
"### Saving Data\n",
"Finally, we can save the transformed table `strlen_table` back to the Aqueduct demo database. See [here](https://docs.aqueducthq.com/integrations/using-integrations) for more details around using integration objects."
"Finally, we can save the transformed table `strlen_table` back to the Aqueduct demo database. See [here](https://docs.aqueducthq.com/integrations/using-integrations) for more details around using resource objects."
]
},
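A sketch of the save this cell introduces, assuming the `demo_db` and `strlen_table` objects from earlier in the tutorial; as with the other `save()` calls in this PR, verify the signature against your SDK reference:

```python
# Write the derived table back to the demo database.
demo_db.save(strlen_table, table_name="strlen_table", update_mode="replace")
```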
{
@@ -599,8 +599,16 @@
"\n",
"---\n",
"\n",
"There is a lot more you can do with Aqueduct, including having flows run automatically on a cadence, parameterizing flows, and reading to and writing from many different integrations (S3, Postgres, etc.). Check out the other tutorials and examples [here](https://docs.aqueducthq.com/example-workflows) for a deeper dive!"
"There is a lot more you can do with Aqueduct, including having flows run automatically on a cadence, parameterizing flows, and reading to and writing from many different data resources (S3, Postgres, etc.). Check out the other tutorials and examples [here](https://docs.aqueducthq.com/example-workflows) for a deeper dive!"
]
},
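The closing cell mentions running flows on a cadence; a hedged sketch of what that looks like with the artifacts above — `publish_flow` and the `aq.daily()` schedule helper are assumptions based on the docs of this era:

```python
import aqueduct as aq

# publish_flow registers the DAG behind the listed artifacts as a
# named workflow; the schedule argument is optional.
flow = client.publish_flow(
    name="review_strlen",
    artifacts=[strlen_table],
    schedule=aq.daily(),
)
```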
{
"cell_type": "code",
"execution_count": null,
"id": "80d87b8a",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -64,7 +64,7 @@
"metadata": {},
"outputs": [],
"source": [
"demodb = client.integration(\"aqueduct_demo\")\n",
"demodb = client.resource(\"aqueduct_demo\")\n",
"\n",
"# wines is an Aqueduct TableArtifact, which is a wrapper around\n",
"# a Pandas DataFrame. A TableArtifact can be used as argument to any operator\n",
12 changes: 6 additions & 6 deletions examples/wine-ratings-prediction/Spark Wine Ratings Demo.ipynb
@@ -13,9 +13,9 @@
"\n",
"**Throughout this notebook, you'll see a decorator (`@aq.op`) above functions. This decorator allows Aqueduct to run your functions as a part of a workflow automatically.**\n",
"\n",
"**To run this notebook, you will have to connect the following integrations:**\n",
"- A Databricks or Spark integration\n",
"- A data integration (ie Snowflake)\n",
"**To run this notebook, you will have to connect the following resources:**\n",
"- A Databricks or Spark compute resource\n",
"- A data resource (ie Snowflake)\n",
"- S3 (must also be used as metadata store)"
]
},
@@ -48,7 +48,7 @@
"metadata": {},
"outputs": [],
"source": [
"aqueduct.global_config({'engine': '<spark or databricks integration>', 'lazy': True})"
"aqueduct.global_config({'engine': '<spark or databricks resource>', 'lazy': True})"
]
},
{
@@ -68,7 +68,7 @@
"metadata": {},
"outputs": [],
"source": [
"snowflake_warehouse = client.integration(\"<snowflake integration>\")\n",
"snowflake_warehouse = client.resource(\"<snowflake resource>\")\n",
"wine_table = snowflake_warehouse.sql(\"select * from wine;\")"
]
},
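Because this notebook sets `{"lazy": True}`, defining the query above runs nothing on the Spark/Databricks engine; a short sketch of how execution is actually triggered, assuming the `wine_table` artifact from the cell above:

```python
# .get() triggers execution on the configured engine and materializes
# the result for inspection.
preview = wine_table.get()
print(preview.head())
```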
@@ -371,7 +371,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.0"
"version": "3.8.13"
}
},
"nbformat": 4,
2 changes: 1 addition & 1 deletion integration_tests/backend/setup/changing_saves_workflow.py
@@ -11,7 +11,7 @@
def setup_changing_saves(client: aqueduct.Client, integration_name: str) -> str:
name = "Test: Changing Saves"
n_runs = 4
integration = client.integration(name=integration_name)
integration = client.resource(name=integration_name)

###
table = integration.sql(query="SELECT * FROM wine;")
2 changes: 1 addition & 1 deletion integration_tests/backend/setup/flow_with_failure.py
@@ -9,7 +9,7 @@
def setup_flow_with_failure(client: aqueduct.Client, integration_name: str) -> str:
name = "Test: Flow with Failure"
n_runs = 1
integration = client.integration(name=integration_name)
integration = client.resource(name=integration_name)

@aqueduct.op
def bad_op(df):
@@ -15,7 +15,7 @@ def setup_flow_with_metrics_and_checks(
) -> str:
name = workflow_name if workflow_name else "Test: Flow with Metrics and Checks"
n_runs = 2
integration = client.integration(name=integration_name)
integration = client.resource(name=integration_name)

@aqueduct.metric
def size(df):
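This setup script pairs metrics with checks; a hedged sketch of both decorators as used here — `@aqueduct.metric` is visible in the diff above, while `@aqueduct.check` is assumed by analogy from the file's name:

```python
import aqueduct

@aqueduct.metric
def size(df):
    # A metric computes a numeric measurement over an artifact.
    return len(df)

@aqueduct.check
def has_rows(df):
    # A check returns a boolean used to validate the flow.
    return len(df) > 0
```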
2 changes: 1 addition & 1 deletion integration_tests/backend/setup/flow_with_sleep.py
@@ -14,7 +14,7 @@
def setup_flow_with_sleep(client: aqueduct.Client, integration_name: str) -> str:
name = "Test: Flow with Sleep"
n_runs = 1
integration = client.integration(name=integration_name)
integration = client.resource(name=integration_name)

@aqueduct.op
def sleeping_op(df):
2 changes: 1 addition & 1 deletion integration_tests/backend/test_reads.py
@@ -35,7 +35,7 @@ class TestBackend:
@classmethod
def setup_class(cls):
cls.client = aqueduct.Client(pytest.api_key, pytest.server_address)
cls.integration = cls.client.integration(name=pytest.integration)
cls.integration = cls.client.resource(name=pytest.integration)
cls.flows = {
"changing_saves": setup_changing_saves(cls.client, pytest.integration),
"flow_with_failure": setup_flow_with_failure(cls.client, pytest.integration),