Fixes doc generation script #1285

Merged
merged 4 commits into from
May 5, 2023
63 changes: 43 additions & 20 deletions scripts/generate_docs.sh
@@ -9,6 +9,7 @@ echo "### package aqueduct
* [\`error\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/aqueduct.error)
* [\`flow\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/aqueduct.flow)
* [\`schedule\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/aqueduct.schedule)
+* [\`llm_op\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/aqueduct.llm_op)
### package aqueduct.artifacts
* [\`bool_artifact\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.artifacts/aqueduct.artifacts.bool_artifact)
* [\`generic_artifact\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.artifacts/aqueduct.artifacts.generic_artifact)
@@ -21,19 +22,27 @@ echo "### package aqueduct
### package aqueduct.models
* [\`models.integration\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.models/aqueduct.models.integration)
* [\`models.operators\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.models/aqueduct.models.operators)
-### package aqueduct.integrations
-* [\`integrations.dynamic_k8s_integration\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.integrations/aqueduct.integrations.dynamic_k8s_integration)
-* [\`integrations.google_sheets_integration\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.integrations/aqueduct.integrations.google_sheets_integration)
-* [\`integrations.s3_integration\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.integrations/aqueduct.integrations.s3_integration)
-* [\`integrations.salesforce_integration\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.integrations/aqueduct.integrations.salesforce_integration)
-* [\`integrations.sql_integration\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.integrations/aqueduct.integrations.sql_integration)" > docs/README.md
+### package aqueduct.resources
+* [\`resources.airflow\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.airflow)
+* [\`resources.aws_lambda\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.aws_lambda)
+* [\`resources.databricks\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.databricks)
+* [\`resources.dynamic\_k8s\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.dynamic\_k8s)
+* [\`resources.ecr\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.ecr)
+* [\`resources.k8s\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.k8s)
+* [\`resources.google\_sheets\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.google\_sheets)
+* [\`resources.mongodb\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.mongodb)
+* [\`resources.s3\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.s3)
+* [\`resources.salesforce\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.salesforce)
+* [\`resources.spark\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.spark)
+* [\`resources.sql\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.sql)" > docs/README.md


pydoc-markdown -I . --render-toc -m aqueduct.client > docs/aqueduct.client.md
pydoc-markdown -I . --render-toc -m aqueduct.decorator > docs/aqueduct.decorator.md
pydoc-markdown -I . --render-toc -m aqueduct.error > docs/aqueduct.error.md
pydoc-markdown -I . --render-toc -m aqueduct.flow > docs/aqueduct.flow.md
pydoc-markdown -I . --render-toc -m aqueduct.schedule > docs/aqueduct.schedule.md
+pydoc-markdown -I . --render-toc -m aqueduct.llm_wrapper > docs/aqueduct.llm_op.md

mkdir docs/package-aqueduct.artifacts

@@ -68,17 +77,31 @@ echo "### package aqueduct.models
pydoc-markdown -I . --render-toc -m aqueduct.models.integration > docs/package-aqueduct.models/aqueduct.models.integration.md
pydoc-markdown -I . --render-toc -m aqueduct.models.operators > docs/package-aqueduct.models/aqueduct.models.operators.md

-mkdir docs/package-aqueduct.integrations
-
-echo "### package aqueduct.integrations
-* [\`integrations.dynamic_k8s_integration\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.integrations/aqueduct.integrations.dynamic_k8s_integration)
-* [\`integrations.google_sheets_integration\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.integrations/aqueduct.integrations.google_sheets_integration)
-* [\`integrations.s3_integration\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.integrations/aqueduct.integrations.s3_integration)
-* [\`integrations.salesforce_integration\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.integrations/aqueduct.integrations.salesforce_integration)
-* [\`integrations.sql_integration\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.integrations/aqueduct.integrations.sql_integration)" > docs/package-aqueduct.integrations/README.md
-
-pydoc-markdown -I . --render-toc -m aqueduct.integrations.dynamic_k8s_integration > docs/package-aqueduct.integrations/aqueduct.integrations.dynamic_k8s_integration.md
-pydoc-markdown -I . --render-toc -m aqueduct.integrations.google_sheets_integration > docs/package-aqueduct.integrations/aqueduct.integrations.google_sheets_integration.md
-pydoc-markdown -I . --render-toc -m aqueduct.integrations.s3_integration > docs/package-aqueduct.integrations/aqueduct.integrations.s3_integration.md
-pydoc-markdown -I . --render-toc -m aqueduct.integrations.salesforce_integration > docs/package-aqueduct.integrations/aqueduct.integrations.salesforce_integration.md
-pydoc-markdown -I . --render-toc -m aqueduct.integrations.sql_integration > docs/package-aqueduct.integrations/aqueduct.integrations.sql_integration.md
+mkdir docs/package-aqueduct.resources
+
+echo "### package aqueduct.resources
+* [\`resources.airflow\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.airflow)
+* [\`resources.aws_lambda\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.aws_lambda)
+* [\`resources.databricks\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.databricks)
+* [\`resources.dynamic\_k8s\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.dynamic\_k8s)
+* [\`resources.ecr\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.ecr)
+* [\`resources.k8s\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.k8s)
+* [\`resources.google\_sheets\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.google\_sheets)
+* [\`resources.mongodb\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.mongodb)
+* [\`resources.s3\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.s3)
+* [\`resources.salesforce\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.salesforce)
+* [\`resources.spark\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.spark)
+* [\`resources.sql\`](https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct/package-aqueduct.resources/aqueduct.resources.sql)" > docs/package-aqueduct.resources/README.md
+
+pydoc-markdown -I . --render-toc -m aqueduct.resources.airflow > docs/package-aqueduct.resources/aqueduct.resources.airflow.md
+pydoc-markdown -I . --render-toc -m aqueduct.resources.aws_lambda > docs/package-aqueduct.resources/aqueduct.resources.aws_lambda.md
+pydoc-markdown -I . --render-toc -m aqueduct.resources.databricks > docs/package-aqueduct.resources/aqueduct.resources.databricks.md
+pydoc-markdown -I . --render-toc -m aqueduct.resources.dynamic_k8s > docs/package-aqueduct.resources/aqueduct.resources.dynamic_k8s.md
+pydoc-markdown -I . --render-toc -m aqueduct.resources.ecr > docs/package-aqueduct.resources/aqueduct.resources.ecr.md
+pydoc-markdown -I . --render-toc -m aqueduct.resources.k8s > docs/package-aqueduct.resources/aqueduct.resources.k8s.md
+pydoc-markdown -I . --render-toc -m aqueduct.resources.google_sheets > docs/package-aqueduct.resources/aqueduct.resources.google_sheets.md
+pydoc-markdown -I . --render-toc -m aqueduct.resources.mongodb > docs/package-aqueduct.resources/aqueduct.resources.mongodb.md
+pydoc-markdown -I . --render-toc -m aqueduct.resources.s3 > docs/package-aqueduct.resources/aqueduct.resources.s3.md
+pydoc-markdown -I . --render-toc -m aqueduct.resources.salesforce > docs/package-aqueduct.resources/aqueduct.resources.salesforce.md
+pydoc-markdown -I . --render-toc -m aqueduct.resources.spark > docs/package-aqueduct.resources/aqueduct.resources.spark.md
+pydoc-markdown -I . --render-toc -m aqueduct.resources.sql > docs/package-aqueduct.resources/aqueduct.resources.sql.md
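
The README entries and `pydoc-markdown` commands in this script follow one naming pattern per module. A minimal Python sketch of that pattern, for illustration only: `readme_entry` and `pydoc_command` are hypothetical helpers that are not part of this PR, and the sketch covers only the `aqueduct.resources` style of entry, without the `\_` escapes that some lines in the script carry.

```python
# Base URL copied from the links in generate_docs.sh.
BASE_URL = "https://docs.aqueducthq.com/api-reference/sdk-reference/package-aqueduct"


def readme_entry(package: str, module: str) -> str:
    """Return the Markdown bullet for aqueduct.<package>.<module> (hypothetical helper)."""
    qualified = f"aqueduct.{package}.{module}"
    url = f"{BASE_URL}/package-aqueduct.{package}/{qualified}"
    return f"* [`{package}.{module}`]({url})"


def pydoc_command(package: str, module: str) -> str:
    """Return the shell command the script runs for one module (hypothetical helper)."""
    qualified = f"aqueduct.{package}.{module}"
    return (
        f"pydoc-markdown -I . --render-toc -m {qualified} "
        f"> docs/package-aqueduct.{package}/{qualified}.md"
    )


if __name__ == "__main__":
    # Print the entry and command for a few of the modules listed in the diff.
    for name in ["airflow", "s3", "sql"]:
        print(readme_entry("resources", name))
        print(pydoc_command("resources", name))
```

Written this way, a module list would be the single place to edit when a module is renamed, which is the kind of drift this PR corrects by hand.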
2 changes: 2 additions & 0 deletions scripts/release/doc.sh
@@ -4,6 +4,8 @@ GITBOOK_REPO=$HOME/gitbook
cd ~/aqueduct/sdk

bash ../scripts/generate_docs.sh
+rm -rf $GITBOOK_REPO/api-reference/sdk-reference/package-aqueduct
+mkdir -p $GITBOOK_REPO/api-reference/sdk-reference/package-aqueduct
cp -r docs/* $GITBOOK_REPO/api-reference/sdk-reference/package-aqueduct
rm -r docs/

31 changes: 15 additions & 16 deletions sdk/aqueduct/llm_wrapper.py
@@ -130,7 +130,6 @@ def llm_op(
]:
"""Generates an Aqueduct operator to run a LLM. Either both column_name and output_column_name must be provided,
or neither must be provided. Please refer to the `Returns` section below for their differences.

Args:
name:
The name of the LLM to use. Please see aqueduct.supported_llms for a list of supported LLMs.
@@ -142,55 +141,55 @@ def llm_op(
output_column_name:
The name of the column of the Dataframe to store the output of the LLM. If this field is provided,
column_name must also be provided.

engine:
-The name of the compute integration this operator will run on. Defaults to the Aqueduct engine.
+The name of the compute resource this operator will run on. Defaults to the Aqueduct engine.
We recommend using a Kubernetes engine to run LLM operators, as we have implemented performance
optimizations for LLMs on Kubernetes.

Returns:
If column_name and output_column_name are both provided, returns a function that takes in a
DataFrame and returns a DataFrame with the output of the LLM appended as a new column:

-def llm_for_table(df: pd.DataFrame, parameters: Dict[str, Any] = {}) -> pd.DataFrame:

+```python
+def use_llm_for_table(df: pd.DataFrame, parameters: Dict[str, Any] = {}) -> pd.DataFrame:
+```
Otherwise, returns a function that takes in a string or list of strings, applies LLM, and
returns a string or list of strings:

+```python
def use_llm(messages: Union[str, List[str]], parameters: Dict[str, Any] = {}) -> Union[str, List[str]]:

+```
In both cases, the function takes in an optional second argument, which is a dictionary of
parameters to pass to the LLM. Please refer to the documentation for the LLM you are using
for a list of supported parameters. For all LLMs, we support the "prompt" parameter. If the
prompt contains {text}, we will replace {text} with the input string(s) before sending to
the LLM. If the prompt does not contain {text}, we will prepend the prompt to the input
string(s) before sending to the LLM.

Examples:
+```python
>>> from aqueduct import Client
>>> client = Client()
>>> snowflake = client.resource("snowflake")
>>> reviews_table = snowflake.sql("select * from hotel_reviews;")

>>> from aqueduct import llm_op
-... vicuna_table_op = llm_op(
+>>> vicuna_table_op = llm_op(
... name="vicuna_7b",
... op_name="my_vicuna_operator",
... column_name="review",
... output_column_name="response",
... engine=ondemand_k8s,
->>> )
-... params = client.create_param(
+... )
+>>> params = client.create_param(
... "vicuna_params",
... default={
... "prompt": "Respond to the following hotel review as a customer service agent: {text} ",
... "max_gpu_memory": "13GiB",
... "temperature": 0.7,
... "max_new_tokens": 512,
... }
->>> )
+... )
>>> review_with_response = vicuna_table_op(reviews_table, params)

`review_with_response` is a Table Artifact with the output of the LLM appended as a new column.

>>> review_with_response.get()
+```
"""
if name not in supported_llms:
raise InvalidUserArgumentException(f"Unsupported LLM model {name}")
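
The `prompt` handling described in the `llm_op` docstring can be sketched as follows. This is an illustration, not Aqueduct's implementation: `apply_prompt` is a hypothetical helper, and prepending by plain concatenation with no separator is an assumption.

```python
from typing import List, Union


def apply_prompt(prompt: str, messages: Union[str, List[str]]) -> Union[str, List[str]]:
    """Combine a prompt with one input string or a list of them (hypothetical helper)."""

    def render(text: str) -> str:
        # If the prompt has a {text} placeholder, substitute the input into it.
        if "{text}" in prompt:
            return prompt.replace("{text}", text)
        # Otherwise prepend the prompt to the input (no separator: an assumption).
        return prompt + text

    # Mirror the docstring: a string in gives a string out, a list gives a list.
    if isinstance(messages, str):
        return render(messages)
    return [render(m) for m in messages]


if __name__ == "__main__":
    print(apply_prompt("Respond to this review: {text}", "Great stay!"))
    print(apply_prompt("Summarize: ", ["first review", "second review"]))
```

`str.replace` is used instead of `str.format` so that other braces in a prompt, for example a JSON snippet, are left untouched.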