Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eng 1086 create syntax sugar for sql queries #103

Merged
merged 10 commits into from
Jun 16, 2022
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions integration_tests/sdk/sql_integration_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -37,3 +37,13 @@ def test_invalid_destination_integration(client):
output_artifact.save(
config=db.config(table=generate_table_name(), update_mode=LoadUpdateMode.REPLACE)
)


def test_sugar_syntax_sql(client):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets be more specific with the test name. test_sql_today_tag

db = client.integration(name=get_integration_name())
sql_artifact_today = db.sql(query="select * from hotel_reviews where review_date = {{today}}")
assert sql_artifact_today.get().empty
sql_artifact_not_today = db.sql(
query="select * from hotel_reviews where review_date != {{today}}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this is historical data, can we just do < today? Not sure if thats valid syntax

)
assert len(sql_artifact_not_today.get()) == 100
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
from typing import List

import pandas as pd
from datetime import date
from sqlalchemy import engine, inspect
from sqlalchemy.exc import SQLAlchemyError

Expand All @@ -24,7 +25,12 @@ def discover(self) -> List[str]:
return inspect(self.engine).get_table_names()

def extract(self, params: extract.RelationalParams) -> pd.DataFrame:
df = pd.read_sql(params.query, con=self.engine)
query = params.query
if "{{today}}" in query:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think variation like {{ today }} and {{today }} should also be valid. Let's look for whats in between {{ and }} and strip out the whitespace. Then we take all the arguments extracted in this fashion and try to match it against a predefined map from [arg name] to func() -> str which replaces that string.

This will let us extend this easily to other sql arguments besides today.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be better way of doing this, but you can just keep track of the indexes in the string that the argument starts and ends at, eg: [(5, 10), (15, 20), (36, 55)], then perform the string replacements in reverse.

The point is that we will want to support in the future 1) other types of sql arguments and 2) multiple sql arguments in a single statement

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that'll be ideal. I am trying to implement a version using RegEx so it extracts all the curly brace.

today_python = date.today()
today_sql = "'" + today_python.strftime("%Y-%m-%d") + "'"
query = query.replace("{{today}}", today_sql)
df = pd.read_sql(query, con=self.engine)
return df

def load(self, params: load.RelationalParams, df: pd.DataFrame) -> None:
Expand Down