-
Notifications
You must be signed in to change notification settings - Fork 180
Oracle adb integration #1392
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Oracle adb integration #1392
Changes from all commits
f5428bc
69f277d
7851d5f
f7f7ed1
6104a1b
ea9e372
11a01f6
2e693f9
5ae340d
23c2736
3b12b8b
9473142
6dc9f87
d947dd3
cb8e8bb
6ff4401
511b15a
6884dd7
b419f18
398dacd
bb07ffe
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -81,6 +81,22 @@ OpenTelemetry zero-code instrumentation is highly configurable. You can set the | |
|
|
||
| [Metrics we are tracing right now](./Metrics.md) | ||
|
|
||
| ## Oracle 19c Compatibility | ||
|
|
||
| WrenAI includes a custom Oracle 19c dialect override for generating compatible SQL syntax. See [Oracle 19c Dialect Documentation](docs/oracle_19c_dialect.md) for details. | ||
|
|
||
| ### Testing Oracle 19c Dialect | ||
|
|
||
| Run the Oracle 19c test suite (34 tests): | ||
| ```bash | ||
| # Using alias (if configured) | ||
| wren-test-oracle | ||
|
Comment on lines
+92
to
+93
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🧩 Analysis chain🏁 Script executed: # Search for the wren-test-oracle alias and related configuration
rg 'wren-test-oracle' --type-list
# Also search in common config files
fd -type f \( -name 'Makefile' -o -name 'justfile' -o -name '*.sh' -o -name '.*rc' \) | head -20Repository: Canner/wren-engine Length of output: 622 🏁 Script executed: # Check if ibis-server/README.md exists and read the relevant section
if [ -f "ibis-server/README.md" ]; then
echo "=== Lines 85-100 of ibis-server/README.md ==="
sed -n '85,100p' ibis-server/README.md
echo ""
echo "=== Searching for wren-test-oracle in entire README ==="
grep -n 'wren-test-oracle' ibis-server/README.md || echo "Not found in README"
fiRepository: Canner/wren-engine Length of output: 643 🏁 Script executed: # Search for wren-test-oracle definition
rg 'wren-test-oracle' -A 3 -B 3
# Also check common configuration files
echo "=== Checking for Makefile ==="
find . -name 'Makefile' -o -name 'justfile' | head -10
echo ""
echo "=== Checking for shell scripts with alias definitions ==="
find . -type f -name '*.sh' | xargs grep -l 'alias' 2>/dev/null | head -10
echo ""
echo "=== Checking for .bashrc or .zshrc patterns ==="
find . -type f \( -name '.bashrc' -o -name '.zshrc' -o -name '.aliases' \) 2>/dev/nullRepository: Canner/wren-engine Length of output: 2008 🏁 Script executed: # Check the justfile for alias or test commands
echo "=== ibis-server/justfile ==="
cat ibis-server/justfile
echo ""
echo "=== oracle_19c_dialect.md - Check configuration section ==="
sed -n '1,100p' ibis-server/docs/oracle_19c_dialect.md | tail -60Repository: Canner/wren-engine Length of output: 3308 Add setup instructions or link to oracle_19c_dialect.md documentation. The
🤖 Prompt for AI Agents |
||
|
|
||
| # Full command | ||
| cd ibis-server | ||
| PYTHONPATH=. poetry run pytest tests/custom_sqlglot/ -m oracle19c -v --confcutdir=tests/custom_sqlglot | ||
| ``` | ||
|
|
||
| ## Contributing | ||
| Please see [CONTRIBUTING.md](docs/CONTRIBUTING.md) for more information. | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,3 +1,4 @@ | ||
| # ruff: noqa: F401 | ||
|
|
||
| from app.custom_sqlglot.dialects.mysql import MySQL | ||
| from app.custom_sqlglot.dialects.oracle import Oracle |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,93 @@ | ||
| from loguru import logger | ||
| from sqlglot import exp | ||
| from sqlglot.dialects.oracle import Oracle as OriginalOracle | ||
|
|
||
|
|
||
| class Oracle(OriginalOracle): | ||
| """ | ||
| Custom Oracle dialect for Oracle 19c compatibility. | ||
|
|
||
| Overrides SQLGlot's default Oracle dialect to fix specific Oracle 19c issues: | ||
| - TIMESTAMPTZ → TIMESTAMP type mapping (avoids timezone format issues) | ||
| - CAST timestamp literals with explicit TO_TIMESTAMP format (fixes ORA-01843) | ||
| - BOOLEAN → CHAR(1) type mapping (user's boolean representation pattern) | ||
|
|
||
| Note: INTERVAL syntax is fully supported in Oracle 19c and does not need transformation. | ||
|
|
||
| Based on SQLGlot version >=23.4,<26.5 | ||
| """ | ||
|
|
||
| class Generator(OriginalOracle.Generator): | ||
| """Custom generator for Oracle 19c SQL syntax.""" | ||
|
|
||
| TYPE_MAPPING = { | ||
| **OriginalOracle.Generator.TYPE_MAPPING, | ||
| # Oracle 19c doesn't have native BOOLEAN type (21c+ feature) | ||
| # Map to CHAR(1) to match our 'Y'/'N' boolean representation pattern | ||
| exp.DataType.Type.BOOLEAN: "CHAR(1)", | ||
| # Map TIMESTAMPTZ to TIMESTAMP (without timezone) for Oracle 19c | ||
| # Avoids format conversion issues with TIMESTAMP WITH TIME ZONE | ||
| exp.DataType.Type.TIMESTAMPTZ: "TIMESTAMP", | ||
| } | ||
|
|
||
| TRANSFORMS = { | ||
| **OriginalOracle.Generator.TRANSFORMS, | ||
| # Handle CAST to TIMESTAMP with explicit format for string literals (ORA-01843 fix) | ||
| exp.Cast: lambda self, e: self._handle_cast_oracle19c(e), | ||
| } | ||
|
|
||
| def __init__(self, *args, **kwargs): | ||
| """Initialize Oracle 19c generator with logging.""" | ||
| super().__init__(*args, **kwargs) | ||
| logger.debug("Using custom Oracle 19c dialect for SQL generation") | ||
|
|
||
| def _handle_cast_oracle19c(self, expression: exp.Cast) -> str: | ||
| """ | ||
| Handle CAST expressions for Oracle 19c timestamp compatibility. | ||
|
|
||
| Oracle 19c cannot implicitly convert string literals like '2025-11-24 00:00:00' | ||
| when casting to TIMESTAMP. This transform converts: | ||
|
|
||
| CAST('2025-11-24 00:00:00' AS TIMESTAMP) | ||
| → TO_TIMESTAMP('2025-11-24 00:00:00', 'YYYY-MM-DD HH24:MI:SS') | ||
|
|
||
| Only applies when: | ||
| - Source is a string literal | ||
| - Target type is TIMESTAMP or DATE | ||
| - Literal matches YYYY-MM-DD pattern | ||
|
|
||
| Args: | ||
| expression: Cast expression node | ||
|
|
||
| Returns: | ||
| Oracle 19c-compatible SQL string | ||
| """ | ||
| source = expression.this | ||
| target_type = expression.to | ||
|
|
||
| # Check if we're casting to TIMESTAMP or DATE | ||
| if target_type and target_type.this in (exp.DataType.Type.TIMESTAMP, exp.DataType.Type.DATE): | ||
| # Check if source is a string literal | ||
| if isinstance(source, exp.Literal) and source.is_string: | ||
| literal_value = source.this | ||
|
|
||
| # Check if it matches YYYY-MM-DD pattern (with or without time) | ||
| # Pattern: YYYY-MM-DD or YYYY-MM-DD HH:MI:SS | ||
| if literal_value and len(literal_value) >= 10 and literal_value[4] == '-' and literal_value[7] == '-': | ||
| # Determine format based on length | ||
| if len(literal_value) == 10: | ||
| # Just date: YYYY-MM-DD | ||
| format_mask = 'YYYY-MM-DD' | ||
| elif len(literal_value) == 19: | ||
| # Date with time: YYYY-MM-DD HH:MI:SS | ||
| format_mask = 'YYYY-MM-DD HH24:MI:SS' | ||
| else: | ||
| # Other length, use default CAST | ||
| return self.cast_sql(expression) | ||
|
|
||
| # Use TO_TIMESTAMP or TO_DATE with explicit format | ||
| func_name = "TO_TIMESTAMP" if target_type.this == exp.DataType.Type.TIMESTAMP else "TO_DATE" | ||
| return f"{func_name}('{literal_value}', '{format_mask}')" | ||
|
|
||
| # For all other cases, use default CAST behavior | ||
| return self.cast_sql(expression) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -203,19 +203,24 @@ class PostgresConnectionInfo(BaseConnectionInfo): | |
|
|
||
|
|
||
| class OracleConnectionInfo(BaseConnectionInfo): | ||
| host: SecretStr = Field( | ||
| examples=["localhost"], description="the hostname of your database" | ||
| host: SecretStr | None = Field( | ||
| examples=["localhost"], description="the hostname of your database", default=None | ||
| ) | ||
| port: SecretStr = Field(examples=[1521], description="the port of your database") | ||
| database: SecretStr = Field( | ||
| examples=["orcl"], description="the database name of your database" | ||
| port: SecretStr | None = Field(examples=[1521], description="the port of your database", default=None) | ||
| database: SecretStr | None = Field( | ||
| examples=["orcl"], description="the database name of your database", default=None | ||
| ) | ||
| user: SecretStr = Field( | ||
| examples=["admin"], description="the username of your database" | ||
| ) | ||
| password: SecretStr | None = Field( | ||
| examples=["password"], description="the password of your database", default=None | ||
| ) | ||
| dsn: SecretStr | None = Field( | ||
| examples=["(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=host)(PORT=port))(CONNECT_DATA=(SERVICE_NAME=service)))"], | ||
| description="Oracle Data Source Name (DSN) - Alternative to host/port/database configuration", | ||
| default=None | ||
| ) | ||
|
Comment on lines
+206
to
+223
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Missing validation: either DSN or host/port/database must be provided. The current model allows all connection parameters ( Consider adding a Pydantic 🔎 Proposed fix using Pydantic model validator+from pydantic import model_validator
class OracleConnectionInfo(BaseConnectionInfo):
host: SecretStr | None = Field(
examples=["localhost"], description="the hostname of your database", default=None
)
port: SecretStr | None = Field(examples=[1521], description="the port of your database", default=None)
database: SecretStr | None = Field(
examples=["orcl"], description="the database name of your database", default=None
)
user: SecretStr = Field(
examples=["admin"], description="the username of your database"
)
password: SecretStr | None = Field(
examples=["password"], description="the password of your database", default=None
)
dsn: SecretStr | None = Field(
examples=["(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=host)(PORT=port))(CONNECT_DATA=(SERVICE_NAME=service)))"],
description="Oracle Data Source Name (DSN) - Alternative to host/port/database configuration",
default=None
)
+
+ @model_validator(mode='after')
+ def validate_connection_params(self):
+ if not self.dsn and not (self.host and self.port and self.database):
+ raise ValueError("Either 'dsn' or all of 'host', 'port', and 'database' must be provided")
+ return self
🤖 Prompt for AI Agents |
||
|
|
||
|
|
||
| class SnowflakeConnectionInfo(BaseConnectionInfo): | ||
|
|
@@ -261,10 +266,10 @@ class TrinoConnectionInfo(BaseConnectionInfo): | |
|
|
||
| class LocalFileConnectionInfo(BaseConnectionInfo): | ||
| url: SecretStr = Field( | ||
| description="the root path of the local file", default="/", examples=["/data"] | ||
| description="the root path of the local file", examples=["/data"] | ||
| ) | ||
| format: str = Field( | ||
| description="File format", default="csv", examples=["csv", "parquet", "json"] | ||
| description="File format", examples=["csv", "parquet", "json"] | ||
| ) | ||
|
|
||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -1,5 +1,6 @@ | ||||||||||||||
| import base64 | ||||||||||||||
| import importlib | ||||||||||||||
| import logging | ||||||||||||||
| from functools import cache | ||||||||||||||
| from json import loads | ||||||||||||||
| from typing import Any | ||||||||||||||
|
|
@@ -17,6 +18,8 @@ | |||||||||||||
| from ibis.backends.sql.compilers.postgres import compiler as postgres_compiler | ||||||||||||||
| from opentelemetry import trace | ||||||||||||||
|
|
||||||||||||||
| logger = logging.getLogger(__name__) | ||||||||||||||
|
|
||||||||||||||
| from app.model import ( | ||||||||||||||
| ConnectionInfo, | ||||||||||||||
| GcsFileConnectionInfo, | ||||||||||||||
|
|
@@ -31,6 +34,10 @@ | |||||||||||||
| # Override datatypes of ibis | ||||||||||||||
| importlib.import_module("app.custom_ibis.backends.sql.datatypes") | ||||||||||||||
|
|
||||||||||||||
| # Apply Oracle backend patch to fix ORA-00923 error | ||||||||||||||
| from app.model.oracle_backend_patch import patch_oracle_backend | ||||||||||||||
| patch_oracle_backend() | ||||||||||||||
|
|
||||||||||||||
| tracer = trace.get_tracer(__name__) | ||||||||||||||
|
|
||||||||||||||
|
|
||||||||||||||
|
|
@@ -70,10 +77,16 @@ def __init__(self, data_source: DataSource, connection_info: ConnectionInfo): | |||||||||||||
|
|
||||||||||||||
| @tracer.start_as_current_span("connector_query", kind=trace.SpanKind.CLIENT) | ||||||||||||||
| def query(self, sql: str, limit: int) -> pd.DataFrame: | ||||||||||||||
| import sys | ||||||||||||||
| print(f"🔍 CONNECTOR QUERY: {sql}", file=sys.stderr, flush=True) | ||||||||||||||
| print(f"🔍 LIMIT: {limit}", file=sys.stderr, flush=True) | ||||||||||||||
| logger.info(f"🔍 CONNECTOR QUERY: {sql}") | ||||||||||||||
| logger.info(f"🔍 LIMIT: {limit}") | ||||||||||||||
|
Comment on lines
+80
to
+84
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Remove temporary debug logging before merging. These debug print statements appear to be temporary debugging code that should be removed before production:
Suggested cleanup @tracer.start_as_current_span("connector_query", kind=trace.SpanKind.CLIENT)
def query(self, sql: str, limit: int) -> pd.DataFrame:
- import sys
- print(f"🔍 CONNECTOR QUERY: {sql}", file=sys.stderr, flush=True)
- print(f"🔍 LIMIT: {limit}", file=sys.stderr, flush=True)
- logger.info(f"🔍 CONNECTOR QUERY: {sql}")
- logger.info(f"🔍 LIMIT: {limit}")
+ logger.debug(f"Executing query with limit {limit}")
return self.connection.sql(sql).limit(limit).to_pandas()If query logging is needed for debugging, consider:
📝 Committable suggestion
Suggested change
🤖 Prompt for AI Agents |
||||||||||||||
| return self.connection.sql(sql).limit(limit).to_pandas() | ||||||||||||||
|
|
||||||||||||||
| @tracer.start_as_current_span("connector_dry_run", kind=trace.SpanKind.CLIENT) | ||||||||||||||
| def dry_run(self, sql: str) -> None: | ||||||||||||||
| logger.info(f"🔍 CONNECTOR DRY_RUN: {sql}") | ||||||||||||||
| self.connection.sql(sql) | ||||||||||||||
|
|
||||||||||||||
|
|
||||||||||||||
|
|
||||||||||||||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -181,13 +181,24 @@ def get_postgres_connection(info: PostgresConnectionInfo) -> BaseBackend: | |||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||
| @staticmethod | ||||||||||||||||||||||||||||||||||||||
| def get_oracle_connection(info: OracleConnectionInfo) -> BaseBackend: | ||||||||||||||||||||||||||||||||||||||
| return ibis.oracle.connect( | ||||||||||||||||||||||||||||||||||||||
| host=info.host.get_secret_value(), | ||||||||||||||||||||||||||||||||||||||
| port=int(info.port.get_secret_value()), | ||||||||||||||||||||||||||||||||||||||
| database=info.database.get_secret_value(), | ||||||||||||||||||||||||||||||||||||||
| user=info.user.get_secret_value(), | ||||||||||||||||||||||||||||||||||||||
| password=(info.password and info.password.get_secret_value()), | ||||||||||||||||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||||||||||||||||
| # Build connection parameters | ||||||||||||||||||||||||||||||||||||||
| connect_params = { | ||||||||||||||||||||||||||||||||||||||
| "user": info.user.get_secret_value(), | ||||||||||||||||||||||||||||||||||||||
| "password": (info.password and info.password.get_secret_value()), | ||||||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||
| # If DSN is provided, use it directly | ||||||||||||||||||||||||||||||||||||||
| if info.dsn: | ||||||||||||||||||||||||||||||||||||||
| connect_params["dsn"] = info.dsn.get_secret_value() | ||||||||||||||||||||||||||||||||||||||
| else: | ||||||||||||||||||||||||||||||||||||||
| # Otherwise use individual host/port/database parameters | ||||||||||||||||||||||||||||||||||||||
| connect_params.update({ | ||||||||||||||||||||||||||||||||||||||
| "host": info.host.get_secret_value(), | ||||||||||||||||||||||||||||||||||||||
| "port": int(info.port.get_secret_value()), | ||||||||||||||||||||||||||||||||||||||
| "database": info.database.get_secret_value(), | ||||||||||||||||||||||||||||||||||||||
| }) | ||||||||||||||||||||||||||||||||||||||
|
Comment on lines
+193
to
+199
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Potential When 🔎 Proposed fix with explicit validation # If DSN is provided, use it directly
if info.dsn:
connect_params["dsn"] = info.dsn.get_secret_value()
else:
# Otherwise use individual host/port/database parameters
+ if not info.host or not info.port or not info.database:
+ raise ValueError(
+ "Either 'dsn' or all of 'host', 'port', and 'database' must be provided"
+ )
connect_params.update({
"host": info.host.get_secret_value(),
"port": int(info.port.get_secret_value()),
"database": info.database.get_secret_value(),
})📝 Committable suggestion
Suggested change
🤖 Prompt for AI Agents |
||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||
| return ibis.oracle.connect(**connect_params) | ||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||
| @staticmethod | ||||||||||||||||||||||||||||||||||||||
| def get_snowflake_connection(info: SnowflakeConnectionInfo) -> BaseBackend: | ||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Version mismatch: Oracle Instant Client 19.23.0.0.0 does not exist; should be 19.3.0.0.0 or consider upgrading to 21+ or 26ai.
The Dockerfile references version 19.23.0.0.0, which conflates Oracle Database 19c Release Update 19.23 with the Instant Client version. The correct Instant Client version in the 19c family is 19.3.0.0.0. Additionally, using a pinned version makes the Docker build fragile if Oracle removes access to older releases. Consider either:
🤖 Prompt for AI Agents