VIEWS-ONLY ARCHITECTURE:
- Remove table support - discover Oracle VIEWS exclusively
- Optimized for view-based reporting databases
- Aligns with business model where tables are internal

QUICK WINS (Phase 4.2):
1. Dynamic schema owner (fixes hardcoded 'SYSTEM')
   - Extract user from connection_info dynamically
   - Supports any Oracle schema, not just SYSTEM
2. Views-only discovery (simplified from UNION ALL)
   - Query all_views directly, not all_tables
   - Cleaner SQL, faster execution
   - Returns exactly 89 views (not 122 mixed objects)
3. Quoted identifier support for view names with spaces
   - Detects spaces in view names
   - Wraps in double quotes: SCHEMA."RT Customer"
   - Critical for 99% of reporting views
4. Permission-safe version detection
   - Try v$instance, fallback to hardcoded version
   - Never fails on permission issues
   - Returns 19.0.0.0.0 for Oracle ADB 19c

IMPACT:
- Simpler code (single path, not dual)
- Faster metadata discovery (no table queries)
- Better aligned with reporting database architecture

Ref: IMPLEMENTATION_PLAN.md Phase 4.2
Ref: ARCHITECTURAL_DECISION.md
Ref: keep_me/oracle_modifications/oracle.py (proven solutions)
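The quoted-identifier handling described in item 3 can be sketched as a small helper. The function name `quote_view_name` is hypothetical; the actual logic lives in oracle.py and may differ in detail:

```python
def quote_view_name(schema: str, view_name: str) -> str:
    """Build a schema-qualified Oracle identifier, quoting names with spaces.

    Oracle folds unquoted identifiers to uppercase; names containing spaces
    (common for reporting views) must be wrapped in double quotes.
    """
    if " " in view_name:
        # Escape any embedded double quotes, then wrap: SCHEMA."RT Customer"
        escaped = view_name.replace('"', '""')
        return f'{schema}."{escaped}"'
    return f"{schema}.{view_name}"
```

For example, `quote_view_name("REPORTS", "RT Customer")` yields `REPORTS."RT Customer"`, while a name without spaces passes through unquoted.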
- Replaced get_constraints() with intelligent auto-detection algorithm
- Query all_dependencies to map views → base tables
- Query all_constraints to get FK relationships from base tables
- Create view-to-view relationships when views share related tables
- Result: Auto-detect 4,841 relationships across 99 views
- Zero configuration required - all discovered from database metadata
- Handles quoted identifiers for views with spaces
- Graceful fallback if permissions fail

Tested: 4,841 relationships validated with REPORTS_ASSURANTDEV schema
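The auto-detection above boils down to two data-dictionary queries. This is an illustrative sketch only — the bind variable `:owner` and the exact column lists are assumptions; the actual SQL in oracle.py may differ:

```sql
-- Map each view to the base tables it depends on
SELECT d.name AS view_name, d.referenced_name AS base_table
FROM all_dependencies d
WHERE d.owner = :owner
  AND d.type = 'VIEW'
  AND d.referenced_type = 'TABLE';

-- Foreign-key relationships between those base tables
SELECT c.table_name, c.constraint_name, r.table_name AS referenced_table
FROM all_constraints c
JOIN all_constraints r
  ON c.r_constraint_name = r.constraint_name
 AND c.r_owner = r.owner
WHERE c.owner = :owner
  AND c.constraint_type = 'R';
```

Joining the two result sets yields view-to-view relationships wherever two views depend on FK-related base tables.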
- Fixed TypeError in oracle_backend_patch.py from_string() call
- Removed precision and scale parameters that aren't supported by ibis type_mapper
- Simplified to pass only base data type and proper nullable boolean
- This fixes all Oracle queries including preview data, natural language queries, etc.
- Resolves: SqlParseType.from_string() got unexpected keyword argument 'precision'
…DSN support

Oracle 19c Autonomous Database integration for views-only reporting architecture.

Key Changes:
- Oracle Instant Client 19.23 integration in Dockerfile (Debian Bookworm upgrade)
- DSN connection support for Oracle ADB (alternative to host/port/database)
- Views-only metadata discovery (no tables exposed, optimized for reporting DBs)
- Quoted identifier support for Oracle views/columns with spaces
- Disabled relationship auto-detection (returns empty list to prevent 30+ min hangs)

Technical Details:
- oracle.py: Returns quoted table names like '"RT SN Claim"' (no schema prefix)
- data_source.py: DSN parameter support with fallback to host/port/database
- connector.py: Oracle backend patch integration, debug logging added
- __init__.py: Made host/port/database optional for OracleConnectionInfo
- Dockerfile: Oracle Instant Client libs, LD_LIBRARY_PATH, ORACLE_HOME env vars

Architecture:
- Views represent business entities for reporting queries
- Relationships must be manually defined in UI (no FK metadata in views)
- Permission-safe version detection (fallback to 19.0.0.0.0)
- Dynamic user extraction from connection info (no hardcoded schema names)

Testing:
- Tested with REPORTS_ASSURANTDEV user on Oracle ADB 19c
- 99 views discovered successfully with quoted identifiers preserved
- Manual relationship definition confirmed working
- Data preview working on views with spaces in names

Related: Main repo commit e9332392 contains companion wren-ui changes
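The DSN-or-host/port fallback described for data_source.py can be sketched as a parameter-selection helper. The function `build_connect_kwargs` is hypothetical, shown only to illustrate the precedence; the real code passes these values into the ibis Oracle backend:

```python
from typing import Optional


def build_connect_kwargs(
    user: str,
    password: str,
    dsn: Optional[str] = None,
    host: Optional[str] = None,
    port: Optional[int] = None,
    database: Optional[str] = None,
) -> dict:
    """Prefer a full DSN (Oracle ADB wallet-style connect string); otherwise
    fall back to discrete host/port/database parameters."""
    kwargs = {"user": user, "password": password}
    if dsn:
        kwargs["dsn"] = dsn
    elif host and port and database:
        kwargs.update(host=host, port=port, database=database)
    else:
        raise ValueError("Provide either dsn or host/port/database")
    return kwargs
```

This mirrors why host/port/database were made optional on OracleConnectionInfo: when a DSN is present, the discrete fields are simply unused.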
Create custom Oracle 19c dialect class to override SQLGlot's default Oracle dialect for 19c compatibility.

Implementation:
- Created app/custom_sqlglot/dialects/oracle.py
- Oracle class inherits from SQLGlot's Oracle dialect
- Generator subclass prepared for future overrides
- Comprehensive docstring documenting compatibility goals

Implements: COMP-001, FR-004
Task: P1.001 - Create Oracle 19c Dialect File Structure
Phase: 1 - Foundation
Related: SCAIS-23
Register Oracle 19c dialect in dialects module for automatic SQLGlot discovery.

Implementation:
- Added Oracle import to app/custom_sqlglot/dialects/__init__.py
- Follows existing MySQL dialect registration pattern
- Enables automatic dialect override when write='oracle'

Implements: COMP-005, FR-004
Task: P1.002 - Register Custom Oracle Dialect
Phase: 1 - Foundation
Depends On: P1.001
Related: SCAIS-23
Add debug logging to track when Oracle 19c dialect is used for SQL generation.

Implementation:
- Added loguru import to oracle.py
- Implemented Generator.__init__() with logging
- Debug log message for observability
- Maintains parent functionality via super()

Implements: COMP-001, INT-002, NFR-005
Task: P1.004 - Add Logging for Dialect Usage
Phase: 1 - Foundation (Complete)
Depends On: P1.001
Related: SCAIS-23
…ibility

- Map BOOLEAN type to CHAR(1) for Oracle 19c compatibility
- Oracle 19c doesn't support native BOOLEAN type (21c+ feature)
- Follows Y/N convention for boolean representation
- Inherits all other type mappings from base Oracle dialect
- Updates docstring to reflect CHAR(1) Y/N convention

Implements: SCAIS-23 P2.001 (COMP-002, FR-004)
Related: Oracle 19c compatibility requirement
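Because values stored in a CHAR(1) column must round-trip to Python booleans, application code needs a small encode/decode convention. A minimal sketch of the Y/N convention named above (hypothetical helper names, not part of the dialect itself):

```python
def bool_to_char(value: bool) -> str:
    """Encode a Python bool for an Oracle 19c CHAR(1) column (Y/N convention)."""
    return "Y" if value else "N"


def char_to_bool(value: str) -> bool:
    """Decode a CHAR(1) Y/N flag read back from Oracle."""
    if value not in ("Y", "N"):
        raise ValueError(f"Unexpected boolean flag: {value!r}")
    return value == "Y"
```

Oracle 23ai finally added a native BOOLEAN type, but on 19c this CHAR(1) convention remains the common workaround.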
- Add _dateadd_oracle19c method to handle DateAdd expressions
- Convert INTERVAL DAY syntax to numeric addition (date + n)
- Include warning logging for unsupported interval units
- Fallback to default behavior for non-INTERVAL expressions
- Handles uppercase unit comparison and null safety

Implements: SCAIS-23 P2.002 (COMP-003, FR-001)
Related: Oracle 19c date arithmetic compatibility
- Add _datesub_oracle19c method to handle DateSub expressions
- Convert INTERVAL DAY syntax to numeric subtraction (date - n)
- Include warning logging for unsupported interval units
- Fallback to default behavior for non-INTERVAL expressions
- Mirrors _dateadd_oracle19c pattern for consistency

Implements: SCAIS-23 P2.003 (COMP-003, FR-001)
Related: Oracle 19c date arithmetic compatibility
- Add MONTH unit support to _dateadd_oracle19c method
- Add MONTH unit support to _datesub_oracle19c method
- Use ADD_MONTHS() function for month arithmetic
- Handle month-end edge cases correctly (e.g., Jan 31 + 1 month = Feb 28/29)
- Update docstrings with MONTH examples
- Add explanatory comments for edge case handling

Implements: SCAIS-23 P2.004 (COMP-003, FR-001)
Related: Oracle 19c date arithmetic compatibility
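Oracle's ADD_MONTHS clamps the day to the last day of the target month, which is the edge case mentioned above. A simplified pure-Python model of that behavior (an assumption for illustration — Oracle additionally snaps month-end inputs to month-end outputs, e.g. Feb 28 + 1 month = Mar 31, which this sketch omits):

```python
import calendar
import datetime


def add_months(d: datetime.date, n: int) -> datetime.date:
    """Model Oracle's ADD_MONTHS: shift by n months, clamping the day to the
    last day of the target month (e.g. Jan 31 + 1 month -> Feb 28/29)."""
    month_index = d.month - 1 + n  # zero-based month counter across years
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return datetime.date(year, month, day)
```

YEAR arithmetic (the next commit) reuses the same function with `n * 12` months, which is why leap-year cases fall out of ADD_MONTHS for free.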
- Add YEAR unit support to _dateadd_oracle19c method
- Add YEAR unit support to _datesub_oracle19c method
- Convert years to months (n * 12) using ADD_MONTHS() function
- Handle leap year edge cases correctly through ADD_MONTHS
- Update docstrings with YEAR unit examples
- Add explanatory comments for conversion logic

Implements: SCAIS-23 P2.005 (COMP-003, FR-001)
Related: Oracle 19c date arithmetic compatibility
…onary

- Add TRANSFORMS dictionary to Oracle.Generator class
- Register exp.DateAdd -> _dateadd_oracle19c mapping
- Register exp.DateSub -> _datesub_oracle19c mapping
- Inherit base transforms from OriginalOracle.Generator
- Use lambda functions for proper instance method binding
- Add comments explaining transform registration purpose

CRITICAL: This activates all date arithmetic functionality from P2.002-P2.005.
SQLGlot will now call custom methods when encountering DateAdd/DateSub AST nodes.

Implements: SCAIS-23 P2.006 (COMP-003, FR-001)
Related: Oracle 19c date arithmetic transform activation
- Create comprehensive pagination validation test suite
- Verify LIMIT converts to FETCH FIRST (Oracle 12c+ syntax)
- Verify OFFSET syntax compatibility
- Test pagination with WHERE, ORDER BY, JOIN clauses
- Validate custom dialect preserves pagination behavior
- Confirm integration with date arithmetic transforms

VALIDATION RESULT: SQLGlot already generates Oracle 19c-compatible pagination syntax. No custom implementation needed.

Implements: SCAIS-23 P2.007 (FR-002)
Related: Oracle 19c pagination compatibility validation
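The syntax difference these tests validate can be illustrated with a toy rewriter. SQLGlot performs this conversion internally during dialect generation; the regex helper below is purely a hypothetical sketch of the target Oracle 12c+ row-limiting syntax, not how SQLGlot works:

```python
import re


def limit_to_fetch_first(sql: str) -> str:
    """Illustrative rewrite of a trailing 'LIMIT n [OFFSET m]' into the
    Oracle 12c+ 'OFFSET m ROWS FETCH FIRST n ROWS ONLY' form."""
    match = re.search(r"\bLIMIT\s+(\d+)(?:\s+OFFSET\s+(\d+))?\s*$", sql, re.IGNORECASE)
    if not match:
        return sql  # no trailing LIMIT clause; leave the statement untouched
    n, m = match.group(1), match.group(2)
    clause = f"FETCH FIRST {n} ROWS ONLY"
    if m:
        clause = f"OFFSET {m} ROWS " + clause
    return sql[:match.start()].rstrip() + " " + clause
```

Oracle versions before 12c required `ROWNUM` subqueries for the same effect, which is why confirming FETCH FIRST output matters for 19c.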
Fix value extraction bug in Oracle 19c date arithmetic transformations.
Previously, self.sql(interval, 'this') returned quoted strings ('1')
instead of raw numbers (1), causing incorrect SQL generation.
Changes:
- Access interval.this.this directly to get raw Literal value
- Add fallback to self.sql() for non-Literal expressions
- Applies to both _dateadd_oracle19c() and _datesub_oracle19c()
Result:
- DAY: hire_date + INTERVAL '1' DAY → hire_date + 1 (not hire_date + '1')
- MONTH: hire_date + INTERVAL '3' MONTH → ADD_MONTHS(hire_date, 3)
- YEAR: hire_date + INTERVAL '1' YEAR → ADD_MONTHS(hire_date, 1 * 12)
All standalone tests passing (9/9).
Related: SCAIS-23 P3.001
…etic

Add unit tests validating date arithmetic transformations for Oracle 19c.

Test Coverage:
- DAY unit: Addition and subtraction with numeric values
- MONTH unit: ADD_MONTHS function for addition and subtraction
- YEAR unit: ADD_MONTHS with year-to-month conversion (*12)
- Complex scenarios: WHERE clauses, mixed operations, multiple intervals

Files:
- test_oracle_19c_dialect.py: 14 comprehensive pytest tests (163 lines)
- test_oracle_19c_standalone.py: 9 standalone tests (137 lines)

Standalone tests can run without full app setup:
poetry run python tests/custom_sqlglot/test_oracle_19c_standalone.py

All tests passing (9/9 standalone validated).

Related: SCAIS-23 P3.001
Major test infrastructure improvements:
1. Isolated conftest.py in tests/custom_sqlglot/ (no app dependencies)
2. Added pytest markers for Oracle 19c dialect tests
3. Created reusable test fixtures (transpile_trino_to_oracle, etc.)
4. Refactored date arithmetic tests to use fixtures
5. Renamed test file for clarity: test_oracle_19c_dialect.py → test_oracle_19c_date_arithmetic.py
6. Removed standalone test runner (no longer needed)

New Type Mapping Tests (P3.002):
- test_boolean_type_maps_to_char1: Core requirement validation
- test_type_mapping_inheritance: Verify spread operator works
- test_date_type_preserved: DATE type unchanged (if present)
- test_timestamp_type_preserved: TIMESTAMP type unchanged (if present)
- test_varchar_type_preserved: VARCHAR maps to VARCHAR2
- test_number_type_preserved: Numeric types unchanged
- test_boolean_override_is_intentional: Document override behavior
- test_oracle_specific_types: Parametrized tests for CHAR, NCHAR, VARCHAR, NVARCHAR

Test Results:
- 25 Oracle 19c tests passing (14 date arithmetic + 11 type mapping)
- Tests run in 0.10s (previously couldn't run with pytest)
- No app setup required (WREN_ENGINE_ENDPOINT not needed)

Shell Helper:
- Created ~/wren-test-oracle.sh for easy test execution
- Usage: bash ~/wren-test-oracle.sh or wren-test-oracle (aliased)

Related: SCAIS-23 P3.002
- Create test_oracle_19c_integration.py with 9 comprehensive tests
- Validate full transpilation flow from Trino to Oracle 19c
- Test date arithmetic, type mapping, and combined scenarios
- Include realistic queries: JOINs, subqueries, CTEs, CASE expressions
- Verify 19c syntax present and 21c+ syntax absent
- All 34 Oracle 19c tests passing (14 date + 11 type + 9 integration)

Implements: P3.003 - Integration Tests for Full Transpilation
References: SCAIS-23, TEST-002, US-001, INT-001
- Create oracle_19c_dialect.md with complete usage guide
- Update README.md with Oracle 19c testing instructions
- Document all 34 tests (14 date + 11 type + 9 integration)
- Include examples, validation, and known limitations
- Add troubleshooting and support information

Implements: P3.008 - Document Test Execution
References: SCAIS-23
- Map TIMESTAMPTZ to TIMESTAMP in metadata extraction (oracle.py)
- Oracle 19c doesn't support TIMESTAMP WITH TIME ZONE
- Added test files for validation
- Resolves ORA-01843 errors when querying TIMESTAMPTZ columns
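The metadata-side type normalization described across these commits (precision stripping plus the TIMESTAMPTZ→TIMESTAMP mapping) can be sketched as a single helper. The name `normalize_oracle_type` is hypothetical; it mirrors the behavior described for `_transform_column_type` in oracle.py:

```python
import re


def normalize_oracle_type(data_type: str) -> str:
    """Strip precision/scale qualifiers and map time-zone-aware timestamps
    to plain TIMESTAMP, mirroring the metadata mapping described above."""
    # "TIMESTAMP(6) WITH TIME ZONE" -> "TIMESTAMP WITH TIME ZONE",
    # "NUMBER(10,2)" -> "NUMBER"
    normalized = re.sub(r"\([^)]*\)", "", data_type.upper()).strip()
    # Collapse any doubled whitespace left behind by the removal
    normalized = re.sub(r"\s+", " ", normalized)
    if normalized in ("TIMESTAMP WITH TIME ZONE", "TIMESTAMP WITH LOCAL TIME ZONE"):
        return "TIMESTAMP"
    return normalized
```

Normalizing before the type-mapping lookup means a dictionary keyed on bare type names ("NUMBER", "TIMESTAMP", "VARCHAR2") handles every precision variant Oracle reports.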
Walkthrough

Introduces Oracle 19c database support to ibis-server through a custom SQLGlot dialect, backend patching to fix metadata schema queries, views-based metadata implementation, and Docker environment updates with Oracle Instant Client. Includes DSN connection options, type mapping overrides (BOOLEAN to CHAR(1), TIMESTAMPTZ to TIMESTAMP), custom CAST handling, and extensive test coverage.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Client Request
    participant Connector as SimpleConnector
    participant Datasource as DataSource
    participant Ibis as Ibis Oracle Backend
    participant Patch as PatchedOracleBackend
    participant Oracle as Oracle Database
    Client->>Connector: query(sql)
    Note over Connector: Logs SQL & limit
    Connector->>Datasource: get_oracle_connection()
    alt DSN Provided
        Datasource->>Datasource: use_dsn_params
    else No DSN
        Datasource->>Datasource: use_host/port/database
    end
    Datasource->>Ibis: ibis.oracle.connect(params)
    Ibis->>Patch: _get_schema_using_query(sql)
    Note over Patch: Parse & transform query
    Patch->>Oracle: CREATE TEMPORARY VIEW
    Patch->>Oracle: Query all_tab_columns with CASE WHEN
    Oracle-->>Patch: Column metadata
    Patch->>Patch: Build Schema (with nullability)
    Patch->>Oracle: DROP TEMPORARY VIEW
    Patch-->>Ibis: Schema
    Ibis-->>Connector: Connection/Execution Result
    Connector-->>Client: Query Result
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 15
🧹 Nitpick comments (20)
wren-core-legacy/docker/Dockerfile.build (1)
9-9: Prefer selective file copying over copying everything.

Copying all files (`.`) includes unnecessary artifacts like `.git`, IDE configurations, and build outputs, which increases image size and build context transfer time. Consider using a `.dockerignore` file or copying only required directories (similar to Dockerfile.multistage's approach).

🔎 Example of selective copying
```diff
-# Copy all source files
-COPY . .
+# Copy Maven wrapper and source files
+COPY .mvn .mvn
+COPY mvnw .
+COPY pom.xml .
+COPY wren-base ./wren-base
+COPY wren-main ./wren-main
+COPY wren-server ./wren-server
+COPY wren-tests ./wren-tests
+COPY trino-parser ./trino-parser
```

ibis-server/app/model/metadata/oracle.py (3)
3-3: Unnecessary import from `typing` in Python 3.11+.

The `List` import from `typing` is not needed. Python 3.11 supports built-in generic type hints like `list[Constraint]`.

🔎 Proposed fix

```diff
-from typing import List
```

And update line 127:

```diff
- def get_constraints(self) -> List[Constraint]:
+ def get_constraints(self) -> list[Constraint]:
```
311-315: Move `import re` to module level.

The `re` module is imported inside the function, which is executed for every column. While Python caches imports, placing it at the module level is more idiomatic.

🔎 Proposed fix

Add at the top of the file with other imports:

```python
import re
```

Then remove line 314:

```diff
 def _transform_column_type(self, data_type):
     # Strip precision/scale qualifiers from Oracle types before mapping
     # e.g., "TIMESTAMP(6)" -> "TIMESTAMP", "NUMBER(10,2)" -> "NUMBER"
-    import re
     normalized_type = re.sub(r'\([^)]*\)', '', data_type.upper()).strip()
```
275-286: Reasonable fallback, but consider logging the exception.

The fallback to a hardcoded version is pragmatic for environments without `v$instance` permissions. Consider logging the exception for debugging purposes rather than silently swallowing it.

🔎 Proposed enhancement

```diff
+from loguru import logger
+
     except Exception:
-        # Fallback: Return hardcoded version for Oracle ADB 19c
-        # This ensures metadata discovery never fails on version check
+        # Fallback: Return hardcoded version for Oracle ADB 19c
+        logger.debug("Could not query v$instance, using default version")
         return "19.0.0.0.0"
```

ibis-server/tests/custom_sqlglot/test_oracle_19c_date_arithmetic.py (3)
86-89: Consider more flexible assertion for generated SQL.

The assertion checks for specific formatting patterns like `", 2 * 12)"` with spacing variations. While this works, it's somewhat brittle to SQL formatting changes.

Alternative approach using regex or AST validation:

```diff
-    assert "ADD_MONTHS(" in result.upper()
-    assert "* 12)" in result
-    assert ", 2 * 12)" in result or ", 2*12)" in result.replace(" ", "")
-    assert "INTERVAL" not in result.upper()
+    assert "ADD_MONTHS(" in result.upper()
+    # Use regex to match ADD_MONTHS with multiplication, ignoring whitespace
+    import re
+    assert re.search(r'ADD_MONTHS\([^,]+,\s*2\s*\*\s*12\)', result, re.IGNORECASE)
+    assert "INTERVAL" not in result.upper()
```

Alternatively, parse the result and validate the AST structure rather than string matching.
96-99: Same brittle assertion pattern as year addition test.

The multiple spacing variations (`"-(1 * 12)"`, `"-( 1 * 12 )"`, etc.) make the test fragile to formatting changes.

Consider using regex pattern matching as suggested in the year addition test above.
170-172: Brittle assertion with multiple spacing variations.

Similar to previous comments, the multiple string patterns for matching the multiplication syntax are fragile.
Apply the same regex-based approach suggested in earlier comments for consistency across the test suite.
ibis-server/app/model/connector.py (1)
89-89: Adjust dry_run logging level.

Similar to the query method, this debug logging with emoji should be adjusted. If logging is needed, use `logger.debug()` and remove the emoji.

```diff
 @tracer.start_as_current_span("connector_dry_run", kind=trace.SpanKind.CLIENT)
 def dry_run(self, sql: str) -> None:
-    logger.info(f"🔍 CONNECTOR DRY_RUN: {sql}")
+    logger.debug("Executing dry run")
     self.connection.sql(sql)
```

ibis-server/test_default_oracle.py (1)
1-27: Relocate diagnostic script or convert to proper tests.

This file appears to be an exploratory/diagnostic script rather than a proper test suite:
- No pytest structure (no test functions, no assertions)
- Uses print statements for manual inspection
- Located at repository root instead of tests/ directory
Recommendation: Choose one of these options:
- Convert to proper pytest tests (if validation is needed):
```python
"""Test SQLGlot's default Oracle dialect behavior."""
import pytest
import sqlglot


@pytest.mark.oracle_dialect
def test_default_oracle_interval_preservation():
    """Verify default Oracle dialect preserves INTERVAL syntax."""
    sql = "SELECT systimestamp + INTERVAL '1' DAY FROM dual"
    result = sqlglot.transpile(sql, read='oracle', write='oracle')
    assert "INTERVAL '1' DAY" in result[0]


@pytest.mark.oracle_dialect
def test_trino_to_oracle_transpilation():
    """Verify Trino to Oracle transpilation."""
    sql = "SELECT order_date + INTERVAL '1' DAY FROM orders"
    result = sqlglot.transpile(sql, read='trino', write='oracle')
    assert result[0]  # At minimum, verify it doesn't error
```
- Move to scripts/ directory if this is meant for manual diagnostics:
mv test_default_oracle.py scripts/debug_oracle_dialect.py
- Remove if no longer needed for development.
ibis-server/test_interval_support.py (1)
1-37: Relocate or convert diagnostic script to proper tests.

This diagnostic script has the same structural issues as `test_default_oracle.py`:
- No pytest test structure or assertions
- Uses print statements for manual inspection
- Located at repository root
Apply the same remediation as suggested for `test_default_oracle.py`:
- Convert to pytest tests under `tests/custom_sqlglot/`:

```python
"""Tests for Oracle 19c INTERVAL support."""
import pytest
import sqlglot

from app.custom_sqlglot.dialects.oracle import Oracle


@pytest.mark.oracle19c
class TestOracleIntervalSupport:
    def test_default_dialect_preserves_interval(self):
        sql = "SELECT order_date + INTERVAL '1' DAY FROM orders"
        result = sqlglot.transpile(sql, read='trino', write='oracle')
        assert "INTERVAL" in result[0].upper()

    def test_custom_dialect_transforms_interval(self):
        sql = "SELECT order_date + INTERVAL '1' DAY FROM orders"
        result = sqlglot.transpile(sql, read='trino', write=Oracle)
        # Add appropriate assertions based on expected behavior
        assert result[0]
```
- Move to scripts/ if needed for debugging:
mv test_interval_support.py scripts/debug_interval_support.py
- Remove if obsolete.
ibis-server/test_timestamptz_issue.py (1)
1-66: Relocate or convert TIMESTAMPTZ diagnostic script.This is the third diagnostic script at the root level with similar issues:
- No pytest structure or assertions
- Manual execution required
- Try-except blocks with print output for inspection
Recommendation: Convert to proper tests under `tests/custom_sqlglot/`:

```python
"""Tests for Oracle TIMESTAMPTZ handling."""
import pytest
import sqlglot

from app.custom_sqlglot.dialects.oracle import Oracle


@pytest.mark.oracle19c
class TestOracleTimestampTZ:
    def test_trino_to_oracle_timestamptz_cast(self):
        """Verify TIMESTAMPTZ CAST transpilation doesn't error."""
        sql = "SELECT CAST(col AS TIMESTAMPTZ) FROM tbl"
        # Should not raise exception
        result = sqlglot.transpile(sql, read='trino', write=Oracle)
        assert result[0]  # Add assertions for expected Oracle output

    def test_timestamptz_type_mapping(self):
        """Verify TIMESTAMPTZ maps to TIMESTAMP in Oracle 19c."""
        sql = "SELECT CAST(col AS TIMESTAMPTZ) FROM tbl"
        result = sqlglot.transpile(sql, read='trino', write=Oracle)
        # Based on Oracle dialect TYPE_MAPPING, should convert to TIMESTAMP
        assert "TIMESTAMP" in result[0].upper()

    @pytest.mark.parametrize("cast_type", [
        "TIMESTAMP",
        "TIMESTAMP WITH TIME ZONE",
    ])
    def test_oracle_timestamp_variants(self, cast_type):
        """Test Oracle timestamp type handling."""
        sql = f"SELECT CAST(col AS {cast_type}) FROM tbl"
        result = sqlglot.transpile(sql, read='oracle', write='oracle')
        assert result[0]  # Should not error
```

If these scripts are still useful for debugging, consolidate them into a single `scripts/debug_oracle_dialect.py` file.

ibis-server/test_oracle_views.py (2)
19-20: Avoid `sys.path` manipulation; consider relocating the test file.

The `sys.path.insert` is a code smell and may cause import issues. This file appears to be at the repository root (ibis-server/test_oracle_views.py) rather than in tests/. Consider moving it to the proper test directory structure.
115-115: Testing private method `_format_compact_table_name` directly.

Accessing private methods (prefixed with `_`) in tests couples the test to implementation details. If the method is essential for testing, consider making it public or testing through the public API.
133-134: Remove user-specific file references from shared documentation.

References to `~/.bashrc` and `~/wren-test-oracle.sh` are user-specific and should not be in project documentation. These belong in a local setup guide or developer notes, not in the tracked docs.

🔎 Suggested fix

```diff
 ### Configuration
 - ✅ `pyproject.toml` - Added pytest markers (oracle19c, dialect, type_mapping, date_arithmetic)
-- ✅ `~/wren-test-oracle.sh` - Shell script for easy test execution
-- ✅ `~/.bashrc` - Added aliases (wren-test-oracle, wren-poetry, wren-cd)
+
+For local convenience, you may create shell aliases for test execution (see Running Tests section above).
```

ibis-server/tests/custom_sqlglot/test_oracle_pagination.py (1)
124-128: Assertion is fragile and has inconsistent case handling.

The assertion `"- 7" in result or "ADD_MONTHS" in result.upper()` mixes case-sensitive and case-insensitive checks. The space in `"- 7"` is also fragile as output formatting may vary.

🔎 Suggested fix for consistency

```diff
     # Should have both date arithmetic fix AND pagination
     assert "FETCH FIRST 20 ROWS ONLY" in result.upper()
     # Date arithmetic should use numeric subtraction (from P2.002-P2.006)
-    assert "- 7" in result or "ADD_MONTHS" in result.upper()
+    result_upper = result.upper()
+    assert "-7" in result.replace(" ", "") or "- 7" in result or "ADD_MONTHS" in result_upper, \
+        f"Expected date arithmetic transformation, got: {result}"
     assert "LIMIT" not in result.upper()
```

ibis-server/app/model/oracle_backend_patch.py (2)
68-78: Consider using parameterized query for metadata lookup.

While `name` is generated internally via `gen_name()`, building SQL with f-strings is a pattern that could lead to injection if reused elsewhere. Using bind parameters would be safer.

🔎 Suggested improvement using bind parameters

```diff
         # Fixed version: Use raw SQL with CASE WHEN
-        metadata_sql = f"""
+        # Use bind parameter for table_name to follow safe SQL practices
+        metadata_sql = """
             SELECT column_name, data_type, data_precision, data_scale,
                    CASE WHEN nullable = 'Y' THEN 1 ELSE 0 END as is_nullable
             FROM all_tab_columns
-            WHERE table_name = '{name}'
+            WHERE table_name = :view_name
             ORDER BY column_id
         """
         logger.debug(f"Querying metadata: {metadata_sql}")
-        results = con.execute(metadata_sql).fetchall()
+        results = con.execute(metadata_sql, {"view_name": name}).fetchall()
```
40-46: Complex lambda transformer is hard to read and maintain.

The inline lambda with multiple conditions reduces readability. Consider extracting to a named function with clearer documentation.

🔎 Suggested refactor for clarity

```diff
+    def _quote_columns(node):
+        """Add quoting to Column nodes except in ORDER BY clauses."""
+        if (isinstance(node, sg.exp.Column)
+                and not node.this.quoted
+                and not isinstance(node.parent, sg.exp.Order)):
+            return node.__class__(this=node.this, quoted=True)
+        return node
+
     with self.begin() as con:
         # Parse and transform the query
         sg_expr = sg.parse_one(query, dialect=dialect)
-        # Apply ibis's transformer to add quoting
-        transformer = lambda node: (
-            node.__class__(this=node.this, quoted=True)
-            if isinstance(node, sg.exp.Column)
-            and not node.this.quoted
-            and not isinstance(node.parent, sg.exp.Order)
-            else node
-        )
-        sg_expr = sg_expr.transform(transformer)
+        # Apply transformer to add quoting
+        sg_expr = sg_expr.transform(_quote_columns)
```

ibis-server/tests/custom_sqlglot/test_oracle_19c_type_mapping.py (1)
104-109: Prefer `pytest.skip()` or logging over `print()` for test diagnostics.

Using `print()` statements for informational output in tests is not ideal as pytest captures stdout by default. Consider using `pytest.warns()`, logging, or simply removing informational prints.

🔎 Suggested approach

```diff
         # Our mapping should be CHAR(1)
         assert our_mapping == "CHAR(1)"
-        # Document if we're overriding base
-        # (this is informational, not a failure condition)
-        if base_mapping != "CHAR(1)":
-            print(f"INFO: Overriding base BOOLEAN mapping '{base_mapping}' with 'CHAR(1)'")
+        # The override from base is expected behavior, no action needed
     else:
         # Base doesn't have BOOLEAN, we're adding it
         assert oracle_type_mapping[exp.DataType.Type.BOOLEAN] == "CHAR(1)"
-        print("INFO: Adding BOOLEAN → CHAR(1) mapping (not in base)")
+        # Adding BOOLEAN mapping is expected when base lacks it
```

ibis-server/app/custom_sqlglot/dialects/oracle.py (1)
74-90: Pattern matching could be more robust; consider quote escaping.

Two minor concerns:

1. Weak pattern validation (line 76): The check only validates dash positions but not that other characters are digits. A malformed string like `"abcd-ef-gh"` would pass and produce invalid Oracle SQL instead of falling back to default CAST.

2. Quote escaping (line 90): If `literal_value` contains a single quote, the generated SQL would be malformed. While rare for date literals, defensive handling would be safer.

🔎 Suggested improvements

```diff
-        # Check if it matches YYYY-MM-DD pattern (with or without time)
-        # Pattern: YYYY-MM-DD or YYYY-MM-DD HH:MI:SS
-        if literal_value and len(literal_value) >= 10 and literal_value[4] == '-' and literal_value[7] == '-':
+        import re
+        # Validate YYYY-MM-DD pattern more strictly
+        date_pattern = r'^\d{4}-\d{2}-\d{2}$'
+        datetime_pattern = r'^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$'
+
+        if literal_value and (re.match(date_pattern, literal_value) or re.match(datetime_pattern, literal_value)):
             # Determine format based on length
             if len(literal_value) == 10:
                 # Just date: YYYY-MM-DD
                 format_mask = 'YYYY-MM-DD'
-            elif len(literal_value) == 19:
+            else:  # len == 19, validated by regex
                 # Date with time: YYYY-MM-DD HH:MI:SS
                 format_mask = 'YYYY-MM-DD HH24:MI:SS'
-            else:
-                # Other length, use default CAST
-                return self.cast_sql(expression)
             # Use TO_TIMESTAMP or TO_DATE with explicit format
             func_name = "TO_TIMESTAMP" if target_type.this == exp.DataType.Type.TIMESTAMP else "TO_DATE"
-            return f"{func_name}('{literal_value}', '{format_mask}')"
+            # Escape single quotes in literal value
+            escaped_value = literal_value.replace("'", "''")
+            return f"{func_name}('{escaped_value}', '{format_mask}')"
```

Note: Move the `import re` to the top of the file if you apply this change.

ibis-server/tests/custom_sqlglot/test_oracle_19c_integration.py (1)
36-74: Consider adding test coverage for CAST to TIMESTAMP/DATE handling.

The `_handle_cast_oracle19c` method in the dialect converts `CAST('2025-11-24' AS TIMESTAMP)` to `TO_TIMESTAMP('2025-11-24', 'YYYY-MM-DD')` to fix ORA-01843 errors. This feature isn't exercised by any of the current integration tests.

🔎 Suggested test to add

```python
def test_cast_timestamp_with_string_literal(self, transpile_trino_to_oracle):
    """Test CAST to TIMESTAMP converts to TO_TIMESTAMP with explicit format."""
    sql = "SELECT CAST('2025-11-24 14:30:00' AS TIMESTAMP) FROM dual"
    result = transpile_trino_to_oracle(sql)
    # Verify TO_TIMESTAMP with format mask is used
    assert "TO_TIMESTAMP(" in result.upper()
    assert "YYYY-MM-DD HH24:MI:SS" in result
    assert "CAST(" not in result.upper()

def test_cast_date_with_string_literal(self, transpile_trino_to_oracle):
    """Test CAST to DATE converts to TO_DATE with explicit format."""
    sql = "SELECT CAST('2025-11-24' AS DATE) FROM dual"
    result = transpile_trino_to_oracle(sql)
    # Verify TO_DATE with format mask is used
    assert "TO_DATE(" in result.upper()
    assert "YYYY-MM-DD" in result
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
- ibis-server/Dockerfile (3 hunks)
- ibis-server/README.md (1 hunks)
- ibis-server/app/custom_sqlglot/dialects/__init__.py (1 hunks)
- ibis-server/app/custom_sqlglot/dialects/oracle.py (1 hunks)
- ibis-server/app/model/__init__.py (2 hunks)
- ibis-server/app/model/connector.py (4 hunks)
- ibis-server/app/model/data_source.py (1 hunks)
- ibis-server/app/model/metadata/oracle.py (6 hunks)
- ibis-server/app/model/oracle_backend_patch.py (1 hunks)
- ibis-server/app/routers/v2/connector.py (1 hunks)
- ibis-server/docs/oracle_19c_dialect.md (1 hunks)
- ibis-server/pyproject.toml (1 hunks)
- ibis-server/test_default_oracle.py (1 hunks)
- ibis-server/test_interval_support.py (1 hunks)
- ibis-server/test_oracle_views.py (1 hunks)
- ibis-server/test_timestamptz_issue.py (1 hunks)
- ibis-server/tests/custom_sqlglot/conftest.py (1 hunks)
- ibis-server/tests/custom_sqlglot/test_oracle_19c_date_arithmetic.py (1 hunks)
- ibis-server/tests/custom_sqlglot/test_oracle_19c_integration.py (1 hunks)
- ibis-server/tests/custom_sqlglot/test_oracle_19c_type_mapping.py (1 hunks)
- ibis-server/tests/custom_sqlglot/test_oracle_pagination.py (1 hunks)
- wren-core-legacy/docker/Dockerfile.build (1 hunks)
- wren-core-legacy/docker/Dockerfile.multistage (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
ibis-server/tests/custom_sqlglot/test_oracle_19c_date_arithmetic.py (2)
ibis-server/app/custom_sqlglot/dialects/oracle.py (1)
`Oracle` (6-93)

ibis-server/tests/custom_sqlglot/conftest.py (1)
transpile_trino_to_oracle(40-55)
ibis-server/app/model/connector.py (1)
ibis-server/app/model/oracle_backend_patch.py (1)
patch_oracle_backend(105-118)
ibis-server/test_timestamptz_issue.py (1)
ibis-server/app/custom_sqlglot/dialects/oracle.py (1)
Oracle(6-93)
ibis-server/tests/custom_sqlglot/test_oracle_19c_integration.py (2)
ibis-server/app/custom_sqlglot/dialects/oracle.py (1)
`Oracle` (6-93)

ibis-server/tests/custom_sqlglot/conftest.py (3)

`oracle_dialect` (15-17), `transpile_trino_to_oracle` (40-55), `oracle_type_mapping` (27-29)
ibis-server/tests/custom_sqlglot/test_oracle_pagination.py (1)
ibis-server/app/custom_sqlglot/dialects/oracle.py (1)
Oracle(6-93)
🪛 markdownlint-cli2 (0.18.1)
ibis-server/docs/oracle_19c_dialect.md
74-74: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
167-167: Strong style
Expected: asterisk; Actual: underscore
(MD050, strong-style)
167-167: Strong style
Expected: asterisk; Actual: underscore
(MD050, strong-style)
🔇 Additional comments (14)
ibis-server/pyproject.toml (1)
71-74: LGTM! The new pytest markers for Oracle 19c testing are well-defined and follow the existing naming conventions. They provide clear descriptions of their purpose.
ibis-server/Dockerfile (1)
67-71: LGTM! The environment variables for Oracle Instant Client are correctly configured. `LD_LIBRARY_PATH` properly appends to existing values, and `ORACLE_HOME` points to the correct installation directory.
ibis-server/app/custom_sqlglot/dialects/__init__.py (1)
3-4: LGTM - Clean dialect export. The Oracle dialect import follows the existing pattern established by the MySQL import and properly exposes the custom Oracle dialect for use throughout the application.
ibis-server/tests/custom_sqlglot/test_oracle_19c_date_arithmetic.py (1)
1-172: Well-structured test suite with comprehensive coverage. The test module provides excellent coverage of Oracle 19c date arithmetic transformations with:
- Clear test organization and naming
- Good use of pytest markers for selective test execution
- Thorough edge case testing (month-end dates, WHERE clauses, mixed operations)
- Helpful docstrings
The string assertion brittleness noted above is acceptable for dialect testing, though improvements would enhance maintainability.
ibis-server/app/model/connector.py (1)
37-39: Verify module-level patching approach. Applying the Oracle backend patch at module import time creates a global side effect. Ensure this is intentional and won't cause issues when the module is imported in different contexts (e.g., testing, multiple connection types).
Consider whether the patch should be applied lazily when an Oracle connection is first created, or if there's a benefit to applying it globally at startup.
ibis-server/README.md (1)
84-98: Well-structured Oracle 19c documentation section. The new documentation clearly explains:
- Purpose of the custom Oracle 19c dialect
- Link to detailed documentation
- How to run the test suite
Good practice providing both an alias shortcut and the full command.
ibis-server/tests/custom_sqlglot/conftest.py (1)
1-55: Excellent test fixture implementation. This conftest provides well-structured fixtures for Oracle dialect testing:
- Clear separation between dialect, generator, and type mapping fixtures
- Good documentation with usage examples
- The `transpile_trino_to_oracle` helper fixture elegantly encapsulates transpilation logic
- Proper isolation from application-level configuration
The fixtures support comprehensive testing while maintaining clean test code.
ibis-server/tests/custom_sqlglot/test_oracle_pagination.py (1)
13-97: Solid test coverage for Oracle pagination syntax. The test class covers various pagination scenarios including LIMIT, OFFSET, JOINs, WHERE clauses, and edge cases. Good use of descriptive test names and docstrings.
ibis-server/tests/custom_sqlglot/test_oracle_19c_type_mapping.py (2)
111-126: Well-structured parametrized test for Oracle-specific types. Good use of `@pytest.mark.parametrize` to test multiple type mappings with a single test function. The conditional assertion handles cases where types may not be present in the mapping.
16-29: Core type mapping tests look correct. The tests properly verify the BOOLEAN → CHAR(1) mapping for Oracle 19c compatibility and use appropriate pytest markers for categorization.
ibis-server/app/model/oracle_backend_patch.py (1)
105-118: Monkey-patching approach is properly documented and timely. The patch is called at module import time during router initialization, well before any Oracle connections are created. Implementation includes docstring, inline comments, and logging. This approach is justified for fixing the upstream ibis ORA-00923 bug and preserves the original backend reference for potential restoration.
ibis-server/app/custom_sqlglot/dialects/oracle.py (1)
1-42: Well-structured dialect extension with clear documentation. The class design properly extends SQLGlot's Oracle dialect, and the TYPE_MAPPING overrides for BOOLEAN→CHAR(1) and TIMESTAMPTZ→TIMESTAMP are appropriate for Oracle 19c compatibility. The docstrings clearly explain the rationale for each override.
ibis-server/tests/custom_sqlglot/test_oracle_19c_integration.py (2)
1-21: Comprehensive integration test suite with good coverage of date arithmetic. The test class is well-organized with clear docstrings explaining each test scenario. The coverage of date arithmetic in various SQL constructs (subqueries, CTEs, CASE expressions) is thorough.
76-248: Thorough test coverage for complex SQL patterns. The remaining tests effectively validate Oracle 19c compatibility across a range of realistic scenarios including JOINs, aggregations, GROUP BY/HAVING, nested expressions, subqueries, and CTEs. The assertions appropriately verify both the presence of Oracle-compatible syntax and the absence of unsupported constructs.
```diff
 host: SecretStr | None = Field(
     examples=["localhost"], description="the hostname of your database", default=None
 )
-port: SecretStr = Field(examples=[1521], description="the port of your database")
-database: SecretStr = Field(
-    examples=["orcl"], description="the database name of your database"
-)
+port: SecretStr | None = Field(examples=[1521], description="the port of your database", default=None)
+database: SecretStr | None = Field(
+    examples=["orcl"], description="the database name of your database", default=None
+)
 user: SecretStr = Field(
     examples=["admin"], description="the username of your database"
 )
 password: SecretStr | None = Field(
     examples=["password"], description="the password of your database", default=None
 )
 dsn: SecretStr | None = Field(
     examples=["(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=host)(PORT=port))(CONNECT_DATA=(SERVICE_NAME=service)))"],
     description="Oracle Data Source Name (DSN) - Alternative to host/port/database configuration",
     default=None
 )
```
Missing validation: either DSN or host/port/database must be provided.
The current model allows all connection parameters (dsn, host, port, database) to be None simultaneously, which would cause AttributeError when accessing .get_secret_value() on None in data_source.py.
Consider adding a Pydantic model_validator to ensure at least one valid connection method is configured:
🔎 Proposed fix using Pydantic model validator
```diff
+from pydantic import model_validator

 class OracleConnectionInfo(BaseConnectionInfo):
     host: SecretStr | None = Field(
         examples=["localhost"], description="the hostname of your database", default=None
     )
     port: SecretStr | None = Field(examples=[1521], description="the port of your database", default=None)
     database: SecretStr | None = Field(
         examples=["orcl"], description="the database name of your database", default=None
     )
     user: SecretStr = Field(
         examples=["admin"], description="the username of your database"
     )
     password: SecretStr | None = Field(
         examples=["password"], description="the password of your database", default=None
     )
     dsn: SecretStr | None = Field(
         examples=["(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=host)(PORT=port))(CONNECT_DATA=(SERVICE_NAME=service)))"],
         description="Oracle Data Source Name (DSN) - Alternative to host/port/database configuration",
         default=None
     )
+
+    @model_validator(mode='after')
+    def validate_connection_params(self):
+        if not self.dsn and not (self.host and self.port and self.database):
+            raise ValueError("Either 'dsn' or all of 'host', 'port', and 'database' must be provided")
+        return self
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In ibis-server/app/model/__init__.py around lines 206 to 223, the model
currently allows dsn, host, port and database to all be None which later causes
AttributeError when calling .get_secret_value(); add a Pydantic model_validator
(root/model-level validator) that checks: if dsn is not provided (None or empty)
then host, port and database must all be provided and non-null (and optionally
non-empty); if the check fails raise a ValueError with a clear message like
"Either dsn or host, port and database must be provided". Ensure the validator
runs after field parsing and returns the validated values unchanged when valid.
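The invariant is easy to exercise in isolation. A minimal stand-in (a plain dataclass with ordinary strings instead of the Pydantic model with `SecretStr` — the class is illustrative, though the field names mirror the model above) performs the same check the proposed `model_validator` would:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class OracleConnectionInfoSketch:
    user: str
    host: Optional[str] = None
    port: Optional[int] = None
    database: Optional[str] = None
    password: Optional[str] = None
    dsn: Optional[str] = None

    def __post_init__(self):
        # Same rule as the proposed @model_validator(mode="after"):
        # accept either a DSN, or the full host/port/database triple.
        if not self.dsn and not (self.host and self.port and self.database):
            raise ValueError(
                "Either 'dsn' or all of 'host', 'port', and 'database' must be provided"
            )
```

Constructing the object with only `user` set fails fast at creation time, which is exactly where the `AttributeError` in `data_source.py` would otherwise surface much later.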
```python
import sys
print(f"🔍 CONNECTOR QUERY: {sql}", file=sys.stderr, flush=True)
print(f"🔍 LIMIT: {limit}", file=sys.stderr, flush=True)
logger.info(f"🔍 CONNECTOR QUERY: {sql}")
logger.info(f"🔍 LIMIT: {limit}")
```
Remove temporary debug logging before merging.
These debug print statements appear to be temporary debugging code that should be removed before production:
- Redundant logging (both `print` to stderr AND `logger.info`)
- Uses emoji (🔍) which may not render correctly in all environments
- Verbose logging for every query could impact performance and clutter logs
Suggested cleanup
```diff
 @tracer.start_as_current_span("connector_query", kind=trace.SpanKind.CLIENT)
 def query(self, sql: str, limit: int) -> pd.DataFrame:
-    import sys
-    print(f"🔍 CONNECTOR QUERY: {sql}", file=sys.stderr, flush=True)
-    print(f"🔍 LIMIT: {limit}", file=sys.stderr, flush=True)
-    logger.info(f"🔍 CONNECTOR QUERY: {sql}")
-    logger.info(f"🔍 LIMIT: {limit}")
+    logger.debug(f"Executing query with limit {limit}")
     return self.connection.sql(sql).limit(limit).to_pandas()
```

If query logging is needed for debugging, consider:
return self.connection.sql(sql).limit(limit).to_pandas()If query logging is needed for debugging, consider:
- Using
logger.debug()instead oflogger.info()to avoid cluttering production logs - Removing the emoji
- Removing redundant stderr prints
- Only logging the SQL at debug level if truly needed
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-import sys
-print(f"🔍 CONNECTOR QUERY: {sql}", file=sys.stderr, flush=True)
-print(f"🔍 LIMIT: {limit}", file=sys.stderr, flush=True)
-logger.info(f"🔍 CONNECTOR QUERY: {sql}")
-logger.info(f"🔍 LIMIT: {limit}")
+logger.debug(f"Executing query with limit {limit}")
```
🤖 Prompt for AI Agents
In ibis-server/app/model/connector.py around lines 80 to 84, remove the
temporary debug prints to stderr and the emoji-laden info logs; replace them
with an optional debug-level logger call without emojis and without duplicate
printing. Specifically, delete the two print(..., file=sys.stderr) lines, change
logger.info to logger.debug, remove the emoji from the message, and only log the
SQL and limit at debug level (or remove entirely if not needed) so production
logs aren’t cluttered and performance isn’t impacted.
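A sketch of the cleaned-up method (names are illustrative, and the return value stands in for the real DataFrame) — note that passing the SQL as a %-style argument also defers string formatting until DEBUG is actually enabled, which an f-string cannot do:

```python
import logging

logger = logging.getLogger("connector")

def query(sql: str, limit: int) -> str:
    # DEBUG-level, no emoji, single sink; the message is only
    # formatted when the DEBUG level is enabled on this logger.
    logger.debug("Executing query with limit %d: %s", limit, sql)
    return f"result:{limit}"  # stand-in for the real to_pandas() result
```

In production, where the root level is typically INFO, the call above costs only a level check per query.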
```python
else:
    # Otherwise use individual host/port/database parameters
    connect_params.update({
        "host": info.host.get_secret_value(),
        "port": int(info.port.get_secret_value()),
        "database": info.database.get_secret_value(),
    })
```
Potential AttributeError when host, port, or database are None.
When dsn is not provided, the code assumes host, port, and database are valid SecretStr objects. However, since these fields are now optional (SecretStr | None) in OracleConnectionInfo, calling .get_secret_value() on None will raise an AttributeError.
🔎 Proposed fix with explicit validation
```diff
 # If DSN is provided, use it directly
 if info.dsn:
     connect_params["dsn"] = info.dsn.get_secret_value()
 else:
     # Otherwise use individual host/port/database parameters
+    if not info.host or not info.port or not info.database:
+        raise ValueError(
+            "Either 'dsn' or all of 'host', 'port', and 'database' must be provided"
+        )
     connect_params.update({
         "host": info.host.get_secret_value(),
         "port": int(info.port.get_secret_value()),
         "database": info.database.get_secret_value(),
     })
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
 else:
     # Otherwise use individual host/port/database parameters
+    if not info.host or not info.port or not info.database:
+        raise ValueError(
+            "Either 'dsn' or all of 'host', 'port', and 'database' must be provided"
+        )
     connect_params.update({
         "host": info.host.get_secret_value(),
         "port": int(info.port.get_secret_value()),
         "database": info.database.get_secret_value(),
     })
```
🤖 Prompt for AI Agents
In ibis-server/app/model/data_source.py around lines 193 to 199, the code calls
.get_secret_value() on info.host/info.port/info.database but those fields are
now Optional (SecretStr | None) and can be None; update the branch that builds
connect_params to first validate that host, port, and database are not None,
raising a clear ValueError (or custom exception) if any are missing, then call
.get_secret_value() and convert port to int; ensure the error message names the
missing field(s) so callers know which parameter is absent.
```diff
 sql = f"""
 SELECT
-    t.owner AS TABLE_CATALOG,
-    t.owner AS TABLE_SCHEMA,
-    t.table_name AS TABLE_NAME,
+    v.owner AS TABLE_CATALOG,
+    v.owner AS TABLE_SCHEMA,
+    v.view_name AS TABLE_NAME,
     c.column_name AS COLUMN_NAME,
     c.data_type AS DATA_TYPE,
     c.nullable AS IS_NULLABLE,
     c.column_id AS ORDINAL_POSITION,
     tc.comments AS TABLE_COMMENT,
     cc.comments AS COLUMN_COMMENT
 FROM
-    all_tables t
+    all_views v
 JOIN
     all_tab_columns c
-    ON t.owner = c.owner
-    AND t.table_name = c.table_name
+    ON v.owner = c.owner
+    AND v.view_name = c.table_name
 LEFT JOIN
     all_tab_comments tc
-    ON tc.owner = t.owner
-    AND tc.table_name = t.table_name
+    ON tc.owner = v.owner
+    AND tc.table_name = v.view_name
 LEFT JOIN
     all_col_comments cc
     ON cc.owner = c.owner
     AND cc.table_name = c.table_name
     AND cc.column_name = c.column_name
 WHERE
-    t.owner = 'SYSTEM'
+    v.owner = '{user}'
 ORDER BY
-    t.table_name, c.column_id;
+    v.view_name, c.column_id;
 """
```
SQL injection risk with direct string interpolation.
The user variable is interpolated directly into the SQL query without sanitization. While this value comes from connection configuration, it still poses a SQL injection risk if the configuration is compromised or contains special characters.
Consider sanitizing the user value or using Oracle identifier quoting:
🔎 Proposed fix
```diff
 # Get dynamic user from connection info (not hardcoded 'SYSTEM')
-user = self.connection_info.user.get_secret_value()
+user = self.connection_info.user.get_secret_value().upper()
+# Validate user contains only safe characters (alphanumeric, underscore)
+import re
+if not re.match(r'^[A-Z_][A-Z0-9_$#]*$', user):
+    raise ValueError(f"Invalid Oracle user identifier: {user}")
 sql = f"""
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In ibis-server/app/model/metadata/oracle.py around lines 40 to 70, the SQL
builds v.owner = '{user}' by direct string interpolation which creates a SQL
injection risk; change this to use a bound parameter or strict validation:
replace the inline '{user}' with a bind placeholder (e.g. :user) and pass the
connection bind params (user value) when executing the query, or if a bind is
not possible for this context, validate/normalize the user against a safe
pattern (e.g. /^[A-Z0-9_]+$/ and upper-case it) before interpolation to ensure
only valid Oracle identifier characters are used. Ensure any change keeps the
same behavior (owner comparison) and remove raw string interpolation.
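The validate-then-interpolate fallback can be sketched as follows (function names are hypothetical; the regex approximates Oracle's rules for unquoted identifiers):

```python
import re

# Unquoted Oracle identifiers: a letter first, then letters, digits, _, $, #.
_ORACLE_IDENT = re.compile(r"^[A-Z][A-Z0-9_$#]*$")

def safe_owner(user: str) -> str:
    """Normalize and validate a schema owner before it is interpolated
    into a data-dictionary query (used only when binds are unavailable)."""
    owner = user.upper()
    if not _ORACLE_IDENT.match(owner):
        raise ValueError(f"Invalid Oracle identifier: {user!r}")
    return owner

def views_sql(user: str) -> str:
    # Interpolation is acceptable here only because safe_owner()
    # has already rejected anything containing quotes or semicolons.
    return f"SELECT view_name FROM all_views WHERE owner = '{safe_owner(user)}'"
```

Bind parameters remain the preferred route where the execution path supports them; the allowlist is the defensive fallback for dictionary queries built as strings.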
```diff
 def get_constraints(self) -> List[Constraint]:
     """
-    res = (
-        self.connection.sql(sql, schema=schema)
-        .to_pandas()
-        .to_dict(orient="records")
-    )
-    constraints = []
-    for row in res:
-        constraints.append(
-            Constraint(
-                constraintName=self._format_constraint_name(
-                    row["TABLE_NAME"],
-                    row["COLUMN_NAME"],
-                    row["REFERENCED_TABLE_NAME"],
-                    row["REFERENCED_COLUMN_NAME"],
-                ),
-                constraintTable=self._format_compact_table_name(
-                    row["TABLE_SCHEMA"], row["TABLE_NAME"]
-                ),
-                constraintColumn=row["COLUMN_NAME"],
-                constraintedTable=self._format_compact_table_name(
-                    row["REFERENCED_TABLE_SCHEMA"], row["REFERENCED_TABLE_NAME"]
-                ),
-                constraintedColumn=row["REFERENCED_COLUMN_NAME"],
-                constraintType=ConstraintType.FOREIGN_KEY,
-            )
-        )
-    return constraints
+    Auto-detect view-to-view relationships using column naming patterns.
+
+    TEMPORARILY DISABLED: Returning empty list to unblock development.
+    The column query against all_tab_columns appears to hang on large view sets.
+
+    TODO: Implement alternative approach:
+    - Use table metadata already retrieved in get_table_list()
+    - Cache column information to avoid repeated queries
+    - Or implement manual relationship definition UI
+
+    Returns:
+        Empty list (relationships disabled for now)
     """
+    return []
+
+    # ORIGINAL CODE COMMENTED OUT - WAS HANGING ON COLUMN QUERY
+    # constraints = []
+    # try:
+    #     print(f"🔍 Starting constraint detection for user {user}...")
+    #
+    #     # Step 1: Get list of views first (fast query)
+    #     views_schema = ibis.schema({"VIEW_NAME": "string"})
+    #     views_sql = f"""
+    #         SELECT view_name AS VIEW_NAME
+    #         FROM all_views
+    #         WHERE owner = '{user}'
+    #     """
+    #     views_df = self.connection.sql(views_sql, schema=views_schema).to_pandas()
+    #     view_list = views_df['VIEW_NAME'].tolist()
+    #
+    #     print(f"🔍 Found {len(view_list)} views")
+    #
+    #     if not view_list:
+    #         return []
+    #
+    #     # Step 2: Get columns for those views (single query with explicit list)
+    #     columns_schema = ibis.schema({
+    #         "TABLE_NAME": "string",
+    #         "COLUMN_NAME": "string",
+    #         "COLUMN_ID": "int64"
+    #     })
+    #
+    #     # Use IN clause with explicit list instead of subquery
+    #     view_list_str = "', '".join(view_list)
+    #     columns_sql = f"""
+    #         SELECT
+    #             table_name AS TABLE_NAME,
+    #             column_name AS COLUMN_NAME,
+    #             column_id AS COLUMN_ID
+    #         FROM all_tab_columns
+    #         WHERE owner = '{user}'
+    #         AND table_name IN ('{view_list_str}')
+    #         ORDER BY table_name, column_id
+    #     """
+    #
+    #     print(f"🔍 Querying columns for {len(view_list)} views...")
+    #     columns_df = self.connection.sql(columns_sql, schema=columns_schema).to_pandas()
+    #     print(f"🔍 Retrieved {len(columns_df)} total columns")
+    #
+    #     if columns_df.empty:
+    #         return []
+    #
+    #     # Build index: view_name -> list of column names
+    #     view_columns = {}
+    #     for _, row in columns_df.iterrows():
+    #         view = row['TABLE_NAME']
+    #         col = row['COLUMN_NAME']
+    #         if view not in view_columns:
+    #             view_columns[view] = []
+    #         view_columns[view].append(col)
+    #
+    #     # Identify primary key columns per view
+    #     # Pattern: column name ends with "Primary Key"
+    #     # Build entity -> views map (which views have this entity as PK)
+    #     entity_to_views = {}  # entity name -> list of (view, pk_column)
+    #     view_primary_keys = {}  # view -> list of PK columns
+    #
+    #     print(f"🔍 Analyzing {len(view_columns)} views for Primary Key columns...")
+    #
+    #     for view, columns in view_columns.items():
+    #         pks = [col for col in columns if col.endswith('Primary Key')]
+    #         if pks:
+    #             view_primary_keys[view] = pks
+    #             # Map each entity to the views that have it as PK
+    #             for pk in pks:
+    #                 entity = pk.replace(' Primary Key', '')
+    #                 if entity not in entity_to_views:
+    #                     entity_to_views[entity] = []
+    #                 entity_to_views[entity].append((view, pk))
+    #
+    #     print(f"🔍 Found {len(view_primary_keys)} views with Primary Key columns")
+    #     print(f"🔍 Identified {len(entity_to_views)} distinct entities: {list(entity_to_views.keys())[:10]}...")
+    #
+    #     # Detect relationships by matching column names
+    #     for source_view, source_cols in view_columns.items():
+    #         # Find foreign key candidates (columns ending with "Primary Key")
+    #         fk_candidates = [col for col in source_cols if col.endswith('Primary Key')]
+    #
+    #         for fk_col in fk_candidates:
+    #             # Extract entity name from FK column
+    #             entity = fk_col.replace(' Primary Key', '')
+    #
+    #             # Find views that have this entity as their primary key
+    #             if entity in entity_to_views:
+    #                 for target_view, target_pk_col in entity_to_views[entity]:
+    #                     if source_view == target_view:
+    #                         # Skip self-references (e.g., Part Primary Key in RT Parts table itself)
+    #                         continue
+    #
+    #                     # Create relationship!
+    #                     constraint_name = f"FK_{source_view}_{target_view}_{entity}".replace(" ", "_")
+    #
+    #                     constraints.append(
+    #                         Constraint(
+    #                             constraintName=constraint_name[:128],  # Oracle name limit
+    #                             constraintTable=self._format_compact_table_name(user, source_view),
+    #                             constraintColumn=fk_col,
+    #                             constraintedTable=self._format_compact_table_name(user, target_view),
+    #                             constraintedColumn=target_pk_col,
+    #                             constraintType=ConstraintType.FOREIGN_KEY,
+    #                         )
+    #                     )
+    #
+    #     print(f"✅ Detected {len(constraints)} view relationships using column naming patterns")
+    #     if constraints:
+    #         print(f"📊 Sample relationships:")
+    #         for c in constraints[:5]:
+    #             print(f"    - {c.constraintTable}.{c.constraintColumn} -> {c.constraintedTable}.{c.constraintedColumn}")
+    #     return constraints
+    #
+    # except Exception as e:
+    #     # If auto-detection fails, return empty list rather than breaking WrenAI
+    #     # This ensures the system remains functional even with permission issues
+    #     print(f"Warning: Could not auto-detect view relationships: {e}")
+    #     return []
```
🛠️ Refactor suggestion | 🟠 Major
Remove large block of commented-out code.
The commented-out implementation spans ~100 lines and creates maintenance burden. Consider removing it and tracking the TODO in a GitHub issue for proper visibility and tracking.
🔎 Proposed refactor
```diff
 def get_constraints(self) -> list[Constraint]:
     """
-    Auto-detect view-to-view relationships using column naming patterns.
-
-    TEMPORARILY DISABLED: Returning empty list to unblock development.
-    The column query against all_tab_columns appears to hang on large view sets.
-
-    TODO: Implement alternative approach:
-    - Use table metadata already retrieved in get_table_list()
-    - Cache column information to avoid repeated queries
-    - Or implement manual relationship definition UI
-
-    Returns:
-        Empty list (relationships disabled for now)
+    Return empty list - constraint auto-detection temporarily disabled.
+
+    See: https://github.com/Canner/wren-engine/issues/XXX
     """
     return []
-
-    # ORIGINAL CODE COMMENTED OUT - WAS HANGING ON COLUMN QUERY
-    # ... (remove all ~100 lines of commented code)
```

Would you like me to open a GitHub issue to track the constraint auto-detection feature?
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In ibis-server/app/model/metadata/oracle.py around lines 127-263 there is a
~100-line commented-out implementation of get_constraints that should be removed
to reduce maintenance clutter; delete the entire commented block (everything
after the current early return) and replace it with a single-line TODO comment
that references a new GitHub issue tracking re-implementation (create the issue
first, then put its number or URL in the TODO), update the method docstring to
mention "see TODO/issue #<id>" for follow-up, and add a concise git commit
message like "chore: remove large commented-out get_constraints implementation;
track work in issue #<id>".
```python
print("📋 Test 4: Verify views-only (no tables)")
print("-" * 80)
# Check if any discovered objects are tables (they shouldn't be)
view_count = len(views)
print(f"✅ All {view_count} objects are views")
print(f"   (No tables discovered - views-only architecture confirmed)")
print()
```
Test 4 doesn't actually verify views-only behavior.
The test claims to verify that "all objects are views (no tables)" but it only checks the count of views returned from get_table_list(). There's no actual validation that these are views and not tables.
🔎 Consider adding actual verification
```diff
 print("📋 Test 4: Verify views-only (no tables)")
 print("-" * 80)
-# Check if any discovered objects are tables (they shouldn't be)
-view_count = len(views)
-print(f"✅ All {view_count} objects are views")
-print(f"   (No tables discovered - views-only architecture confirmed)")
+# Verify returned objects have view-like properties or query the database
+# to confirm object types if metadata is available
+view_count = len(views)
+# TODO: Add actual verification that objects are views, not tables
+print(f"✅ Discovered {view_count} objects (assuming views based on query source)")
+print(f"   (Views-only query used in get_table_list)")
 print()
```

🤖 Prompt for AI Agents
In ibis-server/test_oracle_views.py around lines 87 to 93, the test only checks
the length of views but does not assert that each discovered object is actually
a VIEW rather than a TABLE; update the test to iterate the returned objects and
validate their type: either use the metadata returned by get_table_list (e.g., a
'type' or 'kind' field) and assert it equals 'VIEW' for every entry, or call the
DB inspection API/SQL (e.g., SQLAlchemy inspector.get_view_names() or querying
information_schema/tables WHERE table_type='VIEW' for Oracle-equivalent) and
fail the test if any table is found; adjust the printed messages to reflect the
actual verification and raise an assertion/error when a non-view is detected.
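A concrete form of that assertion, sketched against a generic metadata shape (the `type` field is an assumption about what `get_table_list` or a query over `all_objects` would expose; the helper name is hypothetical):

```python
def assert_views_only(objects):
    """Fail if any discovered object is not a VIEW.

    `objects` is assumed to be a list of dicts with 'name' and 'type'
    keys, e.g. rows from SELECT object_name, object_type FROM all_objects.
    """
    # Collect anything whose declared type is not VIEW.
    non_views = [o["name"] for o in objects if o.get("type") != "VIEW"]
    assert not non_views, f"Non-view objects discovered: {non_views}"
```

With this in place, the test actually fails when a table leaks into discovery instead of merely printing a success banner.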
```dockerfile
RUN \
    apt update && \
    apt -y install curl gpg lsb-release && \
    curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg && \
    echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list && \
    apt update && \
    apt -y install postgresql-client-13
```
Missing Oracle Instant Client installation for Oracle ADB integration.
This PR introduces Oracle 19c database support, but neither the builder nor runtime stages install the Oracle Instant Client libraries required for Oracle connectivity. The AI summary explicitly mentions "Docker environment updates with Oracle Instant Client" as a key feature. Without these libraries, the application cannot connect to Oracle databases.
🔎 Proposed fix to install Oracle Instant Client
```diff
 RUN \
     apt update && \
     apt -y install curl gpg lsb-release && \
     curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg && \
     echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list && \
     apt update && \
-    apt -y install postgresql-client-13
+    apt -y install postgresql-client-13 && \
+    apt -y install wget unzip libaio1 && \
+    wget https://download.oracle.com/otn_software/linux/instantclient/1923000/instantclient-basic-linux.x64-19.23.0.0.0dbru.zip -O /tmp/oracle.zip && \
+    unzip /tmp/oracle.zip -d /opt/oracle && \
+    rm /tmp/oracle.zip && \
+    sh -c "echo /opt/oracle/instantclient_19_23 > /etc/ld.so.conf.d/oracle-instantclient.conf" && \
+    ldconfig
```

Alternatively, you may need to set environment variables:

```dockerfile
ENV LD_LIBRARY_PATH=/opt/oracle/instantclient_19_23:$LD_LIBRARY_PATH
ENV PATH=/opt/oracle/instantclient_19_23:$PATH
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
 RUN \
     apt update && \
     apt -y install curl gpg lsb-release && \
     curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg && \
     echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list && \
     apt update && \
-    apt -y install postgresql-client-13
+    apt -y install postgresql-client-13 && \
+    apt -y install wget unzip libaio1 && \
+    wget https://download.oracle.com/otn_software/linux/instantclient/1923000/instantclient-basic-linux.x64-19.23.0.0.0dbru.zip -O /tmp/oracle.zip && \
+    unzip /tmp/oracle.zip -d /opt/oracle && \
+    rm /tmp/oracle.zip && \
+    sh -c "echo /opt/oracle/instantclient_19_23 > /etc/ld.so.conf.d/oracle-instantclient.conf" && \
+    ldconfig
```
```dockerfile
COPY ../wren-core-legacy/.mvn .mvn
COPY ../wren-core-legacy/mvnw .
COPY ../wren-core-legacy/pom.xml .
COPY ../wren-core-legacy/wren-base ./wren-base
COPY ../wren-core-legacy/wren-main ./wren-main
COPY ../wren-core-legacy/wren-server ./wren-server
COPY ../wren-core-legacy/wren-tests ./wren-tests
COPY ../wren-core-legacy/trino-parser ./trino-parser
```
COPY paths assume incorrect parent directory context.
The COPY commands reference ../wren-core-legacy/, which assumes the Docker build context is set to the parent directory of wren-core-legacy. However, based on standard Docker practices and the location of this Dockerfile, the build context is likely set to the wren-core-legacy directory itself, which would cause all these COPY operations to fail.
🔎 Proposed fix to correct COPY paths
```diff
 # Copy Maven wrapper and pom files
-COPY ../wren-core-legacy/.mvn .mvn
-COPY ../wren-core-legacy/mvnw .
-COPY ../wren-core-legacy/pom.xml .
-COPY ../wren-core-legacy/wren-base ./wren-base
-COPY ../wren-core-legacy/wren-main ./wren-main
-COPY ../wren-core-legacy/wren-server ./wren-server
-COPY ../wren-core-legacy/wren-tests ./wren-tests
-COPY ../wren-core-legacy/trino-parser ./trino-parser
+COPY .mvn .mvn
+COPY mvnw .
+COPY pom.xml .
+COPY wren-base ./wren-base
+COPY wren-main ./wren-main
+COPY wren-server ./wren-server
+COPY wren-tests ./wren-tests
+COPY trino-parser ./trino-parser
```
+COPY trino-parser ./trino-parser📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-COPY ../wren-core-legacy/.mvn .mvn
-COPY ../wren-core-legacy/mvnw .
-COPY ../wren-core-legacy/pom.xml .
-COPY ../wren-core-legacy/wren-base ./wren-base
-COPY ../wren-core-legacy/wren-main ./wren-main
-COPY ../wren-core-legacy/wren-server ./wren-server
-COPY ../wren-core-legacy/wren-tests ./wren-tests
-COPY ../wren-core-legacy/trino-parser ./trino-parser
+COPY .mvn .mvn
+COPY mvnw .
+COPY pom.xml .
+COPY wren-base ./wren-base
+COPY wren-main ./wren-main
+COPY wren-server ./wren-server
+COPY wren-tests ./wren-tests
+COPY trino-parser ./trino-parser
```
🤖 Prompt for AI Agents
In wren-core-legacy/docker/Dockerfile.multistage around lines 7 to 14, the COPY
instructions use ../wren-core-legacy/* which assumes build context is the parent
directory; update each COPY to use paths relative to the current build context
(remove the leading ../wren-core-legacy/), e.g. COPY .mvn .mvn, COPY mvnw .,
COPY pom.xml ., and COPY the subdirectories directly (wren-base, wren-main,
wren-server, wren-tests, trino-parser) so the files are found during docker
build; if you intentionally build from the parent, alternatively adjust the
docker build context/command to point to the parent directory instead of
changing the Dockerfile.
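The rule behind this finding can be illustrated with a small sketch. This is not Docker's actual implementation, just a simplified model of its constraint that a COPY source may never escape the build context; the function name is hypothetical.

```python
import posixpath

def copy_src_allowed(src: str) -> bool:
    """Simplified model of Docker's rule that a COPY source must stay
    inside the build context; real Docker fails such paths with
    'forbidden path outside the build context'."""
    norm = posixpath.normpath(src)
    # Absolute paths and anything that climbs above the context root
    # are rejected.
    return not posixpath.isabs(norm) and norm != ".." and not norm.startswith("../")

# With the build context set to wren-core-legacy itself:
print(copy_src_allowed("../wren-core-legacy/.mvn"))  # False - escapes the context
print(copy_src_allowed(".mvn"))                      # True  - the proposed fix
```

This is why the fix above simply strips the `../wren-core-legacy/` prefix rather than trying to make the parent path resolve.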
```dockerfile
RUN \
    apt update && \
    apt -y install curl gpg lsb-release && \
    curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg && \
    echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list && \
    apt update && \
    apt -y install postgresql-client-13
```
Missing Oracle Instant Client installation for Oracle ADB integration.
This PR introduces Oracle 19c database support, but neither the builder nor runtime stages install the Oracle Instant Client libraries required for Oracle connectivity. The AI summary explicitly mentions "Docker environment updates with Oracle Instant Client" as a key feature. Without these libraries, the application cannot connect to Oracle databases.
🔎 Proposed fix to install Oracle Instant Client

```diff
 RUN \
   apt update && \
   apt -y install curl gpg lsb-release && \
   curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg && \
   echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list && \
   apt update && \
-  apt -y install postgresql-client-13
+  apt -y install postgresql-client-13 && \
+  apt -y install wget unzip libaio1 && \
+  wget https://download.oracle.com/otn_software/linux/instantclient/1923000/instantclient-basic-linux.x64-19.23.0.0.0dbru.zip -O /tmp/oracle.zip && \
+  unzip /tmp/oracle.zip -d /opt/oracle && \
+  rm /tmp/oracle.zip && \
+  sh -c "echo /opt/oracle/instantclient_19_23 > /etc/ld.so.conf.d/oracle-instantclient.conf" && \
+  ldconfig
```

Alternatively, you may need to set environment variables:

```dockerfile
ENV LD_LIBRARY_PATH=/opt/oracle/instantclient_19_23:$LD_LIBRARY_PATH
ENV PATH=/opt/oracle/instantclient_19_23:$PATH
```

🤖 Prompt for AI Agents
In wren-core-legacy/docker/Dockerfile.multistage around lines 28-34, the
Dockerfile installs PostgreSQL client but omits Oracle Instant Client required
for Oracle ADB; add steps in both the builder and runtime stages to download and
install the appropriate Oracle Instant Client RPMs or ZIPs (e.g., basic and sdk
for 19c/19.23), install required dependencies (libaio, unzip or alien if
converting RPMs), extract or install them into /opt/oracle/instantclient_19_xx,
set ENV LD_LIBRARY_PATH and PATH to include that directory, and ensure file
permissions and symlinks (libclntsh.so) are correct so the app can load the
Oracle libraries at runtime.
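The alternative ENV lines above amount to prepending the Instant Client directory to two search paths. A minimal sketch of that transformation, assuming the directory name matches the 19.23 zip layout (the helper itself is hypothetical, not part of the codebase):

```python
def with_instant_client(env: dict,
                        ic_dir: str = "/opt/oracle/instantclient_19_23") -> dict:
    """Return a copy of `env` with the Instant Client directory prepended
    to LD_LIBRARY_PATH and PATH, mirroring the suggested ENV lines.
    LD_LIBRARY_PATH entries are always ':'-separated on Linux."""
    out = dict(env)
    for var in ("LD_LIBRARY_PATH", "PATH"):
        prev = out.get(var)
        out[var] = ic_dir if not prev else ic_dir + ":" + prev
    return out

env = with_instant_client({"PATH": "/usr/bin"})
print(env["PATH"])             # /opt/oracle/instantclient_19_23:/usr/bin
print(env["LD_LIBRARY_PATH"])  # /opt/oracle/instantclient_19_23
```

Prepending (rather than appending) matters here: it ensures the Instant Client's `libclntsh.so` wins over any stale copy earlier in the search path.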
```dockerfile
# Copy the built jar from builder stage
COPY --from=builder /wren-server-executable.jar ./wren-server-executable.jar

COPY entrypoint.sh ./
```
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Find all entrypoint.sh files in the repository
echo "=== Finding entrypoint.sh files ==="
fd -t f "entrypoint.sh" wren-core-legacy/ 2>/dev/null || echo "No entrypoint.sh found in wren-core-legacy/"

# List docker files in wren-core-legacy/docker/
echo -e "\n=== Docker files in wren-core-legacy/docker/ ==="
fd -t f "Dockerfile" wren-core-legacy/docker/ 2>/dev/null || echo "No Dockerfile found"

# Check if Dockerfile.multistage exists
echo -e "\n=== Checking Dockerfile.multistage ==="
if [ -f "wren-core-legacy/docker/Dockerfile.multistage" ]; then
  echo "File exists"
  wc -l "wren-core-legacy/docker/Dockerfile.multistage"
else
  echo "File does not exist"
fi

# Check if Dockerfile.build exists
echo -e "\n=== Checking Dockerfile.build ==="
if [ -f "wren-core-legacy/docker/Dockerfile.build" ]; then
  echo "File exists"
  wc -l "wren-core-legacy/docker/Dockerfile.build"
else
  echo "File does not exist"
fi
```

Repository: Canner/wren-engine

Length of output: 499
🏁 Script executed:

```shell
#!/bin/bash
echo "=== Dockerfile.multistage (full content with line numbers) ==="
cat -n wren-core-legacy/docker/Dockerfile.multistage
echo -e "\n=== Dockerfile.build (full content with line numbers) ==="
cat -n wren-core-legacy/docker/Dockerfile.build
```

Repository: Canner/wren-engine

Length of output: 3752
🏁 Script executed:

```shell
#!/bin/bash
# Search for docker build commands or references to these Dockerfiles
echo "=== Searching for docker build commands ==="
fd -t f "\(Makefile\|\.yaml\|\.yml\|docker-compose\|\.sh\)" wren-core-legacy/ | head -20 | while read file; do
  if grep -l "Dockerfile.multistage\|Dockerfile.build\|docker build" "$file" 2>/dev/null; then
    echo "Found in: $file"
  fi
done

# Check root directory
echo -e "\n=== Searching in root directory ==="
fd -t f "\(Makefile\|docker-compose\|\.github\)" . -d 2 | while read file; do
  if grep -l "Dockerfile.multistage\|Dockerfile.build\|docker build" "$file" 2>/dev/null; then
    echo "Found in: $file"
  fi
done

# Check for GitHub Actions workflows
echo -e "\n=== GitHub Actions workflows ==="
fd -t f "\.yml\|\.yaml" .github/workflows/ 2>/dev/null | while read file; do
  if grep -l "docker\|Dockerfile" "$file" 2>/dev/null; then
    echo "Found in: $file"
  fi
done
```

Repository: Canner/wren-engine

Length of output: 176
Fix entrypoint script path in Dockerfile.multistage (line 39).
The path entrypoint.sh is incorrect. Based on the build context pattern used in earlier COPY commands (e.g., ../wren-core-legacy/.mvn), the correct path should be ../wren-core-legacy/docker/entrypoint.sh or wren-core-legacy/docker/entrypoint.sh. The current path would cause the Docker build to fail with "file not found" since the entrypoint script is located in wren-core-legacy/docker/entrypoint.sh, not at the root of the build context.
🤖 Prompt for AI Agents
In wren-core-legacy/docker/Dockerfile.multistage around line 39, the COPY uses
an incorrect relative path "entrypoint.sh" which causes Docker build to fail
because the script lives in wren-core-legacy/docker; update the COPY source to
the correct path (e.g., ../wren-core-legacy/docker/entrypoint.sh or
wren-core-legacy/docker/entrypoint.sh to match the earlier COPY patterns) so the
file is present in the build context and copied into the image.
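Both this finding and the earlier COPY-path issue come down to resolving a COPY source against the chosen build context. A small sketch, assuming a hypothetical flat listing of the context's files (the set below mirrors the repo layout described in the review, not an actual directory scan):

```python
import posixpath

def copy_resolves(context_files: set[str], src: str) -> bool:
    """True if `COPY <src>` would find a file in the given build-context
    listing. A simplified model: Docker also rejects any source that
    escapes the context before it even looks for the file."""
    norm = posixpath.normpath(src)
    if posixpath.isabs(norm) or norm == ".." or norm.startswith("../"):
        return False  # forbidden path outside the build context
    return norm in context_files

# Context = repository root, per the layout described in the review:
root_ctx = {"wren-core-legacy/docker/entrypoint.sh", "wren-core-legacy/pom.xml"}
print(copy_resolves(root_ctx, "entrypoint.sh"))                          # False - current path
print(copy_resolves(root_ctx, "wren-core-legacy/docker/entrypoint.sh"))  # True  - corrected path
```

Whichever context is chosen, the fix is the same in spirit: make every COPY source spell out the path as seen from that context's root.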