
Oracle adb integration #1392

Closed
jsatlien wants to merge 21 commits into Canner:main from RepairQ:oracle-adb-integration

Conversation


@jsatlien jsatlien commented Dec 19, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for Oracle 19c database connections and metadata discovery
    • Implemented custom SQLGlot dialect for Oracle 19c SQL compatibility
    • Added DSN support for Oracle connection configuration
  • Bug Fixes

    • Fixed schema query errors in Oracle metadata retrieval
  • Documentation

    • Added Oracle 19c compatibility guide with testing instructions
  • Tests

    • Added comprehensive test suite covering Oracle dialect transformations, date arithmetic, pagination, and type mapping
  • Chores

    • Updated Docker base images to Python 3.11-bookworm
    • Added Oracle Instant Client runtime dependencies to container environment


VIEWS-ONLY ARCHITECTURE:
- Remove table support - discover Oracle VIEWS exclusively
- Optimized for view-based reporting databases
- Aligns with business model where tables are internal

QUICK WINS (Phase 4.2):
1. Dynamic schema owner (fixes hardcoded 'SYSTEM')
   - Extract user from connection_info dynamically
   - Supports any Oracle schema, not just SYSTEM

2. Views-only discovery (simplified from UNION ALL)
   - Query all_views directly, not all_tables
   - Cleaner SQL, faster execution
   - Returns exactly 89 views (not 122 mixed objects)

3. Quoted identifier support for view names with spaces
   - Detects spaces in view names
   - Wraps in double quotes: SCHEMA."RT Customer"
   - Critical for 99% of reporting views

4. Permission-safe version detection
   - Try v$instance, fallback to hardcoded version
   - Never fails on permission issues
   - Returns 19.0.0.0.0 for Oracle ADB 19c
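The four quick wins above can be sketched as small helper functions. This is a minimal illustration only — `quote_identifier`, `views_only_query`, and `detect_version` are hypothetical names, not the actual `oracle.py` API:

```python
def quote_identifier(name: str) -> str:
    """Quick win 3: wrap view names containing spaces in double quotes,
    e.g. 'RT Customer' -> '"RT Customer"'."""
    return f'"{name}"' if " " in name else name


def views_only_query(owner: str) -> str:
    """Quick win 2: query all_views directly instead of a UNION ALL
    over all_tables and all_views."""
    return (
        "SELECT view_name FROM all_views "
        f"WHERE owner = '{owner.upper()}' ORDER BY view_name"
    )


def detect_version(run_query) -> str:
    """Quick win 4: try v$instance, fall back to a hardcoded version
    so metadata discovery never fails on permission errors."""
    try:
        return run_query("SELECT version FROM v$instance")
    except Exception:
        return "19.0.0.0.0"
```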

IMPACT:
- Simpler code (single path, not dual)
- Faster metadata discovery (no table queries)
- Better aligned with reporting database architecture

Ref: IMPLEMENTATION_PLAN.md Phase 4.2
Ref: ARCHITECTURAL_DECISION.md
Ref: keep_me/oracle_modifications/oracle.py (proven solutions)
- Replaced get_constraints() with intelligent auto-detection algorithm
- Query all_dependencies to map views → base tables
- Query all_constraints to get FK relationships from base tables
- Create view-to-view relationships when views share related tables
- Result: Auto-detect 4,841 relationships across 99 views
- Zero configuration required - all discovered from database metadata
- Handles quoted identifiers for views with spaces
- Graceful fallback if permissions fail

Tested: 4,841 relationships validated with REPORTS_ASSURANTDEV schema
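The auto-detection idea above can be sketched roughly as follows (hypothetical data shapes, not the actual `oracle.py` code): map each view to its base tables via `all_dependencies`, then link two views whenever their base tables share a foreign key from `all_constraints`:

```python
def autodetect_relationships(view_deps: dict, base_fks: set) -> set:
    """Derive view-to-view relationships from base-table metadata.

    view_deps: {view_name: {base_table, ...}}   (from all_dependencies)
    base_fks:  {(child_table, parent_table), ...}  (from all_constraints)

    Two views are related when one depends on the child table of a
    foreign key and the other depends on its parent table.
    """
    rels = set()
    for v1, deps1 in view_deps.items():
        for v2, deps2 in view_deps.items():
            if v1 == v2:
                continue
            for child, parent in base_fks:
                if child in deps1 and parent in deps2:
                    rels.add((v1, v2))
    return rels
```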
- Fixed TypeError in oracle_backend_patch.py from_string() call
- Removed precision and scale parameters that aren't supported by ibis type_mapper
- Simplified to pass only base data type and proper nullable boolean
- This fixes all Oracle queries including preview data, natural language queries, etc.
- Resolves: SqlParseType.from_string() got unexpected keyword argument 'precision'
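The simplification can be illustrated with a small normalizer — an assumed helper mirroring the described fix, not the exact patch code. Precision/scale qualifiers are stripped before the base type is handed to the type mapper, together with a proper boolean nullable flag:

```python
import re


def normalize_oracle_type(data_type: str) -> str:
    """Strip precision/scale qualifiers before type mapping,
    e.g. 'NUMBER(10,2)' -> 'NUMBER', 'TIMESTAMP(6)' -> 'TIMESTAMP'."""
    return re.sub(r"\([^)]*\)", "", data_type.upper()).strip()


def to_schema_entry(data_type: str, nullable_flag: str) -> tuple:
    """Pass only the base type plus a boolean — never precision/scale
    keyword arguments, which the ibis type_mapper rejects."""
    return normalize_oracle_type(data_type), nullable_flag == "Y"
```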
…DSN support

Oracle 19c Autonomous Database integration for views-only reporting architecture.

Key Changes:
- Oracle Instant Client 19.23 integration in Dockerfile (Debian Bookworm upgrade)
- DSN connection support for Oracle ADB (alternative to host/port/database)
- Views-only metadata discovery (no tables exposed, optimized for reporting DBs)
- Quoted identifier support for Oracle views/columns with spaces
- Disabled relationship auto-detection (returns empty list to prevent 30+ min hangs)

Technical Details:
- oracle.py: Returns quoted table names like '"RT SN Claim"' (no schema prefix)
- data_source.py: DSN parameter support with fallback to host/port/database
- connector.py: Oracle backend patch integration, debug logging added
- __init__.py: Made host/port/database optional for OracleConnectionInfo
- Dockerfile: Oracle Instant Client libs, LD_LIBRARY_PATH, ORACLE_HOME env vars
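The DSN fallback in data_source.py might look roughly like this — a sketch only; the key names and defaults are assumptions based on the description above:

```python
def oracle_connect_kwargs(info: dict) -> dict:
    """Build connection keyword arguments: prefer a DSN when provided,
    otherwise fall back to separate host/port/database parameters."""
    kwargs = {"user": info["user"], "password": info["password"]}
    if info.get("dsn"):
        kwargs["dsn"] = info["dsn"]  # e.g. an Oracle ADB TNS alias
    else:
        kwargs["host"] = info.get("host", "localhost")
        kwargs["port"] = int(info.get("port", 1521))
        kwargs["database"] = info.get("database")
    return kwargs
```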

Architecture:
- Views represent business entities for reporting queries
- Relationships must be manually defined in UI (no FK metadata in views)
- Permission-safe version detection (fallback to 19.0.0.0.0)
- Dynamic user extraction from connection info (no hardcoded schema names)

Testing:
- Tested with REPORTS_ASSURANTDEV user on Oracle ADB 19c
- 99 views discovered successfully with quoted identifiers preserved
- Manual relationship definition confirmed working
- Data preview working on views with spaces in names

Related: Main repo commit e9332392 contains companion wren-ui changes
Create custom Oracle 19c dialect class to override SQLGlot's default Oracle
dialect for 19c compatibility.

Implementation:
- Created app/custom_sqlglot/dialects/oracle.py
- Oracle class inherits from SQLGlot's Oracle dialect
- Generator subclass prepared for future overrides
- Comprehensive docstring documenting compatibility goals

Implements: COMP-001, FR-004
Task: P1.001 - Create Oracle 19c Dialect File Structure
Phase: 1 - Foundation

Related: SCAIS-23
Register Oracle 19c dialect in dialects module for automatic SQLGlot discovery.

Implementation:
- Added Oracle import to app/custom_sqlglot/dialects/__init__.py
- Follows existing MySQL dialect registration pattern
- Enables automatic dialect override when write='oracle'

Implements: COMP-005, FR-004
Task: P1.002 - Register Custom Oracle Dialect
Phase: 1 - Foundation
Depends On: P1.001

Related: SCAIS-23
Add debug logging to track when Oracle 19c dialect is used for SQL generation.

Implementation:
- Added loguru import to oracle.py
- Implemented Generator.__init__() with logging
- Debug log message for observability
- Maintains parent functionality via super()

Implements: COMP-001, INT-002, NFR-005
Task: P1.004 - Add Logging for Dialect Usage
Phase: 1 - Foundation (Complete)
Depends On: P1.001

Related: SCAIS-23
…ibility

- Map BOOLEAN type to CHAR(1) for Oracle 19c compatibility
- Oracle 19c doesn't support native BOOLEAN type (21c+ feature)
- Follows Y/N convention for boolean representation
- Inherits all other type mappings from base Oracle dialect
- Updates docstring to reflect CHAR(1) Y/N convention

Implements: SCAIS-23 P2.001 (COMP-002, FR-004)
Related: Oracle 19c compatibility requirement
- Add _dateadd_oracle19c method to handle DateAdd expressions
- Convert INTERVAL DAY syntax to numeric addition (date + n)
- Include warning logging for unsupported interval units
- Fallback to default behavior for non-INTERVAL expressions
- Handles uppercase unit comparison and null safety

Implements: SCAIS-23 P2.002 (COMP-003, FR-001)
Related: Oracle 19c date arithmetic compatibility
- Add _datesub_oracle19c method to handle DateSub expressions
- Convert INTERVAL DAY syntax to numeric subtraction (date - n)
- Include warning logging for unsupported interval units
- Fallback to default behavior for non-INTERVAL expressions
- Mirrors _dateadd_oracle19c pattern for consistency

Implements: SCAIS-23 P2.003 (COMP-003, FR-001)
Related: Oracle 19c date arithmetic compatibility
- Add MONTH unit support to _dateadd_oracle19c method
- Add MONTH unit support to _datesub_oracle19c method
- Use ADD_MONTHS() function for month arithmetic
- Handle month-end edge cases correctly (e.g., Jan 31 + 1 month = Feb 28/29)
- Update docstrings with MONTH examples
- Add explanatory comments for edge case handling

Implements: SCAIS-23 P2.004 (COMP-003, FR-001)
Related: Oracle 19c date arithmetic compatibility
- Add YEAR unit support to _dateadd_oracle19c method
- Add YEAR unit support to _datesub_oracle19c method
- Convert years to months (n * 12) using ADD_MONTHS() function
- Handle leap year edge cases correctly through ADD_MONTHS
- Update docstrings with YEAR unit examples
- Add explanatory comments for conversion logic

Implements: SCAIS-23 P2.005 (COMP-003, FR-001)
Related: Oracle 19c date arithmetic compatibility
…onary

- Add TRANSFORMS dictionary to Oracle.Generator class
- Register exp.DateAdd -> _dateadd_oracle19c mapping
- Register exp.DateSub -> _datesub_oracle19c mapping
- Inherit base transforms from OriginalOracle.Generator
- Use lambda functions for proper instance method binding
- Add comments explaining transform registration purpose

CRITICAL: This activates all date arithmetic functionality from P2.002-P2.005
SQLGlot will now call custom methods when encountering DateAdd/DateSub AST nodes

Implements: SCAIS-23 P2.006 (COMP-003, FR-001)
Related: Oracle 19c date arithmetic transform activation
- Create comprehensive pagination validation test suite
- Verify LIMIT converts to FETCH FIRST (Oracle 12c+ syntax)
- Verify OFFSET syntax compatibility
- Test pagination with WHERE, ORDER BY, JOIN clauses
- Validate custom dialect preserves pagination behavior
- Confirm integration with date arithmetic transforms

VALIDATION RESULT: SQLGlot already generates Oracle 19c-compatible
pagination syntax. No custom implementation needed.

Implements: SCAIS-23 P2.007 (FR-002)
Related: Oracle 19c pagination compatibility validation
Fix value extraction bug in Oracle 19c date arithmetic transformations.
Previously, self.sql(interval, 'this') returned quoted strings ('1')
instead of raw numbers (1), causing incorrect SQL generation.

Changes:
- Access interval.this.this directly to get raw Literal value
- Add fallback to self.sql() for non-Literal expressions
- Applies to both _dateadd_oracle19c() and _datesub_oracle19c()

Result:
- DAY: hire_date + INTERVAL '1' DAY → hire_date + 1 (not hire_date + '1')
- MONTH: hire_date + INTERVAL '3' MONTH → ADD_MONTHS(hire_date, 3)
- YEAR: hire_date + INTERVAL '1' YEAR → ADD_MONTHS(hire_date, 1 * 12)

All standalone tests passing (9/9).

Related: SCAIS-23 P3.001
…etic

Add unit tests validating date arithmetic transformations for Oracle 19c.

Test Coverage:
- DAY unit: Addition and subtraction with numeric values
- MONTH unit: ADD_MONTHS function for addition and subtraction
- YEAR unit: ADD_MONTHS with year-to-month conversion (*12)
- Complex scenarios: WHERE clauses, mixed operations, multiple intervals

Files:
- test_oracle_19c_dialect.py: 14 comprehensive pytest tests (163 lines)
- test_oracle_19c_standalone.py: 9 standalone tests (137 lines)

Standalone tests can run without full app setup:
  poetry run python tests/custom_sqlglot/test_oracle_19c_standalone.py

All tests passing (9/9 standalone validated).

Related: SCAIS-23 P3.001
Major test infrastructure improvements:
1. Isolated conftest.py in tests/custom_sqlglot/ (no app dependencies)
2. Added pytest markers for Oracle 19c dialect tests
3. Created reusable test fixtures (transpile_trino_to_oracle, etc.)
4. Refactored date arithmetic tests to use fixtures
5. Renamed test file for clarity: test_oracle_19c_dialect.py → test_oracle_19c_date_arithmetic.py
6. Removed standalone test runner (no longer needed)

New Type Mapping Tests (P3.002):
- test_boolean_type_maps_to_char1: Core requirement validation
- test_type_mapping_inheritance: Verify spread operator works
- test_date_type_preserved: DATE type unchanged (if present)
- test_timestamp_type_preserved: TIMESTAMP type unchanged (if present)
- test_varchar_type_preserved: VARCHAR maps to VARCHAR2
- test_number_type_preserved: Numeric types unchanged
- test_boolean_override_is_intentional: Document override behavior
- test_oracle_specific_types: Parametrized tests for CHAR, NCHAR, VARCHAR, NVARCHAR

Test Results:
- 25 Oracle 19c tests passing (14 date arithmetic + 11 type mapping)
- Tests run in 0.10s (previously couldn't run with pytest)
- No app setup required (WREN_ENGINE_ENDPOINT not needed)

Shell Helper:
- Created ~/wren-test-oracle.sh for easy test execution
- Usage: bash ~/wren-test-oracle.sh or wren-test-oracle (aliased)

Related: SCAIS-23 P3.002
- Create test_oracle_19c_integration.py with 9 comprehensive tests
- Validate full transpilation flow from Trino to Oracle 19c
- Test date arithmetic, type mapping, and combined scenarios
- Include realistic queries: JOINs, subqueries, CTEs, CASE expressions
- Verify 19c syntax present and 21c+ syntax absent
- All 34 Oracle 19c tests passing (14 date + 11 type + 9 integration)

Implements: P3.003 - Integration Tests for Full Transpilation
References: SCAIS-23, TEST-002, US-001, INT-001
- Create oracle_19c_dialect.md with complete usage guide
- Update README.md with Oracle 19c testing instructions
- Document all 34 tests (14 date + 11 type + 9 integration)
- Include examples, validation, and known limitations
- Add troubleshooting and support information

Implements: P3.008 - Document Test Execution
References: SCAIS-23
- Map TIMESTAMPTZ to TIMESTAMP in metadata extraction (oracle.py)
- Oracle 19c doesn't support TIMESTAMP WITH TIME ZONE
- Added test files for validation
- Resolves ORA-01843 errors when querying TIMESTAMPTZ columns
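A sketch of the metadata-side mapping (hypothetical dict and helper, mirroring the described change):

```python
# Oracle 19c: TIMESTAMP WITH TIME ZONE columns are surfaced as plain
# TIMESTAMP to avoid ORA-01843 errors during querying.
_TYPE_OVERRIDES = {
    "TIMESTAMP WITH TIME ZONE": "TIMESTAMP",
    "TIMESTAMP WITH LOCAL TIME ZONE": "TIMESTAMP",
}


def map_oracle_type(data_type: str) -> str:
    dt = data_type.upper().strip()
    return _TYPE_OVERRIDES.get(dt, dt)
```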
@github-actions bot added the documentation, ibis, dependencies, and python labels on Dec 19, 2025

coderabbitai bot commented Dec 19, 2025

Walkthrough

Introduces Oracle 19c database support to ibis-server through a custom SQLGlot dialect, backend patching to fix metadata schema queries, views-based metadata implementation, and Docker environment updates with Oracle Instant Client. Includes DSN connection options, type mapping overrides (BOOLEAN to CHAR(1), TIMESTAMPTZ to TIMESTAMP), custom CAST handling, and extensive test coverage.

Changes

Cohort / File(s) Summary
Oracle Dialect Implementation
ibis-server/app/custom_sqlglot/dialects/oracle.py, ibis-server/app/custom_sqlglot/dialects/__init__.py
Added custom Oracle dialect class extending SQLGlot's base Oracle, with TYPE_MAPPING overrides (BOOLEAN → CHAR(1), TIMESTAMPTZ → TIMESTAMP) and custom CAST handler for string-to-date/timestamp conversions using TO_DATE/TO_TIMESTAMP with format masks.
Oracle Backend Patching
ibis-server/app/model/oracle_backend_patch.py
New patched Oracle backend to fix ORA-00923 metadata schema errors via transformed queries, temporary view creation, and CASE WHEN-based nullable computation. Includes runtime monkey-patching function.
Oracle Metadata Implementation
ibis-server/app/model/metadata/oracle.py
Rewrote metadata layer to use views instead of tables, added dynamic user extraction, enhanced type normalization stripping precision/scale, expanded type mapping (BFILE, RAW, ROWID, JSON types, TIMESTAMPTZ variants), simplified table name formatting, and disabled constraints retrieval.
Model & Connection Updates
ibis-server/app/model/__init__.py, ibis-server/app/model/data_source.py
Made OracleConnectionInfo fields optional (host, port, database) and added DSN field; updated data_source.py to support DSN-based connections or fallback to separate host/port/database parameters.
Connector Module Integration
ibis-server/app/model/connector.py, ibis-server/app/routers/v2/connector.py
Added logging and Oracle backend patch initialization in connector.py; added debug output for SQL/limit in SimpleConnector.query and dry_run; disabled constraint retrieval in router endpoint.
Dockerfile & Environment Setup
ibis-server/Dockerfile, wren-core-legacy/docker/Dockerfile.build, wren-core-legacy/docker/Dockerfile.multistage
Upgraded base images from Python 3.11-buster to 3.11-bookworm, updated MS SQL Server APT repository for bookworm; added Oracle Instant Client installation, download, and configuration (LD_LIBRARY_PATH, ORACLE_HOME); added new multi-stage Dockerfiles for wren-engine builds.
Test Suite: Oracle 19c Dialect
ibis-server/tests/custom_sqlglot/conftest.py, ibis-server/tests/custom_sqlglot/test_oracle_19c_*.py, ibis-server/tests/custom_sqlglot/test_oracle_pagination.py
Added pytest fixtures for oracle_dialect, oracle_generator, type_mapping comparisons; comprehensive test modules for date arithmetic transformations, type mapping, integration scenarios, pagination syntax, and nested/complex query handling.
Standalone Test Scripts
ibis-server/test_*.py
Added exploratory test scripts: test_default_oracle.py (default dialect behavior), test_interval_support.py (INTERVAL handling), test_oracle_views.py (views-based metadata validation), test_timestamptz_issue.py (TIMESTAMPTZ/TO_DATE handling).
Documentation & Configuration
ibis-server/README.md, ibis-server/docs/oracle_19c_dialect.md, ibis-server/pyproject.toml
Added Oracle 19c compatibility section to README with test instructions; new dialect documentation detailing custom implementation, type mapping, date arithmetic, and test coverage; added pytest markers (oracle19c, dialect, type_mapping, date_arithmetic).

Sequence Diagram

sequenceDiagram
    participant Client as Client Request
    participant Connector as SimpleConnector
    participant Datasource as DataSource
    participant Ibis as Ibis Oracle Backend
    participant Patch as PatchedOracleBackend
    participant Oracle as Oracle Database
    
    Client->>Connector: query(sql)
    Note over Connector: Logs SQL & limit
    Connector->>Datasource: get_oracle_connection()
    
    alt DSN Provided
        Datasource->>Datasource: use_dsn_params
    else No DSN
        Datasource->>Datasource: use_host/port/database
    end
    
    Datasource->>Ibis: ibis.oracle.connect(params)
    Ibis->>Patch: _get_schema_using_query(sql)
    
    Note over Patch: Parse & transform query
    Patch->>Oracle: CREATE TEMPORARY VIEW
    Patch->>Oracle: Query all_tab_columns with CASE WHEN
    Oracle-->>Patch: Column metadata
    Patch->>Patch: Build Schema (with nullability)
    Patch->>Oracle: DROP TEMPORARY VIEW
    Patch-->>Ibis: Schema
    
    Ibis-->>Connector: Connection/Execution Result
    Connector-->>Client: Query Result

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Areas requiring extra attention:

  • Backend patching logic (oracle_backend_patch.py): Verify temporary view creation/cleanup safety, CASE WHEN null computation correctness, and schema building with Oracle type mapping
  • Dialect transformation logic (oracle.py): Review CAST handler pattern matching for date/timestamp string literals, TYPE_MAPPING overrides for BOOLEAN/TIMESTAMPTZ, and fallback behavior
  • Views-based metadata (metadata/oracle.py): Validate view-only architecture justification, dynamic user extraction, type normalization (precision/scale stripping), and implications of disabled constraints
  • Connection routing (data_source.py, __init__.py): Confirm DSN vs. host/port fallback logic correctness and field optionality semantics
  • Constraint retrieval disable (routers/v2/connector.py): Verify intentional short-circuit to empty list and potential impact on consumers

Possibly related PRs

Suggested labels

oracle, dialect, ibis, database-support, testing

Suggested reviewers

  • douenergy
  • goldmedal

Poem

🐰 A custom dialect hops into view,
Oracle 19c now sings on cue,
Views dance where tables once stood tall,
Backend patches catch them all,
Type-mapped booleans in CHAR(1) play,
Hoppy queries find their way! 🌱

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Oracle adb integration' directly summarizes the main change: adding Oracle Autonomous Database (ADB) support across multiple components including custom SQLGlot dialect, metadata handling, Docker configuration, and connection management.
Docstring Coverage ✅ Passed Docstring coverage is 86.15% which is sufficient. The required threshold is 80.00%.


@jsatlien jsatlien closed this Dec 19, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 15

🧹 Nitpick comments (20)
wren-core-legacy/docker/Dockerfile.build (1)

9-9: Prefer selective file copying over copying everything.

Copying all files (.) includes unnecessary artifacts like .git, IDE configurations, and build outputs, which increases image size and build context transfer time. Consider using a .dockerignore file or copying only required directories (similar to Dockerfile.multistage's approach).

🔎 Example of selective copying
-# Copy all source files
-COPY . .
+# Copy Maven wrapper and source files
+COPY .mvn .mvn
+COPY mvnw .
+COPY pom.xml .
+COPY wren-base ./wren-base
+COPY wren-main ./wren-main
+COPY wren-server ./wren-server
+COPY wren-tests ./wren-tests
+COPY trino-parser ./trino-parser
ibis-server/app/model/metadata/oracle.py (3)

3-3: Unnecessary import from typing in Python 3.11+.

The List import from typing is not needed. Python 3.11 supports built-in generic type hints like list[Constraint].

🔎 Proposed fix
-from typing import List

And update line 127:

-    def get_constraints(self) -> List[Constraint]:
+    def get_constraints(self) -> list[Constraint]:

311-315: Move import re to module level.

The re module is imported inside the function, which is executed for every column. While Python caches imports, placing it at the module level is more idiomatic.

🔎 Proposed fix

Add at the top of the file with other imports:

import re

Then remove line 314:

     def _transform_column_type(self, data_type):
         # Strip precision/scale qualifiers from Oracle types before mapping
         # e.g., "TIMESTAMP(6)" -> "TIMESTAMP", "NUMBER(10,2)" -> "NUMBER"
-        import re
         normalized_type = re.sub(r'\([^)]*\)', '', data_type.upper()).strip()

275-286: Reasonable fallback, but consider logging the exception.

The fallback to a hardcoded version is pragmatic for environments without v$instance permissions. Consider logging the exception for debugging purposes rather than silently swallowing it.

🔎 Proposed enhancement
+from loguru import logger
+
         except Exception:
-            # Fallback: Return hardcoded version for Oracle ADB 19c
-            # This ensures metadata discovery never fails on version check
+            # Fallback: Return hardcoded version for Oracle ADB 19c
+            logger.debug("Could not query v$instance, using default version")
             return "19.0.0.0.0"
ibis-server/tests/custom_sqlglot/test_oracle_19c_date_arithmetic.py (3)

86-89: Consider more flexible assertion for generated SQL.

The assertion checks for specific formatting patterns like ", 2 * 12)" with spacing variations. While this works, it's somewhat brittle to SQL formatting changes.

Alternative approach using regex or AST validation
-        assert "ADD_MONTHS(" in result.upper()
-        assert "* 12)" in result
-        assert ", 2 * 12)" in result or ", 2*12)" in result.replace(" ", "")
-        assert "INTERVAL" not in result.upper()
+        assert "ADD_MONTHS(" in result.upper()
+        # Use regex to match ADD_MONTHS with multiplication, ignoring whitespace
+        import re
+        assert re.search(r'ADD_MONTHS\([^,]+,\s*2\s*\*\s*12\)', result, re.IGNORECASE)
+        assert "INTERVAL" not in result.upper()

Alternatively, parse the result and validate the AST structure rather than string matching.


96-99: Same brittle assertion pattern as year addition test.

The multiple spacing variations ("-(1 * 12)", "-( 1 * 12 )", etc.) make the test fragile to formatting changes.

Consider using regex pattern matching as suggested in the year addition test above.


170-172: Brittle assertion with multiple spacing variations.

Similar to previous comments, the multiple string patterns for matching the multiplication syntax are fragile.

Apply the same regex-based approach suggested in earlier comments for consistency across the test suite.

ibis-server/app/model/connector.py (1)

89-89: Adjust dry_run logging level.

Similar to the query method, this debug logging with emoji should be adjusted. If logging is needed, use logger.debug() and remove the emoji.

     @tracer.start_as_current_span("connector_dry_run", kind=trace.SpanKind.CLIENT)
     def dry_run(self, sql: str) -> None:
-        logger.info(f"🔍 CONNECTOR DRY_RUN: {sql}")
+        logger.debug("Executing dry run")
         self.connection.sql(sql)
ibis-server/test_default_oracle.py (1)

1-27: Relocate diagnostic script or convert to proper tests.

This file appears to be an exploratory/diagnostic script rather than a proper test suite:

  • No pytest structure (no test functions, no assertions)
  • Uses print statements for manual inspection
  • Located at repository root instead of tests/ directory

Recommendation: Choose one of these options:

  1. Convert to proper pytest tests (if validation is needed):
"""Test SQLGlot's default Oracle dialect behavior."""
import pytest
import sqlglot

@pytest.mark.oracle_dialect
def test_default_oracle_interval_preservation():
    """Verify default Oracle dialect preserves INTERVAL syntax."""
    sql = "SELECT systimestamp + INTERVAL '1' DAY FROM dual"
    result = sqlglot.transpile(sql, read='oracle', write='oracle')
    assert "INTERVAL '1' DAY" in result[0]

@pytest.mark.oracle_dialect  
def test_trino_to_oracle_transpilation():
    """Verify Trino to Oracle transpilation."""
    sql = "SELECT order_date + INTERVAL '1' DAY FROM orders"
    result = sqlglot.transpile(sql, read='trino', write='oracle')
    assert result[0]  # At minimum, verify it doesn't error
  2. Move to scripts/ directory if this is meant for manual diagnostics:
mv test_default_oracle.py scripts/debug_oracle_dialect.py
  3. Remove if no longer needed for development.
ibis-server/test_interval_support.py (1)

1-37: Relocate or convert diagnostic script to proper tests.

This diagnostic script has the same structural issues as test_default_oracle.py:

  • No pytest test structure or assertions
  • Uses print statements for manual inspection
  • Located at repository root

Apply the same remediation as suggested for test_default_oracle.py:

  1. Convert to pytest tests under tests/custom_sqlglot/:
"""Tests for Oracle 19c INTERVAL support."""
import pytest
import sqlglot
from app.custom_sqlglot.dialects.oracle import Oracle

@pytest.mark.oracle19c
class TestOracleIntervalSupport:
    def test_default_dialect_preserves_interval(self):
        sql = "SELECT order_date + INTERVAL '1' DAY FROM orders"
        result = sqlglot.transpile(sql, read='trino', write='oracle')
        assert "INTERVAL" in result[0].upper()
    
    def test_custom_dialect_transforms_interval(self):
        sql = "SELECT order_date + INTERVAL '1' DAY FROM orders"  
        result = sqlglot.transpile(sql, read='trino', write=Oracle)
        # Add appropriate assertions based on expected behavior
        assert result[0]
  2. Move to scripts/ if needed for debugging:
mv test_interval_support.py scripts/debug_interval_support.py
  3. Remove if obsolete.
ibis-server/test_timestamptz_issue.py (1)

1-66: Relocate or convert TIMESTAMPTZ diagnostic script.

This is the third diagnostic script at the root level with similar issues:

  • No pytest structure or assertions
  • Manual execution required
  • Try-except blocks with print output for inspection

Recommendation: Convert to proper tests under tests/custom_sqlglot/:

"""Tests for Oracle TIMESTAMPTZ handling."""
import pytest
import sqlglot
from app.custom_sqlglot.dialects.oracle import Oracle

@pytest.mark.oracle19c
class TestOracleTimestampTZ:
    def test_trino_to_oracle_timestamptz_cast(self):
        """Verify TIMESTAMPTZ CAST transpilation doesn't error."""
        sql = "SELECT CAST(col AS TIMESTAMPTZ) FROM tbl"
        # Should not raise exception
        result = sqlglot.transpile(sql, read='trino', write=Oracle)
        assert result[0]
        # Add assertions for expected Oracle output
    
    def test_timestamptz_type_mapping(self):
        """Verify TIMESTAMPTZ maps to TIMESTAMP in Oracle 19c."""
        sql = "SELECT CAST(col AS TIMESTAMPTZ) FROM tbl"
        result = sqlglot.transpile(sql, read='trino', write=Oracle)
        # Based on Oracle dialect TYPE_MAPPING, should convert to TIMESTAMP
        assert "TIMESTAMP" in result[0].upper()
        
    @pytest.mark.parametrize("cast_type", [
        "TIMESTAMP",
        "TIMESTAMP WITH TIME ZONE",
    ])
    def test_oracle_timestamp_variants(self, cast_type):
        """Test Oracle timestamp type handling."""
        sql = f"SELECT CAST(col AS {cast_type}) FROM tbl"
        result = sqlglot.transpile(sql, read='oracle', write='oracle')
        assert result[0]  # Should not error

If these scripts are still useful for debugging, consolidate them into a single scripts/debug_oracle_dialect.py file.

ibis-server/test_oracle_views.py (2)

19-20: Avoid sys.path manipulation; consider relocating the test file.

The sys.path.insert is a code smell and may cause import issues. This file appears to be at the repository root (ibis-server/test_oracle_views.py) rather than in tests/. Consider moving it to the proper test directory structure.


115-115: Testing private method _format_compact_table_name directly.

Accessing private methods (prefixed with _) in tests couples the test to implementation details. If the method is essential for testing, consider making it public or testing through the public API.

ibis-server/docs/oracle_19c_dialect.md (1)

133-134: Remove user-specific file references from shared documentation.

References to ~/.bashrc and ~/wren-test-oracle.sh are user-specific and should not be in project documentation. These belong in a local setup guide or developer notes, not in the tracked docs.

🔎 Suggested fix
 ### Configuration
 - ✅ `pyproject.toml` - Added pytest markers (oracle19c, dialect, type_mapping, date_arithmetic)
-- ✅ `~/wren-test-oracle.sh` - Shell script for easy test execution
-- ✅ `~/.bashrc` - Added aliases (wren-test-oracle, wren-poetry, wren-cd)
+
+For local convenience, you may create shell aliases for test execution (see Running Tests section above).
ibis-server/tests/custom_sqlglot/test_oracle_pagination.py (1)

124-128: Assertion is fragile and has inconsistent case handling.

The assertion "- 7" in result or "ADD_MONTHS" in result.upper() mixes case-sensitive and case-insensitive checks. The space in "- 7" is also fragile as output formatting may vary.

🔎 Suggested fix for consistency
         # Should have both date arithmetic fix AND pagination
         assert "FETCH FIRST 20 ROWS ONLY" in result.upper()
         # Date arithmetic should use numeric subtraction (from P2.002-P2.006)
-        assert "- 7" in result or "ADD_MONTHS" in result.upper()
+        result_upper = result.upper()
+        assert "-7" in result.replace(" ", "") or "- 7" in result or "ADD_MONTHS" in result_upper, \
+            f"Expected date arithmetic transformation, got: {result}"
         assert "LIMIT" not in result.upper()
ibis-server/app/model/oracle_backend_patch.py (2)

68-78: Consider using parameterized query for metadata lookup.

While name is generated internally via gen_name(), building SQL with f-strings is a pattern that could lead to injection if reused elsewhere. Using bind parameters would be safer.

🔎 Suggested improvement using bind parameters
                 # Fixed version: Use raw SQL with CASE WHEN
-                metadata_sql = f"""
+                # Use bind parameter for table_name to follow safe SQL practices
+                metadata_sql = """
                     SELECT
                         column_name,
                         data_type,
                         data_precision,
                         data_scale,
                         CASE WHEN nullable = 'Y' THEN 1 ELSE 0 END as is_nullable
                     FROM all_tab_columns
-                    WHERE table_name = '{name}'
+                    WHERE table_name = :view_name
                     ORDER BY column_id
                 """

                 logger.debug(f"Querying metadata: {metadata_sql}")
-                results = con.execute(metadata_sql).fetchall()
+                results = con.execute(metadata_sql, {"view_name": name}).fetchall()
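The bind-parameter pattern above can be demonstrated stand-alone; this sketch uses stdlib sqlite3 as a stand-in for the Oracle driver (python-oracledb uses the same `:name` placeholder style), with an illustrative hostile value:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE all_tab_columns (table_name TEXT, column_name TEXT)")
con.execute("INSERT INTO all_tab_columns VALUES ('RT_CUSTOMER', 'ID')")

# The value is passed separately and never spliced into the SQL text,
# so quotes or SQL fragments inside `name` cannot change the statement.
name = "RT_CUSTOMER'; DROP TABLE all_tab_columns; --"
rows = con.execute(
    "SELECT column_name FROM all_tab_columns WHERE table_name = :view_name",
    {"view_name": name},
).fetchall()
print(rows)  # [] -- the hostile string simply matches nothing
```

The same call with `{"view_name": "RT_CUSTOMER"}` returns the real column list, while the table survives untouched.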

40-46: Complex lambda transformer is hard to read and maintain.

The inline lambda with multiple conditions reduces readability. Consider extracting to a named function with clearer documentation.

🔎 Suggested refactor for clarity
+        def _quote_columns(node):
+            """Add quoting to Column nodes except in ORDER BY clauses."""
+            if (isinstance(node, sg.exp.Column)
+                and not node.this.quoted
+                and not isinstance(node.parent, sg.exp.Order)):
+                return node.__class__(this=node.this, quoted=True)
+            return node
+
         with self.begin() as con:
             # Parse and transform the query
             sg_expr = sg.parse_one(query, dialect=dialect)

-            # Apply ibis's transformer to add quoting
-            transformer = lambda node: (
-                node.__class__(this=node.this, quoted=True)
-                if isinstance(node, sg.exp.Column)
-                and not node.this.quoted
-                and not isinstance(node.parent, sg.exp.Order)
-                else node
-            )
-            sg_expr = sg_expr.transform(transformer)
+            # Apply transformer to add quoting
+            sg_expr = sg_expr.transform(_quote_columns)
ibis-server/tests/custom_sqlglot/test_oracle_19c_type_mapping.py (1)

104-109: Prefer pytest.skip() or logging over print() for test diagnostics.

Using print() statements for informational output in tests is not ideal as pytest captures stdout by default. Consider using pytest.warns(), logging, or simply removing informational prints.

🔎 Suggested approach
             # Our mapping should be CHAR(1)
             assert our_mapping == "CHAR(1)"

-            # Document if we're overriding base
-            # (this is informational, not a failure condition)
-            if base_mapping != "CHAR(1)":
-                print(f"INFO: Overriding base BOOLEAN mapping '{base_mapping}' with 'CHAR(1)'")
+            # The override from base is expected behavior, no action needed
         else:
             # Base doesn't have BOOLEAN, we're adding it
             assert oracle_type_mapping[exp.DataType.Type.BOOLEAN] == "CHAR(1)"
-            print("INFO: Adding BOOLEAN → CHAR(1) mapping (not in base)")
+            # Adding BOOLEAN mapping is expected when base lacks it
ibis-server/app/custom_sqlglot/dialects/oracle.py (1)

74-90: Pattern matching could be more robust; consider quote escaping.

Two minor concerns:

  1. Weak pattern validation (line 76): The check only validates dash positions but not that other characters are digits. A malformed string like "abcd-ef-gh" would pass and produce invalid Oracle SQL instead of falling back to default CAST.

  2. Quote escaping (line 90): If literal_value contains a single quote, the generated SQL would be malformed. While rare for date literals, defensive handling would be safer.

🔎 Suggested improvements
-                    # Check if it matches YYYY-MM-DD pattern (with or without time)
-                    # Pattern: YYYY-MM-DD or YYYY-MM-DD HH:MI:SS
-                    if literal_value and len(literal_value) >= 10 and literal_value[4] == '-' and literal_value[7] == '-':
+                    import re
+                    # Validate YYYY-MM-DD pattern more strictly
+                    date_pattern = r'^\d{4}-\d{2}-\d{2}$'
+                    datetime_pattern = r'^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$'
+                    
+                    if literal_value and (re.match(date_pattern, literal_value) or re.match(datetime_pattern, literal_value)):
                         # Determine format based on length
                         if len(literal_value) == 10:
                             # Just date: YYYY-MM-DD
                             format_mask = 'YYYY-MM-DD'
-                        elif len(literal_value) == 19:
+                        else:  # len == 19, validated by regex
                             # Date with time: YYYY-MM-DD HH:MI:SS
                             format_mask = 'YYYY-MM-DD HH24:MI:SS'
-                        else:
-                            # Other length, use default CAST
-                            return self.cast_sql(expression)
 
                         # Use TO_TIMESTAMP or TO_DATE with explicit format
                         func_name = "TO_TIMESTAMP" if target_type.this == exp.DataType.Type.TIMESTAMP else "TO_DATE"
-                        return f"{func_name}('{literal_value}', '{format_mask}')"
+                        # Escape single quotes in literal value
+                        escaped_value = literal_value.replace("'", "''")
+                        return f"{func_name}('{escaped_value}', '{format_mask}')"

Note: Move the import re to the top of the file if you apply this change.

ibis-server/tests/custom_sqlglot/test_oracle_19c_integration.py (1)

36-74: Consider adding test coverage for CAST to TIMESTAMP/DATE handling.

The _handle_cast_oracle19c method in the dialect converts CAST('2025-11-24' AS TIMESTAMP) to TO_TIMESTAMP('2025-11-24', 'YYYY-MM-DD') to fix ORA-01843 errors. This feature isn't exercised by any of the current integration tests.

🔎 Suggested test to add
def test_cast_timestamp_with_string_literal(self, transpile_trino_to_oracle):
    """Test CAST to TIMESTAMP converts to TO_TIMESTAMP with explicit format."""
    sql = "SELECT CAST('2025-11-24 14:30:00' AS TIMESTAMP) FROM dual"
    result = transpile_trino_to_oracle(sql)
    
    # Verify TO_TIMESTAMP with format mask is used
    assert "TO_TIMESTAMP(" in result.upper()
    assert "YYYY-MM-DD HH24:MI:SS" in result
    assert "CAST(" not in result.upper()

def test_cast_date_with_string_literal(self, transpile_trino_to_oracle):
    """Test CAST to DATE converts to TO_DATE with explicit format."""
    sql = "SELECT CAST('2025-11-24' AS DATE) FROM dual"
    result = transpile_trino_to_oracle(sql)
    
    # Verify TO_DATE with format mask is used
    assert "TO_DATE(" in result.upper()
    assert "YYYY-MM-DD" in result
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7bc0d22 and bb07ffe.

📒 Files selected for processing (23)
  • ibis-server/Dockerfile (3 hunks)
  • ibis-server/README.md (1 hunks)
  • ibis-server/app/custom_sqlglot/dialects/__init__.py (1 hunks)
  • ibis-server/app/custom_sqlglot/dialects/oracle.py (1 hunks)
  • ibis-server/app/model/__init__.py (2 hunks)
  • ibis-server/app/model/connector.py (4 hunks)
  • ibis-server/app/model/data_source.py (1 hunks)
  • ibis-server/app/model/metadata/oracle.py (6 hunks)
  • ibis-server/app/model/oracle_backend_patch.py (1 hunks)
  • ibis-server/app/routers/v2/connector.py (1 hunks)
  • ibis-server/docs/oracle_19c_dialect.md (1 hunks)
  • ibis-server/pyproject.toml (1 hunks)
  • ibis-server/test_default_oracle.py (1 hunks)
  • ibis-server/test_interval_support.py (1 hunks)
  • ibis-server/test_oracle_views.py (1 hunks)
  • ibis-server/test_timestamptz_issue.py (1 hunks)
  • ibis-server/tests/custom_sqlglot/conftest.py (1 hunks)
  • ibis-server/tests/custom_sqlglot/test_oracle_19c_date_arithmetic.py (1 hunks)
  • ibis-server/tests/custom_sqlglot/test_oracle_19c_integration.py (1 hunks)
  • ibis-server/tests/custom_sqlglot/test_oracle_19c_type_mapping.py (1 hunks)
  • ibis-server/tests/custom_sqlglot/test_oracle_pagination.py (1 hunks)
  • wren-core-legacy/docker/Dockerfile.build (1 hunks)
  • wren-core-legacy/docker/Dockerfile.multistage (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
ibis-server/tests/custom_sqlglot/test_oracle_19c_date_arithmetic.py (2)
ibis-server/app/custom_sqlglot/dialects/oracle.py (1)
  • Oracle (6-93)
ibis-server/tests/custom_sqlglot/conftest.py (1)
  • transpile_trino_to_oracle (40-55)
ibis-server/app/model/connector.py (1)
ibis-server/app/model/oracle_backend_patch.py (1)
  • patch_oracle_backend (105-118)
ibis-server/test_timestamptz_issue.py (1)
ibis-server/app/custom_sqlglot/dialects/oracle.py (1)
  • Oracle (6-93)
ibis-server/tests/custom_sqlglot/test_oracle_19c_integration.py (2)
ibis-server/app/custom_sqlglot/dialects/oracle.py (1)
  • Oracle (6-93)
ibis-server/tests/custom_sqlglot/conftest.py (3)
  • oracle_dialect (15-17)
  • transpile_trino_to_oracle (40-55)
  • oracle_type_mapping (27-29)
ibis-server/tests/custom_sqlglot/test_oracle_pagination.py (1)
ibis-server/app/custom_sqlglot/dialects/oracle.py (1)
  • Oracle (6-93)
🪛 markdownlint-cli2 (0.18.1)
ibis-server/docs/oracle_19c_dialect.md

74-74: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


167-167: Strong style
Expected: asterisk; Actual: underscore

(MD050, strong-style)


167-167: Strong style
Expected: asterisk; Actual: underscore

(MD050, strong-style)

🔇 Additional comments (14)
ibis-server/pyproject.toml (1)

71-74: LGTM!

The new pytest markers for Oracle 19c testing are well-defined and follow the existing naming conventions. They provide clear descriptions of their purpose.

ibis-server/Dockerfile (1)

67-71: LGTM!

The environment variables for Oracle Instant Client are correctly configured. LD_LIBRARY_PATH properly appends to existing values, and ORACLE_HOME points to the correct installation directory.

ibis-server/app/custom_sqlglot/dialects/__init__.py (1)

3-4: LGTM - Clean dialect export.

The Oracle dialect import follows the existing pattern established by the MySQL import and properly exposes the custom Oracle dialect for use throughout the application.

ibis-server/tests/custom_sqlglot/test_oracle_19c_date_arithmetic.py (1)

1-172: Well-structured test suite with comprehensive coverage.

The test module provides excellent coverage of Oracle 19c date arithmetic transformations with:

  • Clear test organization and naming
  • Good use of pytest markers for selective test execution
  • Thorough edge case testing (month-end dates, WHERE clauses, mixed operations)
  • Helpful docstrings

The string assertion brittleness noted above is acceptable for dialect testing, though improvements would enhance maintainability.

ibis-server/app/model/connector.py (1)

37-39: Verify module-level patching approach.

Applying the Oracle backend patch at module import time creates a global side effect. Ensure this is intentional and won't cause issues when the module is imported in different contexts (e.g., testing, multiple connection types).

Consider whether the patch should be applied lazily when an Oracle connection is first created, or if there's a benefit to applying it globally at startup.

ibis-server/README.md (1)

84-98: Well-structured Oracle 19c documentation section.

The new documentation clearly explains:

  • Purpose of the custom Oracle 19c dialect
  • Link to detailed documentation
  • How to run the test suite

Good practice providing both an alias shortcut and the full command.

ibis-server/tests/custom_sqlglot/conftest.py (1)

1-55: Excellent test fixture implementation.

This conftest provides well-structured fixtures for Oracle dialect testing:

  • Clear separation between dialect, generator, and type mapping fixtures
  • Good documentation with usage examples
  • The transpile_trino_to_oracle helper fixture elegantly encapsulates transpilation logic
  • Proper isolation from application-level configuration

The fixtures support comprehensive testing while maintaining clean test code.

ibis-server/tests/custom_sqlglot/test_oracle_pagination.py (1)

13-97: Solid test coverage for Oracle pagination syntax.

The test class covers various pagination scenarios including LIMIT, OFFSET, JOINs, WHERE clauses, and edge cases. Good use of descriptive test names and docstrings.

ibis-server/tests/custom_sqlglot/test_oracle_19c_type_mapping.py (2)

111-126: Well-structured parametrized test for Oracle-specific types.

Good use of @pytest.mark.parametrize to test multiple type mappings with a single test function. The conditional assertion handles cases where types may not be present in the mapping.


16-29: Core type mapping tests look correct.

The tests properly verify the BOOLEAN → CHAR(1) mapping for Oracle 19c compatibility and use appropriate pytest markers for categorization.

ibis-server/app/model/oracle_backend_patch.py (1)

105-118: Monkey-patching approach is properly documented and timely.

The patch is called at module import time during router initialization, well before any Oracle connections are created. Implementation includes docstring, inline comments, and logging. This approach is justified for fixing the upstream ibis ORA-00923 bug and preserves the original backend reference for potential restoration.

ibis-server/app/custom_sqlglot/dialects/oracle.py (1)

1-42: Well-structured dialect extension with clear documentation.

The class design properly extends SQLGlot's Oracle dialect, and the TYPE_MAPPING overrides for BOOLEAN→CHAR(1) and TIMESTAMPTZ→TIMESTAMP are appropriate for Oracle 19c compatibility. The docstrings clearly explain the rationale for each override.

ibis-server/tests/custom_sqlglot/test_oracle_19c_integration.py (2)

1-21: Comprehensive integration test suite with good coverage of date arithmetic.

The test class is well-organized with clear docstrings explaining each test scenario. The coverage of date arithmetic in various SQL constructs (subqueries, CTEs, CASE expressions) is thorough.


76-248: Thorough test coverage for complex SQL patterns.

The remaining tests effectively validate Oracle 19c compatibility across a range of realistic scenarios including JOINs, aggregations, GROUP BY/HAVING, nested expressions, subqueries, and CTEs. The assertions appropriately verify both the presence of Oracle-compatible syntax and the absence of unsupported constructs.

Comment on lines +206 to +223
host: SecretStr | None = Field(
examples=["localhost"], description="the hostname of your database", default=None
)
port: SecretStr = Field(examples=[1521], description="the port of your database")
database: SecretStr = Field(
examples=["orcl"], description="the database name of your database"
port: SecretStr | None = Field(examples=[1521], description="the port of your database", default=None)
database: SecretStr | None = Field(
examples=["orcl"], description="the database name of your database", default=None
)
user: SecretStr = Field(
examples=["admin"], description="the username of your database"
)
password: SecretStr | None = Field(
examples=["password"], description="the password of your database", default=None
)
dsn: SecretStr | None = Field(
examples=["(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=host)(PORT=port))(CONNECT_DATA=(SERVICE_NAME=service)))"],
description="Oracle Data Source Name (DSN) - Alternative to host/port/database configuration",
default=None
)


⚠️ Potential issue | 🟠 Major

Missing validation: either DSN or host/port/database must be provided.

The current model allows all connection parameters (dsn, host, port, database) to be None simultaneously, which would cause an AttributeError when .get_secret_value() is called on None in data_source.py.

Consider adding a Pydantic model_validator to ensure at least one valid connection method is configured:

🔎 Proposed fix using Pydantic model validator
+from pydantic import model_validator

 class OracleConnectionInfo(BaseConnectionInfo):
     host: SecretStr | None = Field(
         examples=["localhost"], description="the hostname of your database", default=None
     )
     port: SecretStr | None = Field(examples=[1521], description="the port of your database", default=None)
     database: SecretStr | None = Field(
         examples=["orcl"], description="the database name of your database", default=None
     )
     user: SecretStr = Field(
         examples=["admin"], description="the username of your database"
     )
     password: SecretStr | None = Field(
         examples=["password"], description="the password of your database", default=None
     )
     dsn: SecretStr | None = Field(
         examples=["(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=host)(PORT=port))(CONNECT_DATA=(SERVICE_NAME=service)))"],
         description="Oracle Data Source Name (DSN) - Alternative to host/port/database configuration",
         default=None
     )
+
+    @model_validator(mode='after')
+    def validate_connection_params(self):
+        if not self.dsn and not (self.host and self.port and self.database):
+            raise ValueError("Either 'dsn' or all of 'host', 'port', and 'database' must be provided")
+        return self

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In ibis-server/app/model/__init__.py around lines 206 to 223, the model
currently allows dsn, host, port and database to all be None which later causes
AttributeError when calling .get_secret_value(); add a Pydantic model_validator
(root/model-level validator) that checks: if dsn is not provided (None or empty)
then host, port and database must all be provided and non-null (and optionally
non-empty); if the check fails raise a ValueError with a clear message like
"Either dsn or host, port and database must be provided". Ensure the validator
runs after field parsing and returns the validated values unchanged when valid.
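The check the proposed validator enforces can be sketched stdlib-only (the function name is hypothetical; in the actual model this logic would live in a pydantic `model_validator(mode='after')`):

```python
def validate_connection_params(dsn, host, port, database):
    """Require either a DSN or the complete host/port/database triple."""
    if not dsn and not (host and port and database):
        raise ValueError(
            "Either 'dsn' or all of 'host', 'port', and 'database' must be provided"
        )

# Valid: DSN alone, or the full host/port/database triple.
validate_connection_params("(DESCRIPTION=...)", None, None, None)
validate_connection_params(None, "localhost", 1521, "orcl")

# Invalid: nothing provided, or an incomplete triple.
for args in [(None, None, None, None), (None, "localhost", None, "orcl")]:
    try:
        validate_connection_params(*args)
    except ValueError as exc:
        print(exc)
```

Running the validation once at model construction means downstream code in data_source.py can safely assume one complete connection method is present.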

Comment on lines +80 to +84
import sys
print(f"🔍 CONNECTOR QUERY: {sql}", file=sys.stderr, flush=True)
print(f"🔍 LIMIT: {limit}", file=sys.stderr, flush=True)
logger.info(f"🔍 CONNECTOR QUERY: {sql}")
logger.info(f"🔍 LIMIT: {limit}")

⚠️ Potential issue | 🟠 Major

Remove temporary debug logging before merging.

These debug print statements appear to be temporary debugging code that should be removed before production:

  • Redundant logging (both print to stderr AND logger.info)
  • Uses emoji (🔍) which may not render correctly in all environments
  • Verbose logging for every query could impact performance and clutter logs
Suggested cleanup
     @tracer.start_as_current_span("connector_query", kind=trace.SpanKind.CLIENT)
     def query(self, sql: str, limit: int) -> pd.DataFrame:
-        import sys
-        print(f"🔍 CONNECTOR QUERY: {sql}", file=sys.stderr, flush=True)
-        print(f"🔍 LIMIT: {limit}", file=sys.stderr, flush=True)
-        logger.info(f"🔍 CONNECTOR QUERY: {sql}")
-        logger.info(f"🔍 LIMIT: {limit}")
+        logger.debug(f"Executing query with limit {limit}")
         return self.connection.sql(sql).limit(limit).to_pandas()

If query logging is needed for debugging, consider:

  1. Using logger.debug() instead of logger.info() to avoid cluttering production logs
  2. Removing the emoji
  3. Removing redundant stderr prints
  4. Only logging the SQL at debug level if truly needed
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
logger.debug(f"Executing query with limit {limit}")
🤖 Prompt for AI Agents
In ibis-server/app/model/connector.py around lines 80 to 84, remove the
temporary debug prints to stderr and the emoji-laden info logs; replace them
with an optional debug-level logger call without emojis and without duplicate
printing. Specifically, delete the two print(..., file=sys.stderr) lines, change
logger.info to logger.debug, remove the emoji from the message, and only log the
SQL and limit at debug level (or remove entirely if not needed) so production
logs aren’t cluttered and performance isn’t impacted.

Comment on lines +193 to +199
else:
# Otherwise use individual host/port/database parameters
connect_params.update({
"host": info.host.get_secret_value(),
"port": int(info.port.get_secret_value()),
"database": info.database.get_secret_value(),
})

⚠️ Potential issue | 🔴 Critical

Potential AttributeError when host, port, or database are None.

When dsn is not provided, the code assumes host, port, and database are valid SecretStr objects. However, since these fields are now optional (SecretStr | None) in OracleConnectionInfo, calling .get_secret_value() on None will raise an AttributeError.

🔎 Proposed fix with explicit validation
         # If DSN is provided, use it directly
         if info.dsn:
             connect_params["dsn"] = info.dsn.get_secret_value()
         else:
             # Otherwise use individual host/port/database parameters
+            if not info.host or not info.port or not info.database:
+                raise ValueError(
+                    "Either 'dsn' or all of 'host', 'port', and 'database' must be provided"
+                )
             connect_params.update({
                 "host": info.host.get_secret_value(),
                 "port": int(info.port.get_secret_value()),
                 "database": info.database.get_secret_value(),
             })
📝 Committable suggestion


Suggested change
else:
# Otherwise use individual host/port/database parameters
if not info.host or not info.port or not info.database:
raise ValueError(
"Either 'dsn' or all of 'host', 'port', and 'database' must be provided"
)
connect_params.update({
"host": info.host.get_secret_value(),
"port": int(info.port.get_secret_value()),
"database": info.database.get_secret_value(),
})
🤖 Prompt for AI Agents
In ibis-server/app/model/data_source.py around lines 193 to 199, the code calls
.get_secret_value() on info.host/info.port/info.database but those fields are
now Optional (SecretStr | None) and can be None; update the branch that builds
connect_params to first validate that host, port, and database are not None,
raising a clear ValueError (or custom exception) if any are missing, then call
.get_secret_value() and convert port to int; ensure the error message names the
missing field(s) so callers know which parameter is absent.

Comment on lines +40 to 70
sql = f"""
SELECT
v.owner AS TABLE_CATALOG,
v.owner AS TABLE_SCHEMA,
v.view_name AS TABLE_NAME,
c.column_name AS COLUMN_NAME,
c.data_type AS DATA_TYPE,
c.nullable AS IS_NULLABLE,
c.column_id AS ORDINAL_POSITION,
tc.comments AS TABLE_COMMENT,
cc.comments AS COLUMN_COMMENT
FROM
all_views v
JOIN
all_tab_columns c
ON v.owner = c.owner
AND v.view_name = c.table_name
LEFT JOIN
all_tab_comments tc
ON tc.owner = v.owner
AND tc.table_name = v.view_name
LEFT JOIN
all_col_comments cc
ON cc.owner = c.owner
AND cc.table_name = c.table_name
AND cc.column_name = c.column_name
WHERE
v.owner = '{user}'
ORDER BY
v.view_name, c.column_id;
"""

⚠️ Potential issue | 🟠 Major

SQL injection risk with direct string interpolation.

The user variable is interpolated directly into the SQL query without sanitization. While this value comes from connection configuration, it still poses a SQL injection risk if the configuration is compromised or contains special characters.

Consider sanitizing the user value or using Oracle identifier quoting:

🔎 Proposed fix
         # Get dynamic user from connection info (not hardcoded 'SYSTEM')
-        user = self.connection_info.user.get_secret_value()
+        user = self.connection_info.user.get_secret_value().upper()
+        # Validate user contains only safe characters (alphanumeric, underscore)
+        import re
+        if not re.match(r'^[A-Z_][A-Z0-9_$#]*$', user):
+            raise ValueError(f"Invalid Oracle user identifier: {user}")

         sql = f"""

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In ibis-server/app/model/metadata/oracle.py around lines 40 to 70, the SQL
builds v.owner = '{user}' by direct string interpolation which creates a SQL
injection risk; change this to use a bound parameter or strict validation:
replace the inline '{user}' with a bind placeholder (e.g. :user) and pass the
connection bind params (user value) when executing the query, or if a bind is
not possible for this context, validate/normalize the user against a safe
pattern (e.g. /^[A-Z0-9_]+$/ and upper-case it) before interpolation to ensure
only valid Oracle identifier characters are used. Ensure any change keeps the
same behavior (owner comparison) and remove raw string interpolation.
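When a bind parameter is not available, the validation fallback above amounts to a small allow-list check; `safe_owner` is a hypothetical helper matching the proposed fix (Oracle unquoted identifiers are upper-cased and limited to letters, digits, and `_ $ #`):

```python
import re

ORACLE_IDENT_RE = re.compile(r"^[A-Z_][A-Z0-9_$#]*$")

def safe_owner(user: str) -> str:
    """Upper-case and validate an Oracle schema owner before interpolation."""
    owner = user.upper()
    if not ORACLE_IDENT_RE.match(owner):
        raise ValueError(f"Invalid Oracle user identifier: {user!r}")
    return owner

print(safe_owner("reportuser"))  # REPORTUSER
try:
    safe_owner("x'; DROP TABLE t; --")
except ValueError as exc:
    print(exc)  # rejected before it can reach the SQL string
```

Because the value is rejected before interpolation, only strings that are already valid Oracle identifiers can appear in the `v.owner = '...'` comparison.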

Comment on lines +127 to +263
def get_constraints(self) -> List[Constraint]:
"""
Auto-detect view-to-view relationships using column naming patterns.

TEMPORARILY DISABLED: Returning empty list to unblock development.
The column query against all_tab_columns appears to hang on large view sets.

TODO: Implement alternative approach:
- Use table metadata already retrieved in get_table_list()
- Cache column information to avoid repeated queries
- Or implement manual relationship definition UI

Returns:
Empty list (relationships disabled for now)
"""
return []

# ORIGINAL CODE COMMENTED OUT - WAS HANGING ON COLUMN QUERY
# constraints = []
# try:
# print(f"🔍 Starting constraint detection for user {user}...")
#
# # Step 1: Get list of views first (fast query)
# views_schema = ibis.schema({"VIEW_NAME": "string"})
# views_sql = f"""
# SELECT view_name AS VIEW_NAME
# FROM all_views
# WHERE owner = '{user}'
# """
# views_df = self.connection.sql(views_sql, schema=views_schema).to_pandas()
# view_list = views_df['VIEW_NAME'].tolist()
#
# print(f"🔍 Found {len(view_list)} views")
#
# if not view_list:
# return []
#
# # Step 2: Get columns for those views (single query with explicit list)
# columns_schema = ibis.schema({
# "TABLE_NAME": "string",
# "COLUMN_NAME": "string",
# "COLUMN_ID": "int64"
# })
#
# # Use IN clause with explicit list instead of subquery
# view_list_str = "', '".join(view_list)
# columns_sql = f"""
# SELECT
# table_name AS TABLE_NAME,
# column_name AS COLUMN_NAME,
# column_id AS COLUMN_ID
# FROM all_tab_columns
# WHERE owner = '{user}'
# AND table_name IN ('{view_list_str}')
# ORDER BY table_name, column_id
# """
#
# print(f"🔍 Querying columns for {len(view_list)} views...")
# columns_df = self.connection.sql(columns_sql, schema=columns_schema).to_pandas()
# print(f"🔍 Retrieved {len(columns_df)} total columns")
#
# if columns_df.empty:
# return []
#
# # Build index: view_name -> list of column names
# view_columns = {}
# for _, row in columns_df.iterrows():
# view = row['TABLE_NAME']
# col = row['COLUMN_NAME']
# if view not in view_columns:
# view_columns[view] = []
# view_columns[view].append(col)
#
# # Identify primary key columns per view
# # Pattern: column name ends with "Primary Key"
# # Build entity -> views map (which views have this entity as PK)
# entity_to_views = {} # entity name -> list of (view, pk_column)
# view_primary_keys = {} # view -> list of PK columns
#
# print(f"🔍 Analyzing {len(view_columns)} views for Primary Key columns...")
#
# for view, columns in view_columns.items():
# pks = [col for col in columns if col.endswith('Primary Key')]
# if pks:
# view_primary_keys[view] = pks
# # Map each entity to the views that have it as PK
# for pk in pks:
# entity = pk.replace(' Primary Key', '')
# if entity not in entity_to_views:
# entity_to_views[entity] = []
# entity_to_views[entity].append((view, pk))
#
# print(f"🔍 Found {len(view_primary_keys)} views with Primary Key columns")
# print(f"🔍 Identified {len(entity_to_views)} distinct entities: {list(entity_to_views.keys())[:10]}...")
#
# # Detect relationships by matching column names
# for source_view, source_cols in view_columns.items():
# # Find foreign key candidates (columns ending with "Primary Key")
# fk_candidates = [col for col in source_cols if col.endswith('Primary Key')]
#
# for fk_col in fk_candidates:
# # Extract entity name from FK column
# entity = fk_col.replace(' Primary Key', '')
#
# # Find views that have this entity as their primary key
# if entity in entity_to_views:
# for target_view, target_pk_col in entity_to_views[entity]:
# if source_view == target_view:
# # Skip self-references (e.g., Part Primary Key in RT Parts table itself)
# continue
#
# # Create relationship!
# constraint_name = f"FK_{source_view}_{target_view}_{entity}".replace(" ", "_")
#
# constraints.append(
# Constraint(
# constraintName=constraint_name[:128], # Oracle name limit
# constraintTable=self._format_compact_table_name(user, source_view),
# constraintColumn=fk_col,
# constraintedTable=self._format_compact_table_name(user, target_view),
# constraintedColumn=target_pk_col,
# constraintType=ConstraintType.FOREIGN_KEY,
# )
# )
#
# print(f"✅ Detected {len(constraints)} view relationships using column naming patterns")
# if constraints:
# print(f"📊 Sample relationships:")
# for c in constraints[:5]:
# print(f" - {c.constraintTable}.{c.constraintColumn} -> {c.constraintedTable}.{c.constraintedColumn}")
# return constraints
#
# except Exception as e:
# # If auto-detection fails, return empty list rather than breaking WrenAI
# # This ensures the system remains functional even with permission issues
# print(f"Warning: Could not auto-detect view relationships: {e}")
# return []

🛠️ Refactor suggestion | 🟠 Major

Remove large block of commented-out code.

The commented-out implementation spans ~100 lines and creates a maintenance burden. Consider removing it and tracking the follow-up work in a GitHub issue for proper visibility.

🔎 Proposed refactor
     def get_constraints(self) -> list[Constraint]:
         """
-        Auto-detect view-to-view relationships using column naming patterns.
-
-        TEMPORARILY DISABLED: Returning empty list to unblock development.
-        The column query against all_tab_columns appears to hang on large view sets.
-
-        TODO: Implement alternative approach:
-        - Use table metadata already retrieved in get_table_list()
-        - Cache column information to avoid repeated queries
-        - Or implement manual relationship definition UI
-
-        Returns:
-            Empty list (relationships disabled for now)
+        Return empty list - constraint auto-detection temporarily disabled.
+        
+        See: https://github.com/Canner/wren-engine/issues/XXX
         """
         return []
-
-        # ORIGINAL CODE COMMENTED OUT - WAS HANGING ON COLUMN QUERY
-        # ... (remove all ~100 lines of commented code)

Would you like me to open a GitHub issue to track the constraint auto-detection feature?

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In ibis-server/app/model/metadata/oracle.py around lines 127-263 there is a
~100-line commented-out implementation of get_constraints that should be removed
to reduce maintenance clutter; delete the entire commented block (everything
after the current early return) and replace it with a single-line TODO comment
that references a new GitHub issue tracking re-implementation (create the issue
first, then put its number or URL in the TODO), update the method docstring to
mention "see TODO/issue #<id>" for follow-up, and add a concise git commit
message like "chore: remove large commented-out get_constraints implementation;
track work in issue #<id>".

Comment on lines +87 to +93
print("📋 Test 4: Verify views-only (no tables)")
print("-" * 80)
# Check if any discovered objects are tables (they shouldn't be)
view_count = len(views)
print(f"✅ All {view_count} objects are views")
print(f" (No tables discovered - views-only architecture confirmed)")
print()
⚠️ Potential issue | 🟡 Minor

Test 4 doesn't actually verify views-only behavior.

The test claims to verify that "all objects are views (no tables)" but it only checks the count of views returned from get_table_list(). There's no actual validation that these are views and not tables.

🔎 Consider adding actual verification
     print("📋 Test 4: Verify views-only (no tables)")
     print("-" * 80)
-    # Check if any discovered objects are tables (they shouldn't be)
-    view_count = len(views)
-    print(f"✅ All {view_count} objects are views")
-    print(f"   (No tables discovered - views-only architecture confirmed)")
+    # Verify returned objects have view-like properties or query the database
+    # to confirm object types if metadata is available
+    view_count = len(views)
+    # TODO: Add actual verification that objects are views, not tables
+    print(f"✅ Discovered {view_count} objects (assuming views based on query source)")
+    print(f"   (Views-only query used in get_table_list)")
     print()
🤖 Prompt for AI Agents
In ibis-server/test_oracle_views.py around lines 87 to 93, the test only checks
the length of views but does not assert that each discovered object is actually
a VIEW rather than a TABLE; update the test to iterate the returned objects and
validate their type: either use the metadata returned by get_table_list (e.g., a
'type' or 'kind' field) and assert it equals 'VIEW' for every entry, or call the
DB inspection API/SQL (e.g., SQLAlchemy inspector.get_view_names() or querying
information_schema/tables WHERE table_type='VIEW' for Oracle-equivalent) and
fail the test if any table is found; adjust the printed messages to reflect the
actual verification and raise an assertion/error when a non-view is detected.
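A minimal sketch of the suggested assertion, assuming each entry returned by `get_table_list()` carries a type field (the `"type"` key and dict shape here are hypothetical — adapt them to the actual metadata structure):

```python
def assert_views_only(objects):
    """Raise AssertionError if any discovered object is not a VIEW.

    objects: iterable of dicts; the 'type' key is an assumed field name --
    substitute whatever field get_table_list() actually returns.
    """
    non_views = [o.get("name") for o in objects if o.get("type") != "VIEW"]
    assert not non_views, f"Non-view objects discovered: {non_views}"
    return len(objects)
```

With a check like this, Test 4 fails loudly when a table slips through instead of printing an unconditional ✅.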

Comment on lines +23 to +29
RUN \
apt update && \
apt -y install curl gpg lsb-release && \
curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg && \
echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list && \
apt update && \
apt -y install postgresql-client-13
⚠️ Potential issue | 🔴 Critical

Missing Oracle Instant Client installation for Oracle ADB integration.

This PR introduces Oracle 19c database support, but neither the builder nor runtime stages install the Oracle Instant Client libraries required for Oracle connectivity. The AI summary explicitly mentions "Docker environment updates with Oracle Instant Client" as a key feature. Without these libraries, the application cannot connect to Oracle databases.

🔎 Proposed fix to install Oracle Instant Client
 RUN \
     apt update && \
     apt -y install curl gpg lsb-release && \
     curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg && \
     echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list && \
     apt update && \
-    apt -y install postgresql-client-13
+    apt -y install postgresql-client-13 && \
+    apt -y install wget unzip libaio1 && \
+    wget https://download.oracle.com/otn_software/linux/instantclient/1923000/instantclient-basic-linux.x64-19.23.0.0.0dbru.zip -O /tmp/oracle.zip && \
+    unzip /tmp/oracle.zip -d /opt/oracle && \
+    rm /tmp/oracle.zip && \
+    sh -c "echo /opt/oracle/instantclient_19_23 > /etc/ld.so.conf.d/oracle-instantclient.conf" && \
+    ldconfig

Alternatively, you may need to set environment variables:

ENV LD_LIBRARY_PATH=/opt/oracle/instantclient_19_23:$LD_LIBRARY_PATH
ENV PATH=/opt/oracle/instantclient_19_23:$PATH
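Once the image is built, a quick runtime sanity check can confirm the Instant Client layout before the driver tries to load it. A sketch, assuming the `/opt/oracle/instantclient_19_23` path from the proposed fix:

```python
import os

def instant_client_problems(lib_dir="/opt/oracle/instantclient_19_23"):
    """Return a list of install problems (an empty list means the layout looks OK)."""
    problems = []
    if not os.path.isdir(lib_dir):
        problems.append(f"missing directory: {lib_dir}")
    elif not any(f.startswith("libclntsh.so") for f in os.listdir(lib_dir)):
        problems.append("libclntsh.so not found in lib_dir")
    # ldconfig registration (via /etc/ld.so.conf.d) also works; this only
    # checks the LD_LIBRARY_PATH variant suggested above.
    ld_path = os.environ.get("LD_LIBRARY_PATH", "")
    if lib_dir not in ld_path.split(":"):
        problems.append(f"{lib_dir} is not on LD_LIBRARY_PATH")
    return problems
```

Running this inside the container during a healthcheck (or at startup, before `oracledb` thick-mode initialization) surfaces a missing or unregistered client as a clear error rather than an opaque driver load failure.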

Comment on lines +7 to +14
COPY ../wren-core-legacy/.mvn .mvn
COPY ../wren-core-legacy/mvnw .
COPY ../wren-core-legacy/pom.xml .
COPY ../wren-core-legacy/wren-base ./wren-base
COPY ../wren-core-legacy/wren-main ./wren-main
COPY ../wren-core-legacy/wren-server ./wren-server
COPY ../wren-core-legacy/wren-tests ./wren-tests
COPY ../wren-core-legacy/trino-parser ./trino-parser
⚠️ Potential issue | 🔴 Critical

COPY paths assume incorrect parent directory context.

The COPY commands reference ../wren-core-legacy/, which assumes the Docker build context is set to the parent directory of wren-core-legacy. However, based on standard Docker practices and the location of this Dockerfile, the build context is likely set to the wren-core-legacy directory itself, which would cause all these COPY operations to fail.

🔎 Proposed fix to correct COPY paths
 # Copy Maven wrapper and pom files
-COPY ../wren-core-legacy/.mvn .mvn
-COPY ../wren-core-legacy/mvnw .
-COPY ../wren-core-legacy/pom.xml .
-COPY ../wren-core-legacy/wren-base ./wren-base
-COPY ../wren-core-legacy/wren-main ./wren-main
-COPY ../wren-core-legacy/wren-server ./wren-server
-COPY ../wren-core-legacy/wren-tests ./wren-tests
-COPY ../wren-core-legacy/trino-parser ./trino-parser
+COPY .mvn .mvn
+COPY mvnw .
+COPY pom.xml .
+COPY wren-base ./wren-base
+COPY wren-main ./wren-main
+COPY wren-server ./wren-server
+COPY wren-tests ./wren-tests
+COPY trino-parser ./trino-parser
🤖 Prompt for AI Agents
In wren-core-legacy/docker/Dockerfile.multistage around lines 7 to 14, the COPY
instructions use ../wren-core-legacy/* which assumes build context is the parent
directory; update each COPY to use paths relative to the current build context
(remove the leading ../wren-core-legacy/), e.g. COPY .mvn .mvn, COPY mvnw .,
COPY pom.xml ., and COPY the subdirectories directly (wren-base, wren-main,
wren-server, wren-tests, trino-parser) so the files are found during docker
build; if you intentionally build from the parent, alternatively adjust the
docker build context/command to point to the parent directory instead of
changing the Dockerfile.

Comment on lines +28 to +34
RUN \
apt update && \
apt -y install curl gpg lsb-release && \
curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg && \
echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list && \
apt update && \
apt -y install postgresql-client-13
⚠️ Potential issue | 🔴 Critical

Missing Oracle Instant Client installation for Oracle ADB integration.

This PR introduces Oracle 19c database support, but neither the builder nor runtime stages install the Oracle Instant Client libraries required for Oracle connectivity. The AI summary explicitly mentions "Docker environment updates with Oracle Instant Client" as a key feature. Without these libraries, the application cannot connect to Oracle databases.

🔎 Proposed fix to install Oracle Instant Client
 RUN \
     apt update && \
     apt -y install curl gpg lsb-release && \
     curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg && \
     echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list && \
     apt update && \
-    apt -y install postgresql-client-13
+    apt -y install postgresql-client-13 && \
+    apt -y install wget unzip libaio1 && \
+    wget https://download.oracle.com/otn_software/linux/instantclient/1923000/instantclient-basic-linux.x64-19.23.0.0.0dbru.zip -O /tmp/oracle.zip && \
+    unzip /tmp/oracle.zip -d /opt/oracle && \
+    rm /tmp/oracle.zip && \
+    sh -c "echo /opt/oracle/instantclient_19_23 > /etc/ld.so.conf.d/oracle-instantclient.conf" && \
+    ldconfig

Alternatively, you may need to set environment variables:

ENV LD_LIBRARY_PATH=/opt/oracle/instantclient_19_23:$LD_LIBRARY_PATH
ENV PATH=/opt/oracle/instantclient_19_23:$PATH
🤖 Prompt for AI Agents
In wren-core-legacy/docker/Dockerfile.multistage around lines 28-34, the
Dockerfile installs PostgreSQL client but omits Oracle Instant Client required
for Oracle ADB; add steps in both the builder and runtime stages to download and
install the appropriate Oracle Instant Client RPMs or ZIPs (e.g., basic and sdk
for 19c/19.23), install required dependencies (libaio, unzip or alien if
converting RPMs), extract or install them into /opt/oracle/instantclient_19_xx,
set ENV LD_LIBRARY_PATH and PATH to include that directory, and ensure file
permissions and symlinks (libclntsh.so) are correct so the app can load the
Oracle libraries at runtime.

# Copy the built jar from builder stage
COPY --from=builder /wren-server-executable.jar ./wren-server-executable.jar

COPY entrypoint.sh ./
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Find all entrypoint.sh files in the repository
echo "=== Finding entrypoint.sh files ==="
fd -t f "entrypoint.sh" wren-core-legacy/ 2>/dev/null || echo "No entrypoint.sh found in wren-core-legacy/"

# List docker files in wren-core-legacy/docker/
echo -e "\n=== Docker files in wren-core-legacy/docker/ ==="
fd -t f "Dockerfile" wren-core-legacy/docker/ 2>/dev/null || echo "No Dockerfile found"

# Check if Dockerfile.multistage exists
echo -e "\n=== Checking Dockerfile.multistage ==="
if [ -f "wren-core-legacy/docker/Dockerfile.multistage" ]; then
  echo "File exists"
  wc -l "wren-core-legacy/docker/Dockerfile.multistage"
else
  echo "File does not exist"
fi

# Check if Dockerfile.build exists
echo -e "\n=== Checking Dockerfile.build ==="
if [ -f "wren-core-legacy/docker/Dockerfile.build" ]; then
  echo "File exists"
  wc -l "wren-core-legacy/docker/Dockerfile.build"
else
  echo "File does not exist"
fi

Repository: Canner/wren-engine

Length of output: 499


🏁 Script executed:

#!/bin/bash

echo "=== Dockerfile.multistage (full content with line numbers) ==="
cat -n wren-core-legacy/docker/Dockerfile.multistage

echo -e "\n=== Dockerfile.build (full content with line numbers) ==="
cat -n wren-core-legacy/docker/Dockerfile.build

Repository: Canner/wren-engine

Length of output: 3752


🏁 Script executed:

#!/bin/bash

# Search for docker build commands or references to these Dockerfiles
echo "=== Searching for docker build commands ==="
fd -t f "\(Makefile\|\.yaml\|\.yml\|docker-compose\|\.sh\)" wren-core-legacy/ | head -20 | while read file; do
  if grep -l "Dockerfile.multistage\|Dockerfile.build\|docker build" "$file" 2>/dev/null; then
    echo "Found in: $file"
  fi
done

# Check root directory
echo -e "\n=== Searching in root directory ==="
fd -t f "\(Makefile\|docker-compose\|\.github\)" . -d 2 | while read file; do
  if grep -l "Dockerfile.multistage\|Dockerfile.build\|docker build" "$file" 2>/dev/null; then
    echo "Found in: $file"
  fi
done

# Check for GitHub Actions workflows
echo -e "\n=== GitHub Actions workflows ==="
fd -t f "\.yml\|\.yaml" .github/workflows/ 2>/dev/null | while read file; do
  if grep -l "docker\|Dockerfile" "$file" 2>/dev/null; then
    echo "Found in: $file"
  fi
done

Repository: Canner/wren-engine

Length of output: 176


Fix entrypoint script path in Dockerfile.multistage (line 39).

The source path entrypoint.sh is incorrect. Following the build-context pattern used by the earlier COPY commands (e.g., ../wren-core-legacy/.mvn), it should be ../wren-core-legacy/docker/entrypoint.sh. As written, the Docker build fails with "file not found" because the script lives in wren-core-legacy/docker/, not at the root of the build context.

🤖 Prompt for AI Agents
In wren-core-legacy/docker/Dockerfile.multistage around line 39, the COPY uses
an incorrect relative path "entrypoint.sh" which causes Docker build to fail
because the script lives in wren-core-legacy/docker; update the COPY source to
the correct path (e.g., ../wren-core-legacy/docker/entrypoint.sh or
wren-core-legacy/docker/entrypoint.sh to match the earlier COPY patterns) so the
file is present in the build context and copied into the image.

Labels: dependencies · documentation · ibis · python
