Skip to content

Conversation

@jdunkerley
Copy link
Member

Pull Request Description

  • Initial set up for DuckDB
  • Add connection and scaffolding
  • Using Postgres dialect.

Next PR will add a dialect and bring into unit tests.

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the
    Scala,
    Java,
    TypeScript,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • Unit tests have been written where possible.
  • If meaningful changes were made to logic or tests affecting Enso Cloud integration in the libraries,
    or the Snowflake database integration, a run of the Extra Tests has been scheduled.
    • If applicable, it is suggested to paste a link to a successful run of the Extra Tests.

@jdunkerley jdunkerley added the CI: No changelog needed Do not require a changelog entry for this PR. label Oct 6, 2025
@github-actions github-actions bot added -libs-API-change-Database Marks a PR that changes the public API of Standard.Database -libs-API-change-Microsoft Marks a PR that changes the public API of Standard.Microsoft labels Oct 6, 2025
@JaroslavTulach
Copy link
Member

  • why don't we create Standard.Duck instead of putting new 3rd party library into Standard.Database?
  • At least I believe it has been agreed that we want to move SQLite and Posgress out of Standard.Database eventually
  • why not use proper architecture since the beginning in the case of DuckDB?
  • or do we believe DuckDB is so essential it has to be in the core of all databases?

@jdunkerley
Copy link
Member Author

jdunkerley commented Oct 7, 2025

  • why don't we create Standard.Duck instead of putting new 3rd party library into Standard.Database?
  • At least I believe it has been agreed that we want to move SQLite and Posgress out of Standard.Database eventually
  • why not use proper architecture since the beginning in the case of DuckDB?
  • or do we believe DuckDB is so essential it has to be in the core of all databases?

At this point in testing it, DuckDB is expected to be very core to deliver some of the base functionality, like spatial and some reading and writing of files. If however it doesn't pan out to be useful, I will isolate to a separate library once have gone further along the path.

@JaroslavTulach
Copy link
Member

JaroslavTulach commented Oct 7, 2025

deliver some of the base functionality, like spatial

The fact that DuckDB is believed to be essential from product point of view doesn't mean we need to compromise the architecture. It'd be better if none of the core Table&Database concepts (like spatial) were tight to particular implementation.

@JaroslavTulach
Copy link
Member

JaroslavTulach commented Oct 7, 2025

  • why don't we create Standard.Duck instead of putting new 3rd party library into Standard.Database?

@jdunkerley
Copy link
Member Author

At least how I envision using DuckDB this will be in the NI space for a period of time.

I will put it as its own library and add to the standard list of import - assuming can do quickly as need to move on this.

@jdunkerley jdunkerley force-pushed the wip/jd/duckdb-create branch from 31f1fa4 to e8660f8 Compare October 7, 2025 12:24
@jdunkerley jdunkerley requested a review from Frizi as a code owner October 7, 2025 12:24
@JaroslavTulach JaroslavTulach added the CI: Clean build required CI runners will be cleaned before and after this PR is built. label Oct 7, 2025
@github-actions github-actions bot added the -libs-API-change-DuckDB Marks a PR that changes the public API of Standard.DuckDB label Oct 7, 2025
Copy link
Member

@Akirathan Akirathan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose adding DuckDB to NI is OK. I am curious if you will be able to pass the NI size check though.

.listFiles("*.jar")
.map(_.getAbsolutePath()) ++
`std-tableau-polyglot-root`
.listFiles("*.jar")
Copy link
Member

@JaroslavTulach JaroslavTulach Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I would really prefer if we could go without these three lines
  • it is an excellent opportunity to make dual JVM mode work smoothly
  • I promise to address any inquiries ASAP
  • btw. try the experience execute:
sbt:enso> runEngineDistribution --jvm 
  --run test/DuckDB_Tests 
  --vm.D=polyglot.enso.classLoading=Standard.DuckDB:guest,enso_dev.DuckDB_Test:guest,hosted

4 tests succeeded.
0 tests failed.
0 tests skipped.
0 groups skipped.
  • does it work? Yes, then let's try it. No, tell me what's failing!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test code:

from Standard.Base import all
from Standard.Table import all
from Standard.Database import all
from Standard.DuckDB import all

main =
    connection5 = Database.connect DuckDB.In_Memory
    table4 = connection5.execute_query 'SELECT * FROM "/mnt/c/Users/jdunk/Downloads/flights-1m.parquet"' ..All_Rows write_operation=False
    any10 = table4.aggregate columns=[..Count, ..Maximum 'ARR_DELAY']
    any10.display

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"libduckdb_java.so_osx_universal" -> PolyglotLib(MacOSAMD64),
"libduckdb_java.so_windows_amd64" -> PolyglotLib(WindowsAMD64),
"META-INF/**" -> CopyToOutputJar,
"org/**/*.class" -> CopyToOutputJar
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am getting:

sbt:enso> runEngineDistribution --jvm --run test.enso

Execution finished with an error: Can't load library: /enso/build/debug/libduckdb_java.so_linux_amd64
        at <java> java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2249)
        at <java> java.base/java.lang.Runtime.load0(Runtime.java:767)
        at <java> java.base/java.lang.System.load(System.java:1624)
        at <java> org.duckdb.DuckDBNative.<clinit>(DuckDBNative.java:52)
        at <java> org.duckdb.DuckDBConnection.newConnection(DuckDBConnection.java:62)
        at <java> org.duckdb.DuckDBDriver.connect(DuckDBDriver.java:114)
        at <java> java.sql/java.sql.DriverManager.getConnection(DriverManager.java:613)
        at <java> java.sql/java.sql.DriverManager.getConnection(DriverManager.java:160)
        at <java> org.enso.base.enso_cloud.EnsoSecretHelper.getJDBCConnection(EnsoSecretHelper.java:51)

where the test.enso. We need to patch DuckDBNative.java (upstream as well, please) to use System.loadLibrary...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, CI reports the same (it is masked by ClassNotFoundException)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*.pdb
*.so
*.jar
*.so_linux_amd64
Copy link
Member

@JaroslavTulach JaroslavTulach Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • This is wrong idea.
  • We need .so extension otherwise System.loadLibrary("duckdb") will not find it
  • but we need to modify DuckDBNative
  • to call System.loadLibrary first before trying the equilibristic with extracting to File.createTemp

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic in DuckDBNative appears to have been added to support local dev. I agree that logic is rather flawed if one wants to use the jar as a 3rd party dependency.

jdbc_connection = JDBC_Connection.from_java sql_exception=SQLException java_connection

## For initial work using DuckDB, we will use the Postgres dialect.
DuckDB_Connection.Value (Connection.new jdbc_connection Dialect.postgres DuckDB_Entity_Naming_Properties.new) make_new
Copy link
Member

@JaroslavTulach JaroslavTulach Oct 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • using Dialect.postgres doesn't allow us to use the same trick as Standard.Microsoft does
  • there is SQLServer_Type_Mapping.column_fetcher_factory which loads a class from other JVM
  • if we did the same for DuckDB, then following worked:
sbt:enso> runEngineDistribution --jvm --run test.enso --vm.D=polyglot.enso.classLoading=Standard.DuckDB:guest,hosted
Execution finished with an error: Cannot convert '2006-01-01'(language: Enso Polyglot Bridge, type: java.sql.Date) to Java type 'java.sql.Date': Unsupported target type.
        at <java> com.oracle.truffle.polyglot.PolyglotObjectProxyHandler.invoke(PolyglotObjectProxyHandler.java:112)
        at <java> jdk.proxy3/jdk.proxy3.$Proxy46.getDate(Unknown Source)
        at <java> org.enso.database.JDBCUtils.getLocalDate(JDBCUtils.java:35)
        at <java> org.enso.database.fetchers.ColumnFetcherFactory$DefaultColumnFetcherFactory$5.getValue(ColumnFetcherFactory.java:72)
        at <java> org.enso.database.fetchers.BaseColumnFetcher.append(BaseColumnFetcher.java:51)
        at <java> org.enso.database.fetchers.ColumnFetcher.readResultSet(ColumnFetcher.java:51)
        at <enso> Result_Set.result_set_to_table.aggregator<arg-2>(Result_Set.enso:38:13-126)
        at <enso> Result_Set.result_set_to_table.aggregator(Result_Set.enso:37-38)
        at <enso> Java_Problems.with_problem_aggregator(Java_Problems.enso:116:14-25)
        at <enso> Result_Set.result_set_to_table(Result_Set.enso:35-39)
        at <enso> case_branch.statement_setter(Connection.enso:404:25-143)
  • where test.enso is exactly the snippet being executed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be working on a new dialect for DuckDB next

@jdunkerley jdunkerley force-pushed the wip/jd/duckdb-create branch from 2452d0e to f90b56e Compare October 9, 2025 15:39
@enso-bot enso-bot bot mentioned this pull request Oct 10, 2025
4 tasks
DuckDBNative needed to be patched so that it can be loaded as a shared
library by name in Native Image.
Additionally, it looks like `duckdb_java` didn't follow any standards
when it comes to naming the library name:
- always prefixed with `lib` (not acceptable on Windows)
- suffix includes OS and arch (should be one of `.so`, `.dll` or
  `.dylib`)
static {
var libName = "duckdb_java";
try {
System.loadLibrary(libName);
Copy link
Member

@JaroslavTulach JaroslavTulach Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd try to push this System.loadLibrary("duckdb_java") upstream to see how open the project is to 3rd party contributions.

@jdunkerley jdunkerley added the CI: Ready to merge This PR is eligible for automatic merge label Oct 12, 2025
@mergify mergify bot merged commit a20924d into develop Oct 12, 2025
81 checks passed
@mergify mergify bot deleted the wip/jd/duckdb-create branch October 12, 2025 16:07
@unfurl-links unfurl-links bot mentioned this pull request Oct 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

-libs-API-change-Database Marks a PR that changes the public API of Standard.Database -libs-API-change-DuckDB Marks a PR that changes the public API of Standard.DuckDB -libs-API-change-Microsoft Marks a PR that changes the public API of Standard.Microsoft CI: Clean build required CI runners will be cleaned before and after this PR is built. CI: No changelog needed Do not require a changelog entry for this PR. CI: Ready to merge This PR is eligible for automatic merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants