@amotl
Member

@amotl amotl commented Sep 15, 2025

About

Continue adding tutorials from the community forum.

Preview

References

@amotl amotl added the reorganize Moving content around, inside and between other systems. label Sep 15, 2025
@coderabbitai

coderabbitai bot commented Sep 15, 2025

Walkthrough

Adds Trino integration documentation: updates docs/integrate/index.md to list Trino, adds docs/integrate/trino/index.md overview and hidden toctree, introduces docs/integrate/trino/tutorial.md with PostgreSQL-connector configuration, CLI usage, and caveats. Also updates cross-reference targets in docs/feature/query/index.md.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Integrations index updates (docs/integrate/index.md) | Inserts trino/index into two toctrees so Trino appears in the integrations list. |
| New Trino integration overview (docs/integrate/trino/index.md) | Adds Trino landing page with logo, About and Learn sections, a link card to the tutorial, and a hidden toctree referencing the tutorial. |
| Trino tutorial (docs/integrate/trino/tutorial.md) | Adds "Connecting to CrateDB in Trino" tutorial: PostgreSQL-connector postgresql.properties example, CLI usage instructions, and caveats on quoting, schema addressing, data types, and pushdown/compatibility. |
| Query docs cross-reference fixes (docs/feature/query/index.md) | Updates visible labels and rewrites local anchors to project-prefixed cross-reference targets in duplicated blocks (label and xref changes only). |

Sequence Diagram(s)

sequenceDiagram
    rect rgb(230,240,255)
    participant User as Developer (CLI)
    participant Trino as Trino Coordinator
    participant Connector as PostgreSQL Connector
    participant Crate as CrateDB
    end

    User->>Trino: trino-cli -> execute query
    Trino->>Connector: route query via catalog (postgresql.properties)
    Connector->>Crate: SQL over PostgreSQL protocol (no DB name)
    Crate-->>Connector: results (note: object quoting/type mappings)
    Connector-->>Trino: results (may limit pushdown)
    Trino-->>User: query results (with possible performance caveats)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

new content, guidance

Suggested reviewers

  • hammerhead
  • kneth
  • surister

Poem

A rabbit hopped through docs so bright,
Planted Trino by the lantern light.
Connector seeds in tidy rows,
CLI carrots and config prose.
Hop — the integration grows! 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The title "Trino: Index page and starter tutorial" succinctly and accurately summarizes the main changes in this PR, which add a Trino integration index page and a starter tutorial (docs/integrate/trino/index.md and docs/integrate/trino/tutorial.md); it is concise, specific, and relevant to the changeset, and minor unrelated doc label edits do not undermine the title's accuracy. |
| Description Check | ✅ Passed | The PR description states the intent to continue adding community tutorials, provides a preview link, and references the related issue (GH-102), which directly corresponds to the added Trino docs and tutorial, so it is on-topic and acceptable under the lenient description criteria. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4e1a5d8 and 60f22fc.

📒 Files selected for processing (4)
  • docs/feature/query/index.md (7 hunks)
  • docs/integrate/index.md (1 hunks)
  • docs/integrate/trino/index.md (1 hunks)
  • docs/integrate/trino/tutorial.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • docs/integrate/trino/tutorial.md
  • docs/integrate/trino/index.md
  • docs/integrate/index.md
  • docs/feature/query/index.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs


@amotl
Member Author

amotl commented Sep 15, 2025

@coderabbitai review and please refactor to active voice.

@coderabbitai

coderabbitai bot commented Sep 15, 2025

@amotl I'll review the PR and help refactor the content to use active voice as requested. Let me start by reviewing the changes.


🧠 Learnings used
Learnt from: amotl
PR: crate/cratedb-guide#0
File: :0-0
Timestamp: 2025-09-15T17:42:25.449Z
Learning: When reviewing documentation, use active voice instead of first-person or passive voice, and emit corresponding suggestions.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@amotl amotl marked this pull request as ready for review September 16, 2025 14:46

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (8)
docs/integrate/trino/tutorial.md (6)

4-4: Refactor to active voice and consistent second person.

Tighten phrasing and avoid first-person singular. Suggested edits:

- [Trino](https://trino.io/) (formerly known as Presto SQL) is a distributed query engine, that allows running analytical queries across different data sources via SQL. One of those data sources can be CrateDB and this article is going to look at how to configure the connection.
+ [Trino](https://trino.io/) is a distributed SQL query engine. This tutorial shows how to configure Trino to connect to CrateDB.
- We assume a Trino client/server installation is already in place as per [Trino’s installation instructions](https://trino.io/docs/current/installation.html).
+ Assume you have a Trino client/server installation as per the [installation instructions](https://trino.io/docs/current/installation.html).
- For this post, I installed Trino on macOS using Homebrew with `brew install trino` and my installation directory is `/usr/local/Cellar/trino/375`. Depending on your installation method, there might be different ways to start the Trino server. For the sake of this post, I start it in my console from the installation directory with the command `./bin/trino-server run`. Your preferred way of starting might differ.
+ For example, on macOS you can `brew install trino`. Start the server with `trino-server run` from your installation’s `bin` directory. Depending on your installation, the command and paths may differ.
- Due to CrateDB’s PostgreSQL protocol compatibility, we can make use of Trino’s [PostgreSQL connector](https://trino.io/docs/current/connector/postgresql.html). Create a new file `/usr/local/Cellar/trino/375/libexec/etc/catalog/postgresql.properties` to configure the connection:
+ Because CrateDB speaks the PostgreSQL wire protocol, use Trino’s [PostgreSQL connector](https://trino.io/docs/current/connector/postgresql.html). Create a catalog properties file to configure the connection:
- Please replace the placeholders for the CrateDB hostname, username, and password to match your setup. Besides the connection details, the configuration has two particularities:
+ Replace the placeholders for the CrateDB hostname, username, and password. Besides the connection details, note two specifics:
- Once the PostgreSQL connector is configured, we can connect to the Trino server using its CLI:
+ After configuring the connector, connect to the Trino server using its CLI:
- A `SHOW TABLES` query should successfully list all existing tables in the specified CrateDB schema and you can proceed with querying them.
+ Run `SHOW TABLES` to list all tables in the specified CrateDB schema, then query them.
- As CrateDB differs in some aspects from PostgreSQL, there are a few particularities to consider for your queries:
+ Because CrateDB differs in some aspects from PostgreSQL, consider the following nuances when writing queries:
- With a few parameter tweaks, Trino can successfully connect to CrateDB. The information presented in this post is the result of a short compatibility test and is likely not exhaustive. If you use Trino with CrateDB and are aware of any additional aspects, please let us know!
+ With a few parameter tweaks, Trino connects to CrateDB. This guide reflects a short compatibility test and is not exhaustive. If you discover additional aspects, please let us know.

Also applies to: 8-8, 10-10, 14-14, 24-24, 30-30, 40-40, 42-42, 54-54


16-22: Add a language to the fenced code block (fixes MD040).

Use INI/properties highlighting for the catalog config:

-```
+```ini
 connector.name=postgresql
 connection-url=jdbc:postgresql://<CrateDB hostname>:5432/
 connection-user=<CrateDB username>
 connection-password=<CrateDB password>
 insert.non-transactional-insert.enabled=true


14-22: Prefer stable config paths over versioned Homebrew Cellar paths.

Point to TRINO_HOME or etc paths that survive upgrades:

-Due to CrateDB’s PostgreSQL protocol compatibility, we can make use of Trino’s [PostgreSQL connector](https://trino.io/docs/current/connector/postgresql.html). Create a new file `/usr/local/Cellar/trino/375/libexec/etc/catalog/postgresql.properties` to configure the connection:
+Because CrateDB speaks the PostgreSQL wire protocol, use Trino’s [PostgreSQL connector](https://trino.io/docs/current/connector/postgresql.html). Create a catalog file, for example:
+
+- macOS (Homebrew): `/usr/local/etc/trino/catalog/postgresql.properties` (or `/opt/homebrew/etc/trino/catalog/...` on Apple Silicon)
+- Linux (tarball/systemd): `$TRINO_HOME/etc/catalog/postgresql.properties` or `/etc/trino/catalog/postgresql.properties`

Would you like me to adjust the rest of the doc to reference these stable paths consistently?
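
The path suggestions above could be checked with a small helper along these lines. This is only a sketch: the candidate directories are the ones named in the comment, the `TRINO_HOME` environment variable is taken from the same comment, and the function names `candidate_catalog_dirs` and `first_existing` are hypothetical.

```python
import os
from pathlib import Path


def candidate_catalog_dirs(env=None):
    """List catalog directories to probe, most specific first."""
    env = os.environ if env is None else env
    candidates = []
    trino_home = env.get("TRINO_HOME")
    if trino_home:
        # Tarball-style installation rooted at $TRINO_HOME.
        candidates.append(Path(trino_home) / "etc" / "catalog")
    candidates += [
        Path("/usr/local/etc/trino/catalog"),     # macOS (Homebrew)
        Path("/opt/homebrew/etc/trino/catalog"),  # macOS (Homebrew, Apple Silicon)
        Path("/etc/trino/catalog"),               # Linux system-wide
    ]
    return candidates


def first_existing(paths):
    """Return the first path that is an existing directory, or None."""
    for path in paths:
        if path.is_dir():
            return path
    return None


print(first_existing(candidate_catalog_dirs()))
```

Which directory a given installation actually reads depends on how Trino was packaged, so treat the result as a hint for where to place `postgresql.properties`, not as a guarantee.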


26-29: Clarify the two specifics with tighter phrasing.

Minor wording to improve scanability:

-* No database name: With PostgreSQL, a JDBC connection URL usually ends with a database name. We intentionally omit the database name when connecting to CrateDB for compatibility reasons.
-CrateDB consists of a single database with multiple schemas, hence we do not specify a database name in the `connection-url`. If a database name is specified, you will run into an error message on certain operations (`ERROR: Table with more than 2 QualifiedName parts is not supported. Only <schema>.<tableName> works`).
-* Disabling transactions: Being a database with eventual consistency, CrateDB doesn’t support transactions. By default, the PostgreSQL connector will wrap `INSERT` queries into transactions and attempt to create a temporary table. We disable this behavior with the `insert.non-transactional-insert.enabled` parameter.
+* No database name: CrateDB provides a single database with multiple schemas, so omit the database name in `connection-url`. Specifying a database triggers errors for operations that include `catalog.schema.table` (e.g., `ERROR: Table with more than 2 QualifiedName parts is not supported. Only <schema>.<tableName> works`).
+* Non‑transactional inserts: CrateDB doesn’t support transactions. By default, the PostgreSQL connector wraps `INSERT` statements in a transaction and uses a temporary table. Disable this with `insert.non-transactional-insert.enabled=true`.
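
To make the two specifics concrete, here is a minimal sketch that renders the catalog file from connection parameters. The function name `render_catalog_properties` is hypothetical, and the defaults (`127.0.0.1`, user `crate`, empty password) are the local-instance defaults mentioned elsewhere in this review, not values the tutorial mandates.

```python
def render_catalog_properties(host="127.0.0.1", port=5432, user="crate", password=""):
    """Render a Trino PostgreSQL-connector catalog file for CrateDB."""
    lines = [
        "connector.name=postgresql",
        # The URL ends right after the slash: no database name is appended.
        f"connection-url=jdbc:postgresql://{host}:{port}/",
        f"connection-user={user}",
        f"connection-password={password}",
        # CrateDB does not support transactions, so use non-transactional inserts.
        "insert.non-transactional-insert.enabled=true",
    ]
    return "\n".join(lines) + "\n"


print(render_catalog_properties(), end="")
```

Keeping the trailing slash and the insert flag inside the helper means a caller cannot accidentally drop either of them when changing host or credentials.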

45-45: Offer a practical workaround for catalog.schema.table on INSERT.

Until crate/crate#12658 is resolved, call out two options:

  • Run INSERTs directly against CrateDB (psql/PgJDBC) outside Trino.
  • Or create a view in CrateDB that Trino can target via two-part names and document that DML via Trino may be limited.

Add a short note after this bullet.


50-50: Set expectations on performance with a version stamp.

Add a sentence like “Tested with Trino X.Y and CrateDB Z.W” to time‑box the pushdown behavior, which changes across releases.

docs/integrate/trino/index.md (2)

4-6: Avoid hotlinking external images; vendor the logo.

Hotlinking the logo from Wikimedia introduces an external runtime dependency and potential license/availability issues. Store the asset locally (e.g., _static/img/trino-logo.png) and reference it.

I can add the asset and update references if you confirm the preferred path.


14-16: Tighten the “About” copy.

Slightly reduce marketing tone and keep active voice:

-[Trino] is a fast distributed SQL query engine for big data analytics
-that helps you explore your data universe.
+[Trino] is a fast, distributed SQL query engine for analytics.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1f6d85e and d441a7e.

📒 Files selected for processing (3)
  • docs/integrate/index.md (1 hunks)
  • docs/integrate/trino/index.md (1 hunks)
  • docs/integrate/trino/tutorial.md (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-14T19:02:43.308Z
Learnt from: amotl
PR: crate/cratedb-guide#253
File: docs/integrate/mindsdb/index.md:21-33
Timestamp: 2025-08-14T19:02:43.308Z
Learning: In CrateDB integration documentation examples, default connection parameters (user="crate", password="", host="127.0.0.1") are intentionally used to provide working out-of-the-box examples for users with local CrateDB instances, rather than using placeholder values that require customization.

Applied to files:

  • docs/integrate/trino/tutorial.md
📚 Learning: 2025-08-15T18:56:11.508Z
Learnt from: amotl
PR: crate/cratedb-guide#255
File: docs/integrate/influxdb/tutorial.md:120-123
Timestamp: 2025-08-15T18:56:11.508Z
Learning: In CrateDB integration tutorials, when using `ctk load table` with `--cluster-url="crate://userhost:port/schema/table"`, the verification query should reference the target schema.table combination specified in the cluster URL, not derive it from the source system's naming. For example, if loading from InfluxDB bucket/measurement `testdrive/demo` to CrateDB `doc/testdrive`, the verification query should be `SELECT * FROM doc.testdrive`.

Applied to files:

  • docs/integrate/trino/tutorial.md
🪛 markdownlint-cli2 (0.17.2)
docs/integrate/trino/tutorial.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (3)
docs/integrate/trino/tutorial.md (1)

16-22: Use working defaults for out‑of‑the‑box local setups.

Per prior guidance, prefer defaults over placeholders to let readers copy/paste and connect to a local CrateDB:

-```ini
-connector.name=postgresql
-connection-url=jdbc:postgresql://<CrateDB hostname>:5432/
-connection-user=<CrateDB username>
-connection-password=<CrateDB password>
-insert.non-transactional-insert.enabled=true
-```
+```ini
+connector.name=postgresql
+connection-url=jdbc:postgresql://127.0.0.1:5432/
+connection-user=crate
+connection-password=
+insert.non-transactional-insert.enabled=true
+```

This aligns with the “working defaults” preference you set earlier. If you prefer, we can keep both: a default block plus a placeholders block.

docs/integrate/index.md (1)

77-77: LGTM: ToC entry added for Trino.

Entry placement matches surrounding style. No action needed.

docs/integrate/trino/index.md (1)

20-28: Cross‑reference works as intended.

grid-item-card correctly links to trino-tutorial via :link-type: ref.

Please confirm the Sphinx extensions (e.g., sphinx-design) are enabled in this project build, as required by grid/grid-item-card.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/feature/query/index.md (1)

157-158: Use active‑voice link text (keep targets unchanged).

Suggest making the visible link text active while still referencing the same targets.

- - [Analyzing Device Readings with Metadata Integration]
- - [Time Series: Analyzing Weather Data]
+ - [Analyze Device Readings with Metadata Integration][Analyzing Device Readings with Metadata Integration]
+ - [Analyze Weather Data (Time Series)][Time Series: Analyzing Weather Data]
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d441a7e and 4e1a5d8.

📒 Files selected for processing (1)
  • docs/feature/query/index.md (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (2)
docs/feature/query/index.md (2)

241-242: Same verification for these targets.

Ensure the “project:” links for weather analysis and UNNEST resolve (see script above).


231-233: Confirm "project:" cross‑ref targets exist and Sphinx resolves them

  • Cross‑refs to check: docs/feature/query/index.md (lines 231–233).
  • Found labels:
    • (timeseries-with-metadata) → docs/topic/timeseries/learn/with-metadata.md:2
    • (timeseries-analysis-weather) → docs/topic/timeseries/learn/query.md:2
  • Not found in repo: inserts_bulk_operations, inserts_unnest — add those labels at the intended targets or update the refs.
  • I could not run a Sphinx build in the sandbox (sphinx-build/tee: command not found). Run locally to confirm: sphinx-build -nW -b html docs/ _build/html and fix any unresolved‑ref warnings.
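
Where a full Sphinx build is not available, a rough pre-check for missing labels might look like the sketch below. It assumes labels are defined in MyST Markdown as `(label-name)=` on a line of their own and only scans `.md` files. The function names are hypothetical, and the demo runs against a throwaway directory, not the real repository.

```python
import re
import tempfile
from pathlib import Path

# Assumption: labels use the MyST form "(label-name)=" on a line of their own.
LABEL_RE = re.compile(r"^\(([^()\s]+)\)=[ \t]*$", re.MULTILINE)


def find_labels(docs_root):
    """Map each label name to the (file, line number) where it is first defined."""
    labels = {}
    for path in sorted(Path(docs_root).rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for match in LABEL_RE.finditer(text):
            line_no = text.count("\n", 0, match.start()) + 1
            labels.setdefault(match.group(1), (path, line_no))
    return labels


def missing_labels(docs_root, wanted):
    """Return the wanted labels that no Markdown file under docs_root defines."""
    found = find_labels(docs_root)
    return [name for name in wanted if name not in found]


def demo():
    """Run the check against a throwaway tree instead of a real checkout."""
    with tempfile.TemporaryDirectory() as tmp:
        page = Path(tmp) / "with-metadata.md"
        page.write_text("# Title\n(timeseries-with-metadata)=\n", encoding="utf-8")
        wanted = ["timeseries-with-metadata", "inserts_bulk_operations", "inserts_unnest"]
        return missing_labels(tmp, wanted)


print(demo())  # ['inserts_bulk_operations', 'inserts_unnest']
```

This does not replace `sphinx-build -nW`, which also resolves cross-project references; it only narrows down which labels to look at first.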

@amotl amotl requested review from hammerhead, kneth and surister and removed request for kneth and surister September 16, 2025 15:40
Member

@kneth kneth left a comment

I have only minor, non-blocking comments.


## Connector configuration

Because CrateDB speaks the PostgreSQL wire protocol, use Trino’s [PostgreSQL connector](https://trino.io/docs/current/connector/postgresql.html). Create a catalog properties file to configure the connection:
Member

Suggested change
- Because CrateDB speaks the PostgreSQL wire protocol, use Trino’s [PostgreSQL connector](https://trino.io/docs/current/connector/postgresql.html). Create a catalog properties file to configure the connection:
+ Because CrateDB speaks the PostgreSQL wire protocol, you can use Trino’s [PostgreSQL connector](https://trino.io/docs/current/connector/postgresql.html). Create a catalog properties file to configure the connection:

Member Author

Thanks. Fixed with 60f22fc.

- macOS (Homebrew): `/usr/local/etc/trino/catalog/postgresql.properties` (or `/opt/homebrew/etc/trino/catalog/...` on Apple Silicon)
- Linux (tarball/systemd): `$TRINO_HOME/etc/catalog/postgresql.properties` or `/etc/trino/catalog/postgresql.properties`

* Querying `OBJECT` columns: Columns of the data type `OBJECT` can usually be queried using the bracket notation, e.g. `SELECT my_object_column['my_object_key'] FROM my_table`. In Trino’s SQL dialect, the identifier needs to be wrapped in double quotes, such as `SELECT "my_object_column['my_object_key']" FROM my_table`.
Member

Suggested change
- * Querying `OBJECT` columns: Columns of the data type `OBJECT` can usually be queried using the bracket notation, e.g. `SELECT my_object_column['my_object_key'] FROM my_table`. In Trino’s SQL dialect, the identifier needs to be wrapped in double quotes, such as `SELECT "my_object_column['my_object_key']" FROM my_table`.
+ * Querying `OBJECT` columns: Columns of the data type `OBJECT` can usually be queried using the bracket notation e.g., `SELECT my_object_column['my_object_key'] FROM my_table`. In Trino’s SQL dialect, the identifier needs to be wrapped in double quotes, such as `SELECT "my_object_column['my_object_key']" FROM my_table`.

Member Author

Thanks. Fixed with 60f22fc.
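
The quoting rule for `OBJECT` columns discussed in this thread can be illustrated with a small helper. This is a sketch only: `trino_object_subscript` is a hypothetical name, it assumes the key contains no single quotes, and it merely builds the query string without contacting Trino or CrateDB.

```python
def trino_object_subscript(column: str, key: str) -> str:
    """Build the double-quoted identifier Trino needs for a CrateDB OBJECT key."""
    inner = f"{column}['{key}']"
    # Double any embedded double quotes, as SQL quoted identifiers require.
    return '"' + inner.replace('"', '""') + '"'


query = f"SELECT {trino_object_subscript('my_object_column', 'my_object_key')} FROM my_table"
print(query)  # SELECT "my_object_column['my_object_key']" FROM my_table
```

The printed statement matches the double-quoted form the tutorial bullet gives for Trino's SQL dialect.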

@amotl amotl merged commit 4885d6c into main Sep 17, 2025
3 checks passed
@amotl amotl deleted the trino branch September 17, 2025 14:31