docs: document crdb integration #30873

Open
wants to merge 3 commits into
base: main
38 changes: 19 additions & 19 deletions doc/user/config.toml
@@ -8,7 +8,7 @@ disableKinds = ['taxonomy']
[params]
repo = "//github.com/MaterializeInc/materialize"
bannerMessage = "Need to run Materialize in your own private or public cloud? Get early access to **self-managed Materialize**!"
bannerLink="https://materialize.com/blog/self-managed/?utm_medium=docs-banner&utm_source=documentation&utm_campaign=FY25_Blogs&sf_campaign=701TR00000NoNFXYA3&utm_term=&utm_content="
bannerLink = "https://materialize.com/blog/self-managed/?utm_medium=docs-banner&utm_source=documentation&utm_campaign=FY25_Blogs&sf_campaign=701TR00000NoNFXYA3&utm_term=&utm_content="

[frontmatter]
publishDate = ["publishDate"]
@@ -20,7 +20,7 @@ publishDate = ["publishDate"]
[[menu.main]]
identifier = "get-started"
name = "Get started"
weight= 5
weight = 5

#
# Connect sources
@@ -29,7 +29,7 @@ weight= 5
[[menu.main]]
identifier = "ingest-data"
name = "Ingest data"
weight= 11
weight = 11

[[menu.main]]
name = "SQL Server"
@@ -53,7 +53,7 @@ weight = 30
identifier = "network-security"
name = "Network security"
parent = 'ingest-data'
weight= 35
weight = 35

#
# Transform data
@@ -62,7 +62,7 @@ weight= 35
[[menu.main]]
identifier = "transform-data"
name = "Transform data"
weight= 12
weight = 12

#
# Serve results
@@ -71,56 +71,56 @@ weight= 12
[[menu.main]]
identifier = "serve"
name = "Serve results"
weight= 13
weight = 13

[[menu.main]]
name = "Query using `SELECT`"
parent = "serve"
url = "/sql/select/"
weight= 5
weight = 5

[[menu.main]]
name = "Query using external tools"
identifier = "bi-tools"
parent = "serve"
weight= 10
weight = 10

[[menu.main]]
name = "Subscribe to results (`SUBSCRIBE`)"
parent = "serve"
url = "/sql/subscribe/"
weight= 15
weight = 15

[[menu.main]]
name = "Sink results to external tools"
identifier = "sink"
parent = "serve"
weight= 20
weight = 20

[[menu.main]]
name = "Kafka"
parent = "sink"
url = "/sql/create-sink"
weight= 20
weight = 20

[[menu.main]]
name = "Redpanda"
parent = "sink"
url = "/sql/create-sink"
weight= 30
weight = 30
#
# Manage Materialize
#

[[menu.main]]
identifier = "manage"
name = "Manage Materialize"
weight= 14
weight = 14

[[menu.main]]
name = "Monitoring"
identifier = "monitor"
weight= 5
weight = 5
parent = "manage"

#
@@ -130,15 +130,15 @@ parent = "manage"
[[menu.main]]
identifier = "reference"
name = "Reference"
weight= 15
weight = 15


[[menu.main]]
identifier = "cs_redpanda"
name = "Redpanda"
parent = "create-source"
url = "/sql/create-source/kafka"
weight= 10
weight = 10

[[menu.main]]
identifier = "csink_redpanda"
@@ -169,7 +169,7 @@ url = '/sql/create-sink'
[[menu.main]]
identifier = "integrations"
name = "Tools and integrations"
weight= 25
weight = 25

[[menu.main]]
identifier = "cli-reference"
@@ -204,13 +204,13 @@ weight = 70
name = "Security overview"
parent = "about"
url = "https://materialize.com/security-overview"
weight= 25
weight = 25

[[menu.main]]
name = "Responsible disclosure policy"
parent = "about"
url = "https://materialize.com/securitydisclosure"
weight= 30
weight = 30

[markup.goldmark.renderer]
# allow <a name="link-target">, the old syntax no longer works
29 changes: 29 additions & 0 deletions doc/user/content/ingest-data/cockroachdb/_index.md
@@ -0,0 +1,29 @@
---
title: "CockroachDB"
description: "Connecting Materialize to a CockroachDB source."
disable_list: true
menu:
  main:
    parent: "ingest-data"
    name: "CockroachDB"
    identifier: "crdb"
    weight: 16
---

## Change Data Capture (CDC)

Materialize supports CockroachDB as a real-time data source. CockroachDB
[Changefeeds](https://www.cockroachlabs.com/docs/stable/change-data-capture-overview)
publish change events to Kafka, and Materialize consumes those events to create
and efficiently maintain real-time views on top of CDC data.

{{< tip >}}
{{< guided-tour-blurb-for-ingest-data >}}
{{< /tip >}}

## Integration guides

| Integration guides |
| ------------------------------------------- |
| <ul><li>[Kafka + Changefeeds](/ingest-data/cockroachdb/kafka-changefeeds)</li></ul> |
75 changes: 75 additions & 0 deletions doc/user/content/ingest-data/cockroachdb/kafka-changefeeds.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
---
title: "CockroachDB CDC using Kafka and Changefeeds"
description: "How to propagate Change Data Capture (CDC) data from a CockroachDB database to Materialize"
menu:
  main:
    parent: "crdb"
    name: "Using Kafka and Changefeeds"
    identifier: "crdb-kafka-changefeeds"
    weight: 5
---

Change Data Capture (CDC) allows you to track and propagate changes in a CockroachDB database to downstream consumers.
In this guide, we will cover how to use Materialize to create and efficiently maintain real-time views on top of CDC data using Kafka and CockroachDB changefeeds.

{{< tip >}}
{{< guided-tour-blurb-for-ingest-data >}}
{{< /tip >}}

## Kafka + Changefeeds

[Changefeeds](https://www.cockroachlabs.com/docs/stable/change-data-capture-overview) capture row-level changes resulting from `INSERT`, `UPDATE`, and `DELETE` operations against CockroachDB tables and publish them as events to Kafka topics.
Using the [Kafka source](/sql/create-source/kafka/#using-debezium), Materialize can consume these changefeeds, making the data available for use in views.

### Database setup

Before creating a changefeed, ensure that the upstream database is configured to support CDC.

1. Enable rangefeeds for the CockroachDB instance:

```sql
SET CLUSTER SETTING kv.rangefeed.enabled = true;
```
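
To confirm that the setting took effect before creating any changefeeds, one option is to read it back. This is a minimal check, assuming a SQL session that is allowed to view cluster settings:

```sql
-- Read back the rangefeed setting enabled in the previous step.
SHOW CLUSTER SETTING kv.rangefeed.enabled;
```

The statement should report `true` once rangefeeds are enabled.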

### Create changefeeds

Create one changefeed for each table you want to publish to Materialize.
Each table will produce data to its own Kafka topic that can be consumed by Materialize.
Use the following SQL command:

```sql
CREATE CHANGEFEED FOR TABLE my_table
  INTO 'kafka://broker:9092'
  WITH format = avro,
       confluent_schema_registry = 'http://registry:8081',
       diff,
       envelope = wrapped;

Materialize recommends creating changefeeds with the `avro` format, the `diff` option enabled, and the `wrapped` envelope.
This combination emits change events using a schema that matches Debezium and contains detailed
information about upstream database operations, like the `before` and `after`
values for each record.
Refer to the CockroachDB documentation for full details on changefeed configuration.
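
As a sketch of the one-changefeed-per-table pattern, the statements below create changefeeds for two tables and then list the resulting jobs. The table names `orders` and `customers` are hypothetical, and the broker and schema registry addresses are the same placeholders used in the example above; adjust them to your environment.

```sql
-- Hypothetical tables: one changefeed per table, using the recommended options.
CREATE CHANGEFEED FOR TABLE orders
  INTO 'kafka://broker:9092'
  WITH format = avro,
       confluent_schema_registry = 'http://registry:8081',
       diff,
       envelope = wrapped;

CREATE CHANGEFEED FOR TABLE customers
  INTO 'kafka://broker:9092'
  WITH format = avro,
       confluent_schema_registry = 'http://registry:8081',
       diff,
       envelope = wrapped;

-- List changefeed jobs to check that both were created and are running.
SHOW CHANGEFEED JOBS;
```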

### Create a source

To interpret the changefeeds, create a source in Materialize using the [Debezium envelope](/sql/create-source/kafka/#using-debezium).

```mzsql
CREATE SOURCE kafka_repl
FROM KAFKA CONNECTION kafka_connection (TOPIC 'my_table')
FORMAT AVRO USING CONFLUENT SCHEMA REGISTRY CONNECTION csr_connection
ENVELOPE DEBEZIUM;
```
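
The statement above assumes that the connections `kafka_connection` and `csr_connection` already exist in Materialize. If they do not, a minimal sketch for creating them might look like the following. The addresses mirror the placeholders from the changefeed example, and the unauthenticated `PLAINTEXT` setup is an assumption for illustration only; most deployments will need authentication and network security options instead.

```mzsql
-- Assumed placeholder broker address, matching the changefeed sink URI.
CREATE CONNECTION kafka_connection TO KAFKA (
    BROKER 'broker:9092',
    SECURITY PROTOCOL = 'PLAINTEXT'
);

-- Assumed placeholder schema registry address, matching the changefeed options.
CREATE CONNECTION csr_connection TO CONFLUENT SCHEMA REGISTRY (
    URL 'http://registry:8081'
);
```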

By default, the source will be created in the active cluster; to use a different
cluster, use the `IN CLUSTER` clause.
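
For example, assuming a cluster named `ingest_cluster` already exists (the name is hypothetical), the same source could be placed in it like this:

```mzsql
-- Same source definition as above, created in a specific cluster.
CREATE SOURCE kafka_repl
  IN CLUSTER ingest_cluster
  FROM KAFKA CONNECTION kafka_connection (TOPIC 'my_table')
  FORMAT AVRO USING CONFLUENT SCHEMA REGISTRY CONNECTION csr_connection
  ENVELOPE DEBEZIUM;
```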

### Create a view

{{% ingest-data/ingest-data-kafka-debezium-view %}}

### Create an index on the view

{{% ingest-data/ingest-data-kafka-debezium-index %}}
10 changes: 10 additions & 0 deletions doc/user/content/integrations/_index.md
@@ -103,6 +103,16 @@ database as a source requires enabling [**GTID-based binlog replication**](/sql/
| MySQL _(direct)_ | {{< supportLevel alpha >}} | See the [source documentation](/sql/create-source/mysql/) for more details, and the relevant integration guide for step-by-step instructions:<p></p><ul><li>[Amazon RDS for MySQL](/ingest-data/mysql/amazon-rds)</li><li>[Amazon Aurora for MySQL](/ingest-data/mysql/amazon-aurora)</li><li>[Azure DB for MySQL](/ingest-data/mysql/azure-db)</li><li>[Google Cloud SQL for MySQL](/ingest-data/mysql/google-cloud-sql)</li><li>[Self-hosted MySQL](/ingest-data/mysql/self-hosted)</li></ul> |
| MySQL _(via Debezium)_ | {{< supportLevel production >}} | See the [MySQL CDC guide](/integrations/cdc-mysql/) for a step-by-step breakdown of the integration. | |

### CockroachDB

CockroachDB is supported as a [**source**](/concepts/sources)
through
[Changefeeds](/ingest-data/cockroachdb/) (via Kafka or Redpanda).

| Service | Support level | Notes | |
| ---------------------- | -------------------------------- | ---------------------------------------------------------------------------------------------------- | ----------- |
| CockroachDB _(via Changefeeds)_ | {{< supportLevel production >}} | See the [CockroachDB Changefeed Guide](/ingest-data/cockroachdb/) for a step-by-step breakdown of the integration. | |

### Other databases

{{< note >}}