pg_duckdb: Official PostgreSQL Extension for DuckDB

PostgreSQL extension for DuckDB

pg_duckdb: Official PostgreSQL Extension for DuckDB

pg_duckdb integrates DuckDB's columnar-vectorized analytics engine into PostgreSQL, enabling high-performance analytics and data-intensive applications. Built in collaboration with Hydra and MotherDuck.

Key Features

Execute analytics queries without changes: run your existing SQL analytics queries as you normally would, and pg_duckdb will automatically use DuckDB's SQL engine to execute them when you set duckdb.force_execution=true.
Read/write data from data lakes: Read/write* Parquet, CSV, JSON, Iceberg & Delta Lake from S3, GCS, Azure & R2.
Integration with cloud analytics: Out-of-the-box support of MotherDuck as compute provider.

How `pg_duckdb` works

pg_duckdb automatically accelerates your existing analytical queries.

No special syntax needed: for queries on your regular PostgreSQL tables, you don't need to change your SQL. Just run your SELECT statements as you normally would, and pg_duckdb will use DuckDB's engine to execute them.
No data export required: You do not need to export your data to Parquet or any other format. pg_duckdb works directly with your existing PostgreSQL tables.

See it in action

Querying your existing PostgreSQL data

This is the most common and straightforward use case. If you have a standard PostgreSQL table, you can query it using standard SQL.

Example:

Let's say you have a PostgreSQL table named orders (to create it, see syntax guide). To run an analytical query, you just write standard SQL, configure duckdb.force_execution and pg_duckdb will handle the rest.

SET duckdb.force_execution = true;
SELECT
    order_date,
    COUNT(*) AS number_of_orders,
    SUM(amount) AS total_revenue
FROM
    orders
GROUP BY
    order_date
ORDER BY
    order_date;

Querying external data (your first data lake query)

pg_duckdb allows you to query external files (like Parquet or CSV) as if they were tables in your database. This is perfect for querying data lakes from pg_duckdb. To learn more on these functions, see read functions documentation.

-- Setup S3 access in seconds directly from SQL
SELECT duckdb.create_simple_secret(
    type := 'S3', key_id := 'your_key', secret := 'your_secret', region := 'us-east-1'
);

SELECT
    r['product_name'], -- 'r' is to iterate on the row object returned from read_parquet()
    AVG(r['rating']) AS average_rating
FROM
    read_parquet('s3://your-bucket/reviews.parquet') r
GROUP BY
    r['product_name']
ORDER BY
    average_rating DESC;

Combining PostgreSQL and DuckDB data

You can easily join your PostgreSQL tables with external data from your data lake.

-- Join a PostgreSQL table with a remote Parquet file
SELECT
    o.product_name,
    o.total_revenue,
    r.average_rating
FROM
    (
        -- First, aggregate our local orders data
        SELECT
            product_name,
            SUM(amount) AS total_revenue
        FROM
            orders
        GROUP BY
            product_name
    ) o
JOIN
    (
        -- Then, aggregate our remote reviews data
        SELECT
            r['product_name'] AS product_name,
            AVG(r['rating']) AS average_rating
        FROM
            read_parquet('s3://your-bucket/reviews.parquet') r
        GROUP BY
            r['product_name']
    ) r ON o.product_name = r.product_name
ORDER BY
    o.total_revenue DESC;

Modern DataLake Formats

Work with modern data formats like DuckLake, Iceberg and Delta Lake. To learn more, see extensions documentation.

-- Query Apache Iceberg tables with time travel
SELECT duckdb.install_extension('iceberg');
SELECT * FROM iceberg_scan('s3://warehouse/sales_iceberg', version := '2024-03-15-snapshot')

-- Process Delta Lake with schema evolution
SELECT duckdb.install_extension('delta');
SELECT * FROM delta_scan('s3://lakehouse/user_events')

MotherDuck integration (optional)

pg_duckdb integrates with MotherDuck, a cloud analytics platform. This allows you to run your queries on MotherDuck's powerful compute infrastructure, while still using your existing PostgreSQL tables.

To learn more, see MotherDuck documentation.

-- Connect to MotherDuck
CALL duckdb.enable_motherduck('<your_motherduck_token>');

-- Your existing MotherDuck tables appear automatically
SELECT region, COUNT(*) FROM my_cloud_analytics_table;

-- Create cloud tables that sync across teams
CREATE TABLE real_time_kpis USING duckdb AS
SELECT
    date_trunc('day', created_at) as date,
    COUNT(*) as daily_signups,
    SUM(revenue) as daily_revenue
FROM user_events
GROUP BY date;

Quick Start

Docker

Run PostgreSQL with pg_duckdb pre-installed in a docker container:

docker run -d -e POSTGRES_PASSWORD=duckdb pgduckdb/pgduckdb:17-v1.0.0

With MotherDuck:

export MOTHERDUCK_TOKEN=<your_token>
docker run -d -e POSTGRES_PASSWORD=duckdb -e MOTHERDUCK_TOKEN pgduckdb/pgduckdb:17-v1.0.0

Try with Hydra

You can also get started using Hydra:

pip install hydra-cli
hydra

Package Managers

pgxman (apt):

pgxman install pg_duckdb

Compile from source:

git clone https://github.com/duckdb/pg_duckdb
cd pg_duckdb
make install

See compilation guide for detailed instructions.

Configuration

See settings documentation for complete configuration options.

Documentation

Topic	Description
Functions	Complete function reference
Syntax Guide & Gotchas	Quick reference for common SQL patterns and things to know
Types	Supported data types and advanced types usage
MotherDuck	Cloud integration guide
Secrets	Credential management
Extensions	DuckDB extension usage
Transactions	Transaction behavior
Compilation	Build from source

Note: Advanced DuckDB types (STRUCT, MAP, UNION) require DuckDB execution context. Use duckdb.query() for complex type operations and TEMP tables for DuckDB table creation in most cases. See Types documentation for details.

Performance

pg_duckdb excels at:

Analytical Workloads: Aggregations, window functions, complex JOINs
Data Lake Queries: Scanning large Parquet/CSV files
Mixed Workloads: Combining OLTP (PostgreSQL) with OLAP (DuckDB)
ETL Pipelines: Transform and load data at scale

Contributing

We welcome contributions! Please see:

Contributing Guidelines
Code of Conduct
Project Milestones for upcoming features
Discussions for feature requests
Issues for bug reports
Join the DuckDB Discord community then chat in the #pg_duckdb channel.

Support

Documentation: Complete documentation
Community: DuckDB Discord #pg_duckdb channel
Issues: GitHub Issues
Commercial: Hydra and MotherDuck offer commercial support

Requirements

PostgreSQL: 14, 15, 16, 17, 18
Operating Systems: Ubuntu 22.04-24.04, macOS

License

Licensed under the MIT License.

Built with ❤️
in collaboration with Hydra and MotherDuck

Name		Name	Last commit message	Last commit date
Latest commit History 584 Commits
.github		.github
docker		docker
docs		docs
include/pgduckdb		include/pgduckdb
scripts		scripts
sql		sql
src		src
test		test
third_party		third_party
.clang-format		.clang-format
.clang-tidy		.clang-tidy
.dockerignore		.dockerignore
.editorconfig		.editorconfig
.gitattributes		.gitattributes
.gitignore		.gitignore
.gitmodules		.gitmodules
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
LICENSE		LICENSE
Makefile		Makefile
Makefile.global		Makefile.global
NOTICE		NOTICE
README.md		README.md
dev_requirements.txt		dev_requirements.txt
docker-bake.hcl		docker-bake.hcl
docker-compose.yml		docker-compose.yml
logo-dark.svg		logo-dark.svg
logo-light.svg		logo-light.svg
pg_duckdb.control		pg_duckdb.control
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

pg_duckdb: Official PostgreSQL Extension for DuckDB

Key Features

How `pg_duckdb` works

See it in action

Querying your existing PostgreSQL data

Querying external data (your first data lake query)

Combining PostgreSQL and DuckDB data

Modern DataLake Formats

MotherDuck integration (optional)

Quick Start

Docker

Try with Hydra

Package Managers

Configuration

Documentation

Performance

Contributing

Support

Requirements

License

About

Uh oh!

Releases 4

Packages

Uh oh!

Contributors 42

Uh oh!

Languages

License

duckdb/pg_duckdb

Folders and files

Latest commit

History

Repository files navigation

pg_duckdb: Official PostgreSQL Extension for DuckDB

Key Features

How pg_duckdb works

See it in action

Querying your existing PostgreSQL data

Querying external data (your first data lake query)

Combining PostgreSQL and DuckDB data

Modern DataLake Formats

MotherDuck integration (optional)

Quick Start

Docker

Try with Hydra

Package Managers

Configuration

Documentation

Performance

Contributing

Support

Requirements

License

About

Resources

License

Code of conduct

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 4

Packages 0

Uh oh!

Contributors 42

Uh oh!

Languages

How `pg_duckdb` works

Packages