@aalexandrov aalexandrov commented Jun 18, 2024

According to the official wire protocol documentation[1]:

> In simple Query mode, the format of retrieved values is always text, except when the given command is a FETCH from a cursor declared with the BINARY option.

However, for the extended query protocol the situation is more complicated, as the format is determined by the `Bind`[2] command[3]:

> Bind also specifies the format to use for any data returned by the query; the format can be specified overall, or per-column.

Given the above information, this change aligns the implementation here with the `sqlite` and `duckdb` examples from the `pgwire` repository.
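The "overall, or per-column" rule quoted above can be sketched as follows. This is a minimal, std-only illustration of how the result-column format codes in a `Bind` message expand to one format per column. Zero codes means all text, one code applies to every column, and otherwise there must be one code per column. The `Format` enum and `resolve_result_formats` helper are hypothetical names for this sketch and are not part of `pgwire` or of this patch.

```rust
/// Wire-level format of a single result column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Format {
    Text,   // format code 0
    Binary, // format code 1
}

/// Expand the result-column format codes carried by a Bind message into
/// one format per result column. Hypothetical helper for illustration.
fn resolve_result_formats(codes: &[i16], n_columns: usize) -> Result<Vec<Format>, String> {
    let decode = |code: i16| match code {
        0 => Ok(Format::Text),
        1 => Ok(Format::Binary),
        other => Err(format!("unknown format code {}", other)),
    };
    match codes.len() {
        // No codes: every column uses the default format (text).
        0 => Ok(vec![Format::Text; n_columns]),
        // One code: it applies to all result columns.
        1 => Ok(vec![decode(codes[0])?; n_columns]),
        // Otherwise: exactly one code per column.
        n if n == n_columns => codes.iter().map(|&c| decode(c)).collect(),
        n => Err(format!("got {} format codes for {} columns", n, n_columns)),
    }
}

fn main() {
    println!("{:?}", resolve_result_formats(&[], 2));
    println!("{:?}", resolve_result_formats(&[1], 2));
    println!("{:?}", resolve_result_formats(&[0, 1], 2));
}
```

A server that always encodes results as text, regardless of these codes, will disagree with any client that asks for binary results in `Bind`.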

To see why the current state is an issue, try running a query such as `SELECT 1::int4` from a Rust crate that uses `rust-postgres`. For instance, using the example from this gist:

cargo run --example pg_client -- query "SELECT 1::int8"

Without the fixes I get:

error retrieving column 0: error deserializing column 0: failed to fill whole buffer
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

With the fixes:

┌──────────┐
│ Int64(1) │
├──────────┤
│        1 │
└──────────┘
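The `failed to fill whole buffer` message is what the Rust standard library reports when `read_exact` runs out of input. A plausible reading, sketched below with only the standard library, is that the client asked for binary results and tried to read eight bytes for an `int8`, while the server had sent the one-byte text encoding `1`. The `decode_int8_binary` helper is hypothetical; the actual decoding path inside `rust-postgres` may differ.

```rust
use std::io::{Cursor, Read};

/// Decode a binary-format int8 value: eight big-endian bytes.
/// Hypothetical helper for illustration only.
fn decode_int8_binary(raw: &[u8]) -> std::io::Result<i64> {
    let mut buf = [0u8; 8];
    // read_exact fails with UnexpectedEof if fewer than 8 bytes remain.
    Cursor::new(raw).read_exact(&mut buf)?;
    Ok(i64::from_be_bytes(buf))
}

fn main() {
    let as_binary = 1i64.to_be_bytes(); // what a binary-format reply carries
    let as_text = b"1"; // what a text-format reply carries

    println!("{:?}", decode_int8_binary(&as_binary));
    match decode_int8_binary(as_text) {
        Ok(v) => println!("decoded {}", v),
        Err(e) => println!("error: {}", e),
    }
}
```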

Footnotes

  1. https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-SIMPLE-QUERY

  2. https://www.postgresql.org/docs/current/protocol-message-formats.html#PROTOCOL-MESSAGE-FORMATS-BIND

  3. https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY

@sunng87 sunng87 merged commit e39e2cd into datafusion-contrib:master Jun 19, 2024
sunng87 commented Jun 19, 2024

Thank you for the patch and detailed explanation. Are you using this example in your own project?

aalexandrov commented Jun 20, 2024

> Are you using this example in your own project?

Yes, but nothing production-ready! I've forked it and I'm now building a more complete frontend (codename postfusion). Currently it's just a vehicle to explore arrow and datafusion work behind a more familiar interface (also a lot of benchmarks have Postgres/psql-compatible implementations, which makes running them easier). I've found this example and pgwire quite useful because I could save the time of implementing the wire protocol myself :)

I'll see how far I can go with it—if it ends up being anything interesting I'll put it out with an Apache license. In the meantime, if I fix issues or implement features in pgwire I'll try to back-port them to your projects.

@aalexandrov aalexandrov deleted the fix_format branch June 21, 2024 13:18