Skip to content

postgres: emit full leading-zero groups when binary-decoding sub-1e-8 NUMERIC#31211

Merged
dylan-conway merged 1 commit into
mainfrom
ali/pg-numeric-binary-leading-zeros
May 27, 2026
Merged

postgres: emit full leading-zero groups when binary-decoding sub-1e-8 NUMERIC#31211
dylan-conway merged 1 commit into
mainfrom
ali/pg-numeric-binary-leading-zeros

Conversation

@alii

@alii alii commented May 22, 2026

Copy link
Copy Markdown
Member

The Postgres binary NUMERIC decoder collapsed Postgres' two get_str_from_var counters — a base-10000 digit index and a decimal-place position — into a single index that advanced by 4 but also gated the leading-zero region. For weight <= -3 (values below 1e-8 with no integer part) this walked the leading "0000" groups 4× too fast and dropped one or more of them, shifting every significant digit left by 4 decimals per dropped group (e.g. 1e-9 decoded as 0.000010000). Values ≥ 1e-8, and any value with an integer part, were unaffected; text-protocol decoding was always correct.

Split the two counters back apart, mirroring get_str_from_var: one walks the base-10000 digits, the other counts emitted decimal places and drives truncation to dscale. The leading-zero region now emits a full "0000" group per missing digit position.

Adds a mock-Postgres-server test covering a matrix of sub-1e-8 and boundary values. (Not a port regression — the Zig decoder has the same defect, but Node/psql are correct.)

@alii

alii commented May 22, 2026

Copy link
Copy Markdown
Member Author

@robobun adopt

@robobun

robobun commented May 22, 2026

Copy link
Copy Markdown
Collaborator
Updated 11:22 AM PT - May 22nd, 2026

@alii, your commit bf729d3aa749b8e743b98a29fadb340f748dcb2e passed in Build #56891! 🎉


🧪   To try this PR locally:

bunx bun-pr 31211

That installs a local version of the PR into your bun-31211 executable, so you can run:

bun-31211 --bun

@coderabbitai

coderabbitai Bot commented May 22, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c3b4754a-581d-4021-84f5-0cb40f6eda94

📥 Commits

Reviewing files that changed from the base of the PR and between 346ce08 and bf729d3.

📒 Files selected for processing (2)
  • src/sql_jsc/postgres/DataCell.rs
  • test/js/sql/postgres-binary-numeric.test.ts

Walkthrough

This PR fixes a bug in PostgreSQL binary NUMERIC parsing where fractional digits were incorrectly rendered when weight <= -3. The fix refactors decimal-digit generation in parse_binary_numeric to use independent counters matching Postgres' logic, and introduces a comprehensive test suite with an in-memory mock Postgres server to validate the decoding.

Changes

Binary NUMERIC Decoding Fix

Layer / File(s) Summary
Binary NUMERIC decoding logic refactor
src/sql_jsc/postgres/DataCell.rs
Simplifies negative weight handling by removing unused scale_start offset, and rewrites dscale > 0 fractional-digit loop to use independent counters d (digit index) and i (decimal-place progress) that correctly read and emit base-10000 digit groups from the wire buffer, matching Postgres' reference behavior.
Test harness and mock Postgres server
test/js/sql/postgres-binary-numeric.test.ts
Introduces packet-building utilities and a minimal in-memory mock Postgres server for protocol handshake and query response, with decodeNumeric helper that manages server startup, Bun SQL client initialization, query execution, result decoding, and resource cleanup.
Test cases and parameterized assertions
test/js/sql/postgres-binary-numeric.test.ts
Defines test cases mapping expected decimal string literals to binary NUMERIC wire formats, including explicit regression cases for weight <= -3 and boundary conditions, with parameterized test assertions validating correct decoding.
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title precisely describes the main change: fixing binary NUMERIC decoding for sub-1e-8 values by emitting full leading-zero groups, which directly aligns with the changeset's core objective.
Description check ✅ Passed The description covers the technical problem (collapsed counters causing incorrect decoding), the solution (splitting counters), affected values, testing approach, and relevant context (Zig decoder defect). It fully addresses both required template sections.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@robobun

robobun commented May 22, 2026

Copy link
Copy Markdown
Collaborator

🤖 Adopted and verified locally:

  • Fail-before: with main's DataCell.rs (fix reverted), the new mock-server test fails 6/13 cases (1e-9 decodes as 0.000010000, etc.) — same failures on the current release build.
  • Pass-after: with this branch, all 13 cases pass under a debug build; postgres-binary-array-bounds.test.ts (same decoder file) still passes.
  • Decoder change matches Postgres' get_str_from_var (separate base-10000 digit index vs. decimal-place counter); existing real-server NUMERIC tests are unaffected (numeric[] stays text-format, scaled-zero early return untouched).

Review thread about the pre-existing ndigits == 0 early return is answered + resolved (out of scope — current behavior is locked in by an existing test). Watching CI from here.

@github-actions

Copy link
Copy Markdown
Contributor

This PR may be a duplicate of:

  1. postgres: preserve scale for numeric zero on binary/prepared path #29773 - Also fixes the Postgres binary NUMERIC decoder for sub-1e-8 values (leading-zero groups dropped due to conflated counters), targeting the Zig implementation while this PR targets Rust

🤖 Generated with Claude Code

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — the split-counter rewrite faithfully mirrors get_str_from_var and I traced every test case through the new loop.

Extended reasoning...

Overview

This PR rewrites the fractional-part loop of parse_binary_numeric in src/sql_jsc/postgres/DataCell.rs (~30 lines net) and adds a new mock-server test file. The old code conflated two counters from Postgres' get_str_from_var — the base-10000 digit index and the emitted-decimal-place counter — into a single idx that advanced by 4 while also gating the idx < 0 leading-zero region. For weight <= -3 this consumed the leading-zero region 4× too fast, dropping whole "0000" groups and shifting significant digits left (e.g. 1e-9"0.000010000"). The fix restores two independent counters: d (digit index, += 1) and i (decimal places emitted, += 4), with the digit read gated on d >= 0 && d < ndigits. I hand-traced all 13 test cases through the new loop and each produces exactly the expected literal, including the weight >= 0 mixed integer/fractional case (12345678.000000009) where the cursor-consumption interaction with the integer loop is correct because d starts at weight+1, which is precisely the index of the first unconsumed digit.

Security risks

None. This is pure string formatting of already-length-checked wire bytes; the new code retains the defensive cursor.len() >= 2 guard before each read, and the d < ndigits bound prevents over-reading when the header claims more digits than the payload carries. No allocation-size or indexing math depends on untrusted values beyond what the old code already used.

Level of scrutiny

Moderate — this is a correctness fix in a data-decoding hot path, but the change is tightly scoped to one loop in one function, the reference algorithm (get_str_from_var in numeric.c) is well-known and linked in the source, and the PR ships a 13-case regression matrix covering the broken region (weight <= -3), the boundary (weight = -2, -1), and previously-correct paths (integer part, mixed, zero, negative). The mock-Postgres-server test is self-contained and exercises the actual binary decode path via format=1 in RowDescription.

Other factors

The one inline finding is explicitly flagged as pre-existing (the ndigits == 0 early return drops dscale, yielding "0" instead of "0.00") — it's a cosmetic divergence that predates this PR, isn't worsened by it, and is reasonable to leave for a follow-up. No CODEOWNERS cover this path, no outstanding reviewer comments, and the PR description accurately characterizes the bug and the fix.

Comment thread src/sql_jsc/postgres/DataCell.rs
@dylan-conway dylan-conway merged commit 4ee835b into main May 27, 2026
79 checks passed
@dylan-conway dylan-conway deleted the ali/pg-numeric-binary-leading-zeros branch May 27, 2026 21:25
@alii

alii commented May 28, 2026

Copy link
Copy Markdown
Member Author

@robobun can you make a followup PR that swaps this test to use the real postgres server available in both your container and in CI? a mock server is a bad idea

@robobun

robobun commented May 28, 2026

Copy link
Copy Markdown
Collaborator

Done — followup PR: #31508

Swaps the mock server in postgres-binary-numeric.test.ts for the real postgres_plain docker-compose service (same infra as sql.test.ts, gated on isDockerEnabled()). A plain sql`select …::numeric` query goes through the extended protocol, so it exercises the binary NUMERIC decoder on a real server — no mock needed. Verified it fails before the fix (1e-90.000010000`) and passes after.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants