postgres: emit full leading-zero groups when binary-decoding sub-1e-8 NUMERIC #31211
@robobun adopt |
|
Updated 11:22 AM PT - May 22nd, 2026
✅ @alii, your commit bf729d3aa749b8e743b98a29fadb340f748dcb2e passed in
🧪 To try this PR locally: bunx bun-pr 31211
That installs a local version of the PR into your bun-31211 --bun |
|
No actionable comments were generated in the recent review. 🎉
Run configuration: Path: .coderabbit.yaml; Review profile: ASSERTIVE; Plan: Pro. Files selected for processing: 2.
Walkthrough
This PR fixes a bug in PostgreSQL binary NUMERIC parsing where fractional digits were incorrectly rendered when
Changes
Binary NUMERIC Decoding Fix
Pre-merge checks: ✅ 4 passed
|
🤖 Adopted and verified locally:
Review thread about the pre-existing |
|
This PR may be a duplicate of:
🤖 Generated with Claude Code |
LGTM — the split-counter rewrite faithfully mirrors get_str_from_var and I traced every test case through the new loop.
Extended reasoning...
Overview
This PR rewrites the fractional-part loop of parse_binary_numeric in src/sql_jsc/postgres/DataCell.rs (~30 lines net) and adds a new mock-server test file. The old code conflated two counters from Postgres' get_str_from_var — the base-10000 digit index and the emitted-decimal-place counter — into a single idx that advanced by 4 while also gating the idx < 0 leading-zero region. For weight <= -3 this consumed the leading-zero region 4× too fast, dropping whole "0000" groups and shifting significant digits left (e.g. 1e-9 → "0.000010000"). The fix restores two independent counters: d (digit index, += 1) and i (decimal places emitted, += 4), with the digit read gated on d >= 0 && d < ndigits. I hand-traced all 13 test cases through the new loop and each produces exactly the expected literal, including the weight >= 0 mixed integer/fractional case (12345678.000000009) where the cursor-consumption interaction with the integer loop is correct because d starts at weight+1, which is precisely the index of the first unconsumed digit.
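The two-counter structure described above can be sketched as follows. This is a minimal illustration modeled on the shape of Postgres' get_str_from_var, not the actual parse_binary_numeric from DataCell.rs: the function name numeric_to_string and its signature are hypothetical, it takes already-decoded base-10000 digits instead of a wire cursor, and it ignores NaN and infinity.

```rust
/// Hypothetical sketch: format a NUMERIC from its base-10000 digits.
/// `weight` is the base-10000 exponent of digits[0]; `dscale` is the
/// number of decimal places to display.
fn numeric_to_string(digits: &[u16], weight: i16, negative: bool, dscale: u16) -> String {
    let ndigits = digits.len() as i32;
    let weight = weight as i32;
    let mut out = String::new();
    if negative {
        out.push('-');
    }

    // Counter 1: `d` indexes base-10000 digits and advances by 1.
    let mut d: i32;
    if weight < 0 {
        // No integer part: print "0" and start `d` in the leading-zero region.
        d = weight + 1;
        out.push('0');
    } else {
        d = 0;
        while d <= weight {
            let dig = if d < ndigits { digits[d as usize] } else { 0 };
            if d == 0 {
                // First group: no zero padding.
                out.push_str(&dig.to_string());
            } else {
                out.push_str(&format!("{:04}", dig));
            }
            d += 1;
        }
    }

    if dscale > 0 {
        out.push('.');
        let frac_start = out.len();
        // Counter 2: `i` counts emitted decimal places and advances by 4.
        let mut i: usize = 0;
        while i < dscale as usize {
            // Positions with d < 0 (leading zeros) or d >= ndigits
            // (trailing zeros) each contribute a full "0000" group.
            let dig = if d >= 0 && d < ndigits { digits[d as usize] } else { 0 };
            out.push_str(&format!("{:04}", dig));
            d += 1;
            i += 4;
        }
        // Whole groups were emitted; cut back to exactly `dscale` places.
        out.truncate(frac_start + dscale as usize);
    }
    out
}
```

Under these assumptions, 1e-9 arrives as digits [1000] with weight -3 and dscale 9, so d starts at -2, two full "0000" groups are emitted before the digit is read, and the result is truncated to 0.000000001.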
Security risks
None. This is pure string formatting of already-length-checked wire bytes; the new code retains the defensive cursor.len() >= 2 guard before each read, and the d < ndigits bound prevents over-reading when the header claims more digits than the payload carries. No allocation-size or indexing math depends on untrusted values beyond what the old code already used.
Level of scrutiny
Moderate — this is a correctness fix in a data-decoding hot path, but the change is tightly scoped to one loop in one function, the reference algorithm (get_str_from_var in numeric.c) is well-known and linked in the source, and the PR ships a 13-case regression matrix covering the broken region (weight <= -3), the boundary (weight = -2, -1), and previously-correct paths (integer part, mixed, zero, negative). The mock-Postgres-server test is self-contained and exercises the actual binary decode path via format=1 in RowDescription.
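For readers who want to reproduce a case from such a matrix by hand, the binary NUMERIC payload is four big-endian 16-bit header fields (ndigits, weight, sign, dscale) followed by the base-10000 digits. The helper below is a hedged sketch of how a test could build that payload; encode_numeric is a hypothetical name, not a function from this PR or from Bun, and it only covers finite positive and negative values.

```rust
/// Hypothetical helper: build the binary wire payload for a finite NUMERIC.
/// Layout (all big-endian 16-bit): ndigits, weight, sign, dscale, digits...
fn encode_numeric(digits: &[u16], weight: i16, negative: bool, dscale: u16) -> Vec<u8> {
    let mut buf = Vec::with_capacity(8 + digits.len() * 2);
    buf.extend_from_slice(&(digits.len() as i16).to_be_bytes());
    buf.extend_from_slice(&weight.to_be_bytes());
    // Sign word: 0x0000 for positive, 0x4000 for negative.
    let sign: u16 = if negative { 0x4000 } else { 0x0000 };
    buf.extend_from_slice(&sign.to_be_bytes());
    buf.extend_from_slice(&dscale.to_be_bytes());
    for &d in digits {
        // Each digit is a base-10000 value in 0..=9999.
        buf.extend_from_slice(&d.to_be_bytes());
    }
    buf
}
```

For example, encode_numeric(&[1000], -3, false, 9) produces the ten bytes for 1e-9, a value in the weight <= -3 region that the old loop decoded incorrectly.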
Other factors
The one inline finding is explicitly flagged as pre-existing (the ndigits == 0 early return drops dscale, yielding "0" instead of "0.00") — it's a cosmetic divergence that predates this PR, isn't worsened by it, and is reasonable to leave for a follow-up. No CODEOWNERS cover this path, no outstanding reviewer comments, and the PR description accurately characterizes the bug and the fix.
|
@robobun can you make a followup PR that swaps this test to use the real postgres server available in both your container and in CI? a mock server is a bad idea |
|
Done — followup PR: #31508 Swaps the mock server in |
The Postgres binary NUMERIC decoder collapsed Postgres' two get_str_from_var counters (a base-10000 digit index and a decimal-place position) into a single index that advanced by 4 but also gated the leading-zero region. For weight <= -3 (values below 1e-8 with no integer part) this walked the leading "0000" groups 4× too fast and dropped one or more of them, shifting every significant digit left by 4 decimals per dropped group (e.g. 1e-9 decoded as 0.000010000). Values ≥ 1e-8, and any value with an integer part, were unaffected; text-protocol decoding was always correct.

Split the two counters back apart, mirroring get_str_from_var: one walks the base-10000 digits, the other counts emitted decimal places and drives truncation to dscale. The leading-zero region now emits a full "0000" group per missing digit position.

Adds a mock-Postgres-server test covering a matrix of sub-1e-8 and boundary values. (Not a port regression: the Zig decoder has the same defect, but Node/psql are correct.)