-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Use decimal precision to determine variant decimal type #13692
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -182,10 +182,10 @@ public void testNegativeDouble() { | |
| public void testDecimal4() { | ||
| VariantPrimitive<?> value = | ||
| SerializedPrimitive.from( | ||
| new byte[] {primitiveHeader(8), 0x04, (byte) 0xD2, 0x02, (byte) 0x96, 0x49}); | ||
| new byte[] {primitiveHeader(8), 0x04, (byte) 0x15, (byte) 0xCD, (byte) 0x5B, 0x07}); | ||
|
|
||
| assertThat(value.type()).isEqualTo(PhysicalType.DECIMAL4); | ||
| assertThat(value.get()).isEqualTo(new BigDecimal("123456.7890")); | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So is the issue here that 123456.7890 == Precision 10 so by the spec it shouldn't be Decimal4 it should be Decimal 8
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's correct. Other engines like Spark and Parquet-Java implements properly. |
||
| assertThat(value.get()).isEqualTo(new BigDecimal("12345.6789")); | ||
| } | ||
|
|
||
| @Test | ||
|
|
@@ -208,17 +208,17 @@ public void testDecimal8() { | |
| primitiveHeader(9), | ||
| 0x09, // scale=9 | ||
| (byte) 0xB1, | ||
| 0x1C, | ||
| 0x6C, | ||
| (byte) 0xB1, | ||
| (byte) 0xF4, | ||
| 0x10, | ||
| 0x22, | ||
| 0x11 | ||
| (byte) 0xFA, | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. And here?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Basically, I'm using the right decimal4 and decimal8 values for testing.
Maybe I should add a check to error out if it's out of the decimal4 range? |
||
| 0x52, | ||
| (byte) 0xE0, | ||
| (byte) 0x4B, | ||
| (byte) 0x9B, | ||
| (byte) 0xB6, | ||
| 0x01 | ||
| }); | ||
|
|
||
| assertThat(value.type()).isEqualTo(PhysicalType.DECIMAL8); | ||
| assertThat(value.get()).isEqualTo(new BigDecimal("1234567890.987654321")); | ||
| assertThat(value.get()).isEqualTo(new BigDecimal("123456789.987654321")); | ||
| } | ||
|
|
||
| @Test | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -186,16 +186,17 @@ public static VariantPrimitive<Long> ofIsoTimestampntz(String value) { | |
| } | ||
|
|
||
| public static VariantPrimitive<BigDecimal> of(BigDecimal value) { | ||
| int bitLength = value.unscaledValue().bitLength(); | ||
| if (bitLength < 32) { | ||
| int precision = value.precision(); | ||
|
|
||
| if (precision >= 1 && precision <= 9) { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Are these synonymous? I notice this matches our table in the Spec but i'm a little confused about if the previous behavior we wrote here was a bug. Is it possible to have a unscaledValue bitlength < 32 and have a precision above 9?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For the value like 123456.7890, the unscaled fits in int32 while the precision is 10. So they are not exactly the same.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah this change makes sense to me, my understanding is the parquet physical type must be determined based on |
||
| return new PrimitiveWrapper<>(PhysicalType.DECIMAL4, value); | ||
| } else if (bitLength < 64) { | ||
| } else if (precision >= 10 && precision <= 18) { | ||
| return new PrimitiveWrapper<>(PhysicalType.DECIMAL8, value); | ||
| } else if (bitLength < 128) { | ||
| } else if (precision <= 38) { | ||
| return new PrimitiveWrapper<>(PhysicalType.DECIMAL16, value); | ||
| } | ||
|
|
||
| throw new UnsupportedOperationException("Unsupported decimal precision: " + value.precision()); | ||
| throw new UnsupportedOperationException("Unsupported decimal precision: " + precision); | ||
| } | ||
|
|
||
| public static VariantPrimitive<ByteBuffer> of(ByteBuffer value) { | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -122,10 +122,10 @@ public class TestVariantReaders { | |
| Variants.ofIsoTimestamptz("1957-11-07T12:33:54.123456+00:00"), | ||
| Variants.ofIsoTimestampntz("2024-11-07T12:33:54.123456"), | ||
| Variants.ofIsoTimestampntz("1957-11-07T12:33:54.123456"), | ||
| Variants.of(new BigDecimal("123456.7890")), // decimal4 | ||
| Variants.of(new BigDecimal("-123456.7890")), // decimal4 | ||
| Variants.of(new BigDecimal("1234567890.987654321")), // decimal8 | ||
| Variants.of(new BigDecimal("-1234567890.987654321")), // decimal8 | ||
|
Comment on lines
-127
to
-128
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So these changed because they should be |
||
| Variants.of(new BigDecimal("12345.6789")), // decimal4 | ||
| Variants.of(new BigDecimal("-12345.6789")), // decimal4 | ||
| Variants.of(new BigDecimal("123456789.987654321")), // decimal8 | ||
| Variants.of(new BigDecimal("-123456789.987654321")), // decimal8 | ||
| Variants.of(new BigDecimal("9876543210.123456789")), // decimal16 | ||
| Variants.of(new BigDecimal("-9876543210.123456789")), // decimal16 | ||
| Variants.of(ByteBuffer.wrap(new byte[] {0x0a, 0x0b, 0x0c, 0x0d})), | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why change the value here?