Update metrics tests to use Iceberg generics #1070
This is to support ORC metrics using the same tests. ORC doesn't support writing Avro records.
-  optional(11, "uuidCol", UUIDType.get()),
-  required(12, "fixedCol", FixedType.ofLength(4)),
-  required(13, "binaryCol", BinaryType.get())
+  required(11, "fixedCol", FixedType.ofLength(4)),
Sorry if I'm missing some context, but why is the uuidCol dropped?
UUID columns aren't currently supported in the Parquet writers for Iceberg generics. Rather than adding that support here, I thought it was better to remove it. We can add it back later, but I don't want to block ORC metrics on it.
SGTM. Thanks for the context.
rdsr left a comment
+1. Minor changes
  }

  @Test
  public void testMetricsForRepeatedValues() throws IOException {
Is the test name misleading? It seems like we are testing multiple records, not repeated values in maps and lists.
I think it sounds like we were testing repeated values in maps and lists.
Since this is an existing test name, I'd rather not update it in this PR. We can open a separate one to fix it if you think it is necessary.
import org.junit.Rule;
import org.junit.rules.TemporaryFolder;

/**
nit: seems like this comment is not adding anything.
This updates TestMetrics to use Iceberg generic records for its test cases. This is to avoid creating an Avro writer for ORC just to make the tests work.

Using Iceberg generics for this test required moving the generic record classes to core so they are available to TestMetrics in core. Reader and writer classes for Avro are also moved to core.
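
To illustrate why format-neutral records help here, the following is a minimal sketch of the kind of check a shared metrics test can make: build records once, compute the expected per-column numbers (value count, null count, lower and upper bound), and compare them against whatever a given file writer reports. It uses only plain Java collections as a stand-in for generic records. The class `GenericMetricsSketch`, the `ColumnMetrics` holder, and the `record` and `metricsFor` helpers are all hypothetical names for illustration, not Iceberg APIs or code from this PR.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GenericMetricsSketch {

  // Hypothetical holder for the per-column numbers a metrics test would compare.
  static final class ColumnMetrics {
    long valueCount;   // all values, including nulls
    long nullCount;    // null values only
    Long lowerBound;   // stays null until a non-null value is seen
    Long upperBound;
  }

  // Builds a format-neutral record as a column-name -> value map
  // (an assumption standing in for a generic record).
  static Map<String, Object> record(String column, Long value) {
    Map<String, Object> rec = new HashMap<>();
    rec.put(column, value);
    return rec;
  }

  // Computes the expected metrics for one long column directly from the records,
  // independent of any file format.
  static ColumnMetrics metricsFor(List<Map<String, Object>> records, String column) {
    ColumnMetrics m = new ColumnMetrics();
    for (Map<String, Object> rec : records) {
      m.valueCount++;
      Long value = (Long) rec.get(column);
      if (value == null) {
        m.nullCount++;
        continue;
      }
      if (m.lowerBound == null || value < m.lowerBound) {
        m.lowerBound = value;
      }
      if (m.upperBound == null || value > m.upperBound) {
        m.upperBound = value;
      }
    }
    return m;
  }

  public static void main(String[] args) {
    List<Map<String, Object>> records = new ArrayList<>();
    records.add(record("longCol", 5L));
    records.add(record("longCol", null));
    records.add(record("longCol", 2L));

    ColumnMetrics m = metricsFor(records, "longCol");
    System.out.println(m.valueCount + " " + m.nullCount + " " + m.lowerBound + " " + m.upperBound);
  }
}
```

Because the expected values are derived from the records alone, the same test data could in principle be handed to a Parquet-backed or an ORC-backed writer and checked against the same expectations, which is the motivation the description gives for dropping the Avro-specific records.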