Skip to content

Conversation

@rdblue
Copy link
Contributor

@rdblue rdblue commented May 26, 2020

This updates TestMetrics to use Iceberg generic records for its test cases. This is to avoid creating an Avro writer for ORC just to make the tests work.

Using Iceberg generics for this test required moving the generic record classes to core so they are available to TestMetrics in core. Reader and writer classes for Avro are also moved to core.

rdblue added 2 commits May 26, 2020 16:29
This is to support ORC metrics using the same tests. ORC doesn't support
writing Avro records.
@rdblue rdblue requested a review from rdsr May 26, 2020 23:30
@rdblue rdblue mentioned this pull request May 26, 2020
optional(11, "uuidCol", UUIDType.get()),
required(12, "fixedCol", FixedType.ofLength(4)),
required(13, "binaryCol", BinaryType.get())
required(11, "fixedCol", FixedType.ofLength(4)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I'm missing some context, but why is the uuidCol dropped?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UUID columns aren't currently supported in the Parquet writers for Iceberg generics. Rather than adding that support here, I thought it was better to remove it. We can add it back later, but I don't want to block ORC metrics on it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. Thanks for the context.

Copy link
Contributor

@rdsr rdsr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Minor changes

}

@Test
public void testMetricsForRepeatedValues() throws IOException {
Copy link
Contributor

@rdsr rdsr May 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the test name misleading? Seems like we are testing for multiple records, not repeated values in maps and lists

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it sounds like we were testing repeated values in maps and lists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is an existing test name, I'd rather not update it in this PR. We can open a separate one to fix it if you think it is necessary.

import org.junit.Rule;
import org.junit.rules.TemporaryFolder;

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: seems like comment is not adding anything more

optional(11, "uuidCol", UUIDType.get()),
required(12, "fixedCol", FixedType.ofLength(4)),
required(13, "binaryCol", BinaryType.get())
required(11, "fixedCol", FixedType.ofLength(4)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. Thanks for the context.

@rdblue rdblue merged commit 9b2e1b4 into apache:master May 27, 2020
@rdblue
Copy link
Contributor Author

rdblue commented May 27, 2020

I'm merging this. Thanks for the reviews @rdsr and @edgarRd!

waterlx added a commit to waterlx/incubator-iceberg that referenced this pull request Jun 3, 2020
cmathiesen pushed a commit to ExpediaGroup/iceberg that referenced this pull request Aug 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants