Skip to content

Conversation

@XiaokunDing
Copy link
Contributor

Add the Test Util And ORC file writer.
Lack of the metrics information and file length check.

@XiaokunDing XiaokunDing changed the title [WIP] Add the Test Util And ORC file writer [WIP] Add the Test Util And ORC Writer for Spark Mar 21, 2020
@XiaokunDing XiaokunDing reopened this Mar 21, 2020
Copy link
Collaborator

@chenjunjiedada chenjunjiedada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shawnding , just left some comments for you. I think it would be better if you could run ./gradlew build before submitting your PR next time.

@XiaokunDing XiaokunDing force-pushed the OrcWriter branch 2 times, most recently from 223ce04 to 8e0aaea Compare March 26, 2020 10:20
@XiaokunDing XiaokunDing changed the title [WIP] Add the Test Util And ORC Writer for Spark [ISSUE #855] Add the Test Util And ORC Writer for Spark Mar 26, 2020
@XiaokunDing
Copy link
Contributor Author

HI @rdsr , would you mind to take another look or give some advices on next steps? thanks

@rdsr
Copy link
Contributor

rdsr commented Mar 28, 2020

Thanks @shawnding . I'll try to review this today!

@rdblue
Copy link
Contributor

rdblue commented Mar 31, 2020

This looks good to me, except for two things:

  1. Map keys are always required if the map is projected, so I'm not sure the map change is correct.
  2. I think it would make more sense to add nested tests later. The majority of these changes are test changes to add a map. I don't think the amount of change here is worth such a narrow set of cases: one string map with the same key/values each time.

@rdblue
Copy link
Contributor

rdblue commented Mar 31, 2020

One more thing that I thought of: if this is adding the ability to write ORC files, then we should make sure all of the parameterized Spark tests that currently test Parquet and Avro are also applied to ORC.

@XiaokunDing
Copy link
Contributor Author

Thanks for your suggestion @rdblue . I will append same complicated nested type to this test。And now if all of the parameter test applied to ORC file should after #199 have merged

@rdblue
Copy link
Contributor

rdblue commented Apr 1, 2020

I will append same complicated nested type to this test

Sorry for the miscommunication: I'm asking you to remove the more complicated nested type from this PR. Let's get the existing tests running on ORC for this PR and then we can add more test cases in a separate one.

@XiaokunDing
Copy link
Contributor Author

Thanks @rdblue , I was removed the nested type.would you mind to take another look or give some advices on next steps?

@rdblue
Copy link
Contributor

rdblue commented Apr 3, 2020

Looks good to me. Thanks for adding this, @shawnding!

@rdblue rdblue merged commit 2dfc455 into apache:master Apr 4, 2020
rdsr pushed a commit to rdsr/incubator-iceberg that referenced this pull request Apr 5, 2020
sunchao pushed a commit to sunchao/iceberg that referenced this pull request May 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants