Skip to content

Conversation

@belugabehr
Copy link
Contributor

Make sure you have checked all steps below.

Jira

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain Javadoc that explain what it does


public void close() {
IOExceptionUtils.closeQuietly(out);
public void close() throws IOException {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inherited close() method throws IOException so this is fine to do

@Override
public void close() {
arrayOut.close();
out.close();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out in this case is wraps arrayOut, and it has no buffering, so it does not need to be explicitly closed, or better, in the future, out should be closed only and arrayOut will be closed implicitly. Anyway, I came to this conclusion because this is how close() is implemented in a couple of other Writer classes as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would either close only arrayOut but with comments about what you have written here or close only out and catching the IOException that would never occur and comment in the catch block similarly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gszadovszky Thanks for the review!

I think this line of work should be pursued in a different ticket. This change makes this PR's changes possible, but also puts it in line with other implementations:

https://github.com/apache/parquet-mr/blob/dc61e510126aaa1a95a46fe39bf1529f394147e9/parquet-column/src/main/java/org/apache/parquet/column/values/plain/FixedLenByteArrayPlainValuesWriter.java#L88-L90

So if we want to change how things are closed (nice, but technically not required) it can be addressed holistically.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine fixing this in another ticket. What I think is a key in this change is to put the related comments (out.close is not required because...) in the code as well as in this review.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gszadovszky OK, finally got back to this. Ha. I added a comment. Please consider for merge.

@Override
public void close() {
arrayOut.close();
out.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would either close only arrayOut but with comments about what you have written here or close only out and catching the IOException that would never occur and comment in the catch block similarly.

@belugabehr
Copy link
Contributor Author

 Failed to execute goal com.github.siom79.japicmp:japicmp-maven-plugin:0.14.2:cmp (default) on project parquet-common: There is at least one incompatibility: org.apache.parquet.bytes.LittleEndianDataOutputStream.close():METHOD_NOW_THROWS_CHECKED_EXCEPTION -> [Help 1]

Yes. That is true, but by design.

@belugabehr
Copy link
Contributor Author

Doing a little too much here. Just going to swallow exceptions and keep the previous behavior. I can change API in another ticket.

@gszadovszky gszadovszky merged commit 19348dc into apache:master May 14, 2021
elikkatz added a commit to TheWeatherCompany/parquet-mr that referenced this pull request Jun 2, 2021
* 'master' of https://github.com/apache/parquet-mr: (222 commits)
  PARQUET-2052: Integer overflow when writing huge binary using dictionary encoding (apache#910)
  PARQUET-2041: Add zstd to `parquet.compression` description of ParquetOutputFormat Javadoc (apache#899)
  PARQUET-2050: Expose repetition & definition level from ColumnIO (apache#908)
  PARQUET-1761: Lower Logging Level in ParquetOutputFormat (apache#745)
  PARQUET-2046: Upgrade Apache POM to 23 (apache#904)
  PARQUET-2048: Deprecate BaseRecordReader (apache#906)
  PARQUET-1922: Deprecate IOExceptionUtils (apache#825)
  PARQUET-2037: Write INT96 with parquet-avro (apache#901)
  PARQUET-2044: Enable ZSTD buffer pool by default (apache#903)
  PARQUET-2038: Upgrade Jackson version used in parquet encryption. (apache#898)
  Revert "[WIP] Refactor GroupReadSupport to unuse deprecated api (apache#894)"
  PARQUET-2027: Fix calculating directory offset for merge (apache#896)
  [WIP] Refactor GroupReadSupport to unuse deprecated api (apache#894)
  PARQUET-2030: Expose page size row check configurations to ParquetWriter.Builder (apache#895)
  PARQUET-2031: Upgrade to parquet-format 2.9.0 (apache#897)
  PARQUET-1448: Review of ParquetFileReader (apache#892)
  PARQUET-2020: Remove deprecated modules (apache#888)
  PARQUET-2025: Update Snappy version to 1.1.8.3 (apache#893)
  PARQUET-2022: ZstdDecompressorStream should close `zstdInputStream` (apache#889)
  PARQUET-1982: Random access to row groups in ParquetFileReader (apache#871)
  ...

# Conflicts:
#	parquet-column/src/main/java/org/apache/parquet/example/data/simple/SimpleGroup.java
#	parquet-hadoop/pom.xml
#	parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java
#	parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants