@ghost ghost commented Aug 15, 2014

Fix compile error on Hadoop 0.23 for pull request #1924.

@AmplabJenkins

Can one of the admins verify this patch?

@marmbrus
Contributor

ok to test

@SparkQA

SparkQA commented Aug 15, 2014

QA tests have started for PR 1959. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18586/consoleFull

@SparkQA

SparkQA commented Aug 15, 2014

QA results for PR 1959:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18586/consoleFull

Member

How about ignoring any file starting with _ ? Hadoop (also) uses this convention, for things like the _SUCCESS file.

Author

Unfortunately, that would ignore the metadata file "_metadata" as well.

Contributor

Maybe we should rethink why we use filterNot here? A simple filter works fine, something like:

val children = fs.listStatus(path).filter { status =>
  val name = status.getPath.getName
  name == ParquetFileWriter.PARQUET_METADATA_FILE || (name(0) != '.' && name(0) != '_')
}

That way we ignore all hidden/tmp files except _metadata.

Contributor

I agree with this. Just remove .* and _* except _metadata.
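
For reference, the rule agreed on in this thread (drop names starting with `.` or `_`, but keep `_metadata`) can be sketched as a standalone predicate on file names. This is only an illustrative sketch, not the merged code: the object name `ParquetChildFilter`, the helper `isDataOrMetadata`, and the sample file names are hypothetical, and the constant below assumes `ParquetFileWriter.PARQUET_METADATA_FILE` equals `"_metadata"`, as the name quoted in the discussion suggests. It uses `startsWith` rather than `name(0)` so an empty name would not throw.

```scala
object ParquetChildFilter {
  // Assumption: stands in for ParquetFileWriter.PARQUET_METADATA_FILE,
  // taken to be "_metadata" as quoted in this thread.
  val MetadataFileName = "_metadata"

  // Hypothetical helper: keep the metadata file, and otherwise keep only
  // names that are neither hidden (leading '.') nor temporary/marker
  // files (leading '_').
  def isDataOrMetadata(name: String): Boolean =
    name == MetadataFileName ||
      !(name.startsWith(".") || name.startsWith("_"))

  def main(args: Array[String]): Unit = {
    // Illustrative directory listing; these names are made up for the example.
    val names = Seq(
      "part-r-00001.parquet",
      "_metadata",
      "_SUCCESS",
      "_temporary",
      ".part-r-00001.parquet.crc"
    )
    // prints: List(part-r-00001.parquet, _metadata)
    println(names.filter(isDataOrMetadata))
  }
}
```

In the actual code path the same predicate would be applied to `status.getPath.getName` for each entry returned by `fs.listStatus(path)`, as in the snippet proposed above.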

@SparkQA

SparkQA commented Aug 23, 2014

QA tests have started for PR 1959 at commit be30793.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Aug 23, 2014

QA tests have finished for PR 1959 at commit be30793.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng
Contributor

LGTM, thanks.

@marmbrus
Contributor

Thanks! I've merged this into master and 1.1.

asfgit pushed a commit that referenced this pull request Aug 26, 2014
…ontext.parquetFile

fix compile error on hadoop 0.23 for the pull request #1924.

Author: Chia-Yung Su

Closes #1959 from joesu/bugfix-spark3011 and squashes the following commits:

be30793 [Chia-Yung Su] remove .* and _* except _metadata
8fe2398 [Chia-Yung Su] add note to explain
40ea9bd [Chia-Yung Su] fix hadoop-0.23 compile error
c7e44f2 [Chia-Yung Su] match syntax
f8fc32a [Chia-Yung Su] filter out tmp dir

(cherry picked from commit 4243bb6)
Signed-off-by: Michael Armbrust
@asfgit asfgit closed this in 4243bb6 Aug 26, 2014
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014