
Conversation

@maropu
Member

@maropu maropu commented Feb 22, 2017

What changes were proposed in this pull request?

This PR comes from #16928 and fixes the JSON behaviour in the same way as the CSV one.

How was this patch tested?

Added tests in JsonSuite.
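
A rough sketch of the behaviour this change enforces, assuming a user-supplied schema whose corrupt-record column is not a nullable string (the path, schema, and column name below are illustrative, not the actual test fixtures):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StructType}

object CorruptColumnSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("sketch").getOrCreate()

    // The corrupt-record column is declared as IntegerType, which violates the
    // requirement that it be a nullable string.
    val schema = new StructType()
      .add("a", IntegerType)
      .add("_corrupt_record", IntegerType)

    // With this patch, running a query against such a schema is expected to fail
    // with an AnalysisException rather than silently misbehaving. The path is a placeholder.
    spark.read
      .schema(schema)
      .option("columnNameOfCorruptRecord", "_corrupt_record")
      .json("/path/to/corrupt.json")
      .show()

    spark.stop()
  }
}
```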

@maropu maropu changed the title [SPARK-19695][SQL] Throw an exception if a columnNameOfCorruptRecord field violates requirements [SPARK-19695][SQL] Throw an exception if a columnNameOfCorruptRecord field violates requirements for json Feb 22, 2017
@maropu maropu changed the title [SPARK-19695][SQL] Throw an exception if a columnNameOfCorruptRecord field violates requirements for json [SPARK-19695][SQL] Throw an exception if a columnNameOfCorruptRecord field violates requirements in json formats Feb 22, 2017
@SparkQA

SparkQA commented Feb 22, 2017

Test build #73259 has finished for PR 17023 at commit 11c2850.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented Feb 22, 2017

@HyukjinKwon @cloud-fan could you check this? Thanks.

Contributor

corruptRecords.toDF("value")?
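
A minimal illustration of what this suggestion seems to point at, assuming the test input is a sequence of corrupt JSON strings (the sample records here are made up):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").appName("sketch").getOrCreate()
import spark.implicits._

// Corrupt JSON lines used only as sample input here.
val corruptRecords = Seq("""{"a": 1, "b""", """{"a":, "b": 2}""")

// toDF("value") turns the Seq into a single string column named "value",
// avoiding a hand-built schema or RDD for the test fixture.
val corruptDF = corruptRecords.toDF("value")
```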

Member

map to foreach?
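
A tiny sketch of the kind of change being asked for here (the loop body is hypothetical): when a collection is traversed only for its side effects, foreach states that intent, while map allocates a result that is then thrown away.

```scala
val columnNames = Seq("_corrupt_record", "_malformed")

// Before: map builds a Seq[Unit] that nobody uses.
columnNames.map { name => println(s"checking $name") }

// After: foreach makes the side-effect-only intent explicit.
columnNames.foreach { name => println(s"checking $name") }
```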

Member

And.. not a strong preference but maybe put this in JacksonUtils and rename it JsonUtils if @cloud-fan is okay?

Maybe we could throw an IllegalArgumentException in the first place and then capture the message with an AnalysisException (as JacksonUtils.verifySchema is doing in StructToJson). This is not a strong opinion either.
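
A rough sketch of that pattern with hypothetical method names; since AnalysisException's constructor is package-private to org.apache.spark.sql, a helper like this would have to live inside that package:

```scala
package org.apache.spark.sql

import org.apache.spark.sql.types.{StringType, StructType}

object CorruptRecordCheckSketch {
  // Low-level check: throw IllegalArgumentException when the corrupt-record
  // field is present but is not a nullable string.
  def verifyCorruptRecordField(schema: StructType, columnName: String): Unit = {
    schema.getFieldIndex(columnName).foreach { index =>
      val field = schema(index)
      if (field.dataType != StringType || !field.nullable) {
        throw new IllegalArgumentException(
          s"The field for corrupt records must be a nullable string: $columnName")
      }
    }
  }

  // Caller side: surface the message as an analysis-time error.
  def checkOrFail(schema: StructType, columnName: String): Unit = {
    try {
      verifyCorruptRecordField(schema, columnName)
    } catch {
      case e: IllegalArgumentException => throw new AnalysisException(e.getMessage)
    }
  }
}
```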

Contributor

Ideally, the columnNameOfCorruptRecord stuff has nothing to do with the parser. The parser just parses the records and reports an error if some records are bad, and the upper level handles the bad records and may put them in a special string column.

I'm ok to keep this code snippet duplicated in 2 places.
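
A conceptual sketch of the separation described above, with all names hypothetical and no relation to Spark's actual internals: the parser only reports per-record success or failure, and the caller decides how to surface failures, for example as a value in the corrupt-record column.

```scala
case class ParsedRow(fields: Map[String, Any])

// Parser layer: no knowledge of columnNameOfCorruptRecord.
def parseRecord(record: String): Either[String, ParsedRow] =
  if (record.trim.startsWith("{") && record.trim.endsWith("}"))
    Right(ParsedRow(Map.empty)) // stand-in for real JSON parsing
  else
    Left(record)                // report the bad record as-is

// Upper level: maps failures into the corrupt-record column.
def toOutputRow(record: String, corruptColumn: String): Map[String, Any] =
  parseRecord(record) match {
    case Right(parsed) => parsed.fields
    case Left(bad)     => Map(corruptColumn -> bad)
  }
```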

Member Author

okay!

@HyukjinKwon
Member

Thanks for cc'ing me. I am okay if @cloud-fan is okay.

@cloud-fan
Contributor

LGTM

@SparkQA

SparkQA commented Feb 23, 2017

Test build #73312 has finished for PR 17023 at commit 00e0b7a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

thanks, merging to master!

@asfgit asfgit closed this in 769aa0f Feb 23, 2017
Yunni pushed a commit to Yunni/spark that referenced this pull request Feb 27, 2017
[SPARK-19695][SQL] Throw an exception if a `columnNameOfCorruptRecord` field violates requirements in json formats

## What changes were proposed in this pull request?
This PR comes from apache#16928 and fixes the JSON behaviour in the same way as the CSV one.

## How was this patch tested?
Added tests in `JsonSuite`.

Author: Takeshi Yamamuro <[email protected]>

Closes apache#17023 from maropu/SPARK-19695.