[SPARK-19695][SQL] Throw an exception if a columnNameOfCorruptRecord field violates requirements in json formats
#17023
Test build #73259 has finished for PR 17023 at commit
@HyukjinKwon @cloud-fan could you check this? Thanks.
corruptRecords.toDF("value")?
map to foreach?
And.. not a strong preference, but maybe put this in JacksonUtils and rename it JsonUtils, if @cloud-fan is okay?
Maybe we could throw an IllegalArgumentException in the first place and then capture the message in an AnalysisException (as JacksonUtils.verifySchema is doing in StructToJson). This is not a strong opinion either.
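For illustration only, the two-step approach suggested above might look roughly like the sketch below. It does not use Spark: `Field`, `Schema`, `AnalysisError`, and `CorruptRecordCheck` are hypothetical stand-ins so the snippet runs on its own, and the rule it enforces (the corrupt-record field must be a nullable string) is an assumption about the requirement, not something quoted from this thread.

```scala
// Hypothetical stand-ins for Spark's schema types, so the sketch runs without Spark.
sealed trait DataType
case object StringType extends DataType
case object IntegerType extends DataType

final case class Field(name: String, dataType: DataType, nullable: Boolean = true)

final case class Schema(fields: Seq[Field]) {
  // Index of the named field, if the schema contains it.
  def getFieldIndex(name: String): Option[Int] = {
    val i = fields.indexWhere(_.name == name)
    if (i >= 0) Some(i) else None
  }
  def apply(i: Int): Field = fields(i)
}

// Hypothetical stand-in for an analysis-level exception.
final class AnalysisError(msg: String) extends Exception(msg)

object CorruptRecordCheck {
  // Low-level check: raises IllegalArgumentException and knows nothing about analysis.
  // Assumed rule: the corrupt-record field, if present, must be a nullable string.
  def verifyCorruptField(schema: Schema, columnName: String): Unit =
    schema.getFieldIndex(columnName).foreach { idx =>
      val f = schema(idx)
      if (f.dataType != StringType || !f.nullable) {
        throw new IllegalArgumentException(
          s"The field '$columnName' for corrupt records must be string type and nullable")
      }
    }

  // Caller: captures the low-level message and rethrows it as an analysis-level error.
  def verifyOrFail(schema: Schema, columnName: String): Unit =
    try verifyCorruptField(schema, columnName)
    catch { case e: IllegalArgumentException => throw new AnalysisError(e.getMessage) }
}
```

A schema that does not contain the column at all passes the check, since `getFieldIndex` returns `None` and the `foreach` body never runs.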
Ideally the columnNameOfCorruptRecord handling has nothing to do with the parser. The parser just parses records and reports an error if some of them are bad; the upper level handles the bad records and may put them in a special string column.
I'm ok to keep this code snippet duplicated in 2 places.
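As a rough sketch of that separation (not Spark's actual code), the parser below only reports success or failure, and the caller decides where a bad record ends up. `CorruptRecordHandling`, `parseInt`, `Row`, and `toRows` are hypothetical names, and parsing a plain integer stands in for parsing a JSON record.

```scala
object CorruptRecordHandling {
  // Hypothetical parser: it only parses. On failure it hands back the raw input
  // and has no knowledge of any corrupt-record column.
  def parseInt(record: String): Either[String, Int] =
    try Right(record.trim.toInt)
    catch { case _: NumberFormatException => Left(record) }

  // Upper-level row shape: the parsed value plus a nullable string slot for bad input.
  final case class Row(value: Option[Int], corruptRecord: Option[String])

  // Upper level: decides what to do with bad records. Here it keeps the raw text
  // in the special string slot and leaves the parsed value empty.
  def toRows(records: Seq[String]): Seq[Row] =
    records.map { r =>
      parseInt(r) match {
        case Right(v)  => Row(Some(v), None)
        case Left(raw) => Row(None, Some(raw))
      }
    }
}
```

With this split, a different policy for bad records (dropping them, or failing fast) would only change `toRows`, not the parser.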
okay!
Thanks for cc'ing me. I am okay if @cloud-fan is okay.
LGTM
Test build #73312 has finished for PR 17023 at commit
thanks, merging to master!
…` field violates requirements in json formats
## What changes were proposed in this pull request?
This pr comes from apache#16928 and fixed a json behaviour along with the CSV one.
## How was this patch tested?
Added tests in `JsonSuite`.
Author: Takeshi Yamamuro <[email protected]>
Closes apache#17023 from maropu/SPARK-19695.
What changes were proposed in this pull request?
This PR comes from #16928 and fixes the JSON behaviour along with the CSV one.
How was this patch tested?
Added tests in `JsonSuite`.