-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-25734][SQL] Literal should have a value corresponding to dataType #22724
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
LGTM |
| case StringType => v.isInstanceOf[UTF8String] | ||
| case _: StructType => v.isInstanceOf[InternalRow] | ||
| case _: ArrayType => v.isInstanceOf[ArrayData] | ||
| case _: MapType => v.isInstanceOf[MapData] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should validate recursively for StructType, ArrayType, and MapType?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah good point!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, good catch ....!
|
Test build #97384 has finished for PR 22724 at commit
|
| } | ||
| require(v == null || doValidate(v, dataType), | ||
| s"Literal must have a corresponding value to ${dataType.catalogString}, " + | ||
| s"but ${if (v != null) s"class ${Utils.getSimpleName(v.getClass)}" else "null"} found.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but null found cannot happen logically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, yes. I'll update.
|
Test build #97419 has finished for PR 22724 at commit
|
| case BinaryType => v.isInstanceOf[Array[Byte]] | ||
| case StringType => v.isInstanceOf[UTF8String] | ||
| case st: StructType => | ||
| v.isInstanceOf[GenericInternalRow] && { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is definitely not true. can we do
v.isInstanceOf[InternalRow] && {
val row = v.asInstanceOf[InternalRow]
st.fields.map(_.dataType).zipWithIndex.foreach {
case (dt, i) => doValidate(row.get(i, dt), dt)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, yes. ok, I'll fix.
| doValidate(map.keyArray.array.head, mt.keyType) && | ||
| doValidate(map.valueArray.array.head, mt.valueType) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering whether we don't need to check the whole elements for ArrayType and MapType.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the whole element check seems to be expensive, the current one is ok to me.
HyukjinKwon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM too if the tests pass
| } | ||
| case ObjectType(cls) => cls.isInstance(v) | ||
| case udt: UserDefinedType[_] => doValidate(v, udt.sqlType) | ||
| case _ => false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to add NullType case?
nvm, not needed usually.
|
Test build #97427 has finished for PR 22724 at commit
|
| case BinaryType => v.isInstanceOf[Array[Byte]] | ||
| case StringType => v.isInstanceOf[UTF8String] | ||
| case st: StructType => | ||
| v.isInstanceOf[InternalRow] && { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we do the same for array and map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| case mt: MapType => | ||
| v.isInstanceOf[ArrayBasedMapData] && { | ||
| val map = v.asInstanceOf[ArrayBasedMapData] | ||
| map.numElements() == 0 || { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need this. The array validation already consider numElements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you suggested like this?
case mt: MapType =>
v.isInstanceOf[MapData] && {
val map = v.asInstanceOf[MapData]
doValidate(map.keyArray(), ArrayType(mt.keyType)) &&
doValidate(map.valueArray(), ArrayType(mt.valueType))
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated.
|
Test build #97431 has finished for PR 22724 at commit
|
|
LGTM pending jenkins |
|
Test build #97432 has finished for PR 22724 at commit
|
|
I'm checking the reason of the test failures... |
|
Test build #97433 has finished for PR 22724 at commit
|
|
Test build #97439 has finished for PR 22724 at commit
|
|
Test build #97443 has finished for PR 22724 at commit
|
|
LGTM, but this is a breaking change, so I'd suggest to do it only for 3.0 and add a note to the migration guide. WDYT? |
|
yea, its ok to merge this into v3.0 only. But, we need to update the guide? |
|
|
|
yes, you're right. So we don't need any migration note. But what if a user uses |
|
yea +1 on 3.0 only, this is kind of a developer API, advanced users may use it. |
|
Test build #97451 has finished for PR 22724 at commit
|
|
Test build #97457 has finished for PR 22724 at commit
|
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM.
|
thanks, merging to master! |
## What changes were proposed in this pull request? `Literal.value` should have a value a value corresponding to `dataType`. This pr added code to verify it and fixed the existing tests to do so. ## How was this patch tested? Modified the existing tests. Closes apache#22724 from maropu/SPARK-25734. Authored-by: Takeshi Yamamuro <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
Literal.valueshould have a value a value corresponding todataType. This pr added code to verify it and fixed the existing tests to do so.How was this patch tested?
Modified the existing tests.