Skip to content
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -122,7 +122,7 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData {
if (!o2.isInstanceOf[Double] || ! java.lang.Double.isNaN(o2.asInstanceOf[Double])) {
return false
}
case _ => if (o1 != o2) {
case _ => if (!o1.equals(o2)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any needs to handle Array[Byte] separately above?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in java byte[] or other primitive arrays doesn't have a proper equals implementation.

scala> Array(1) == Array(1)
res0: Boolean = false

return false
}
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -104,4 +104,13 @@ class ComplexDataSuite extends SparkFunSuite {
// The copied data should not be changed externally.
assert(copied.getStruct(0, 1).getUTF8String(0).toString == "a")
}

test("SPARK-24659: GenericArrayData.equals should respect element type differences") {
// Spark SQL considers array<int> and array<long> to be incompatible,
// so an underlying implementation of array type should return false in this case.
val array1 = new GenericArrayData(Array[Int](123))
val array2 = new GenericArrayData(Array[Long](123L))

assert(!array1.equals(array2))
Copy link
Member

@MaxGekk MaxGekk Jun 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check positive case when two arrays have the same element type? and the same elements.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Can you test short and byte, too?


// GenericArrayData
scala> val arrayByte= new GenericArrayData(Array[Short](123.toByte))
scala> val arrayShort = new GenericArrayData(Array[Short](123.toShort))
scala> val arrayInt = new GenericArrayData(Array[Int](123))
scala> val arrayLong = new GenericArrayData(Array[Long](123L))
scala> arrayByte.equals(arrayLong)
res8: Boolean = true

scala> arrayByte.equals(arrayInt)
res9: Boolean = true

scala> arrayShort.equals(arrayInt)
res10: Boolean = true

scala> arrayShort.equals(arrayLong)
res11: Boolean = true


// UnsafeArrayData
scala> val unsafeByte = ExpressionEncoder[Array[Byte]].resolveAndBind().toRow(arrayByte).getArray(0)
scala> val unsafeShort = ExpressionEncoder[Array[Short]].resolveAndBind().toRow(arrayShort).getArray(0)
scala> val unsafeInt = ExpressionEncoder[Array[Int]].resolveAndBind().toRow(arrayInt).getArray(0)
scala> val unsafeLong = ExpressionEncoder[Array[Long]].resolveAndBind().toRow(arrayLong).getArray(0)
scala> arrayByte.equals(arrayLong)
res12: Boolean = false

scala> arrayByte.equals(arrayInt)
res13: Boolean = false

scala> arrayShort.equals(arrayInt)
res14: Boolean = false

scala> arrayShort.equals(arrayLong)
res15: Boolean = false

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without schema, Spark can never compare a generic and an unsafe array. This fix is not a real bug fix, but makes the GenericArrayData more clear about equals semantic, which is good to have, and might be useful when writing tests.

}
}