Skip to content

Conversation

@aray
Copy link
Contributor

@aray aray commented Dec 5, 2016

What changes were proposed in this pull request?

Fixes compile errors in generated code when user has case class with a scala.collections.immutable.Map instead of a scala.collections.Map. Since ArrayBasedMapData.toScalaMap returns the immutable version we can make it work with both.

How was this patch tested?

Additional unit tests.

@SparkQA
Copy link

SparkQA commented Dec 6, 2016

Test build #69696 has finished for PR 16161 at commit 699cc75.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

StaticInvoke(
ArrayBasedMapData.getClass,
ObjectType(classOf[Map[_, _]]),
ObjectType(classOf[scala.collection.immutable.Map[_, _]]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know this bit well, but the comments on the JIRA indicate that immutable maps aren't supported. However, this change should be a no-op because Predef.Map is already immutable.Map right -- how does this address it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srowen scala.collection.Map is imported at the beginning of RowEncoder and ScalaReflection.

@rxin
Copy link
Contributor

rxin commented Dec 6, 2016

cc @cloud-fan

@cloud-fan
Copy link
Contributor

can you highlight your approach to fix this bug? It looks to me that this is not a general fix, e.g. mutable.Map will still fail.

@aray
Copy link
Contributor Author

aray commented Dec 6, 2016

The approach is to change the deserializer (via ScalaReflection#deserializerFor) to return the more specific type scala.collections.immutable.Map instead of scala.collections.Map as it does now (no change to ArrayBasedMapData#toScalaMap is needed as it already returns an immutable map). This allows the generated code for case classes using either of those Map's to work as they are both assignable from scala.collections.immutable.Map. This PR does not address scala.collections.mutable.Map, it still wont work.

@SparkQA
Copy link

SparkQA commented Dec 6, 2016

Test build #69741 has finished for PR 16161 at commit f325e75.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Dec 6, 2016

OK, it sounds like what you're doing is unsupported at the moment and this doesn't change that, and you have a workaround. Is this worth merging?

@aray
Copy link
Contributor Author

aray commented Dec 6, 2016

Right now it's not supported to have the following:

case class Foo(a: Map[Int, Int])

(using the scala Predef version of Map)

The documented way to do this is:

case class Foo(a: scala.collections.Map[Int, Int])

and it works fine.

However if someone did not read the documentation carefully and used the former, instead of a reasonable error they get a compile error on the code spark generates. Therefore IMHO there are two options:

  1. Make the generated code work with either version of Map (this PR)
  2. Intercept definitions that use the wrong version of Map and issue a proper error.

If the consensus is that this PR is not worth it then I'll be happy to work on option 2. But in my opinion as a Spark user option 1 is better.

@cloud-fan
Copy link
Contributor

option 1 is better, but this PR only adds the support for scala.collection.immutable.Map, with a very specific way: we can't extend it to support scala.collection.mutable.Map or List.

I think we need a general solution to support custom collection type.

@aray
Copy link
Contributor Author

aray commented Dec 7, 2016

I would be happy to create a seperate PR for adding support for mutable.Map (and List) if that is wanted. But there is no generic solution as there is no type that is assignable to both mutable.Map and immutable.Map.

@cloud-fan
Copy link
Contributor

there is no type that is assignable to both mutable.Map and immutable.Map

Yea, and a general solution would be, remember the original type and create that type when converting catalyst value to external value.

But this may be hard to implement, it may worth to merge this simple PR, as the scala.collection.immutable.Map is so common in scala. cc @rxin @yhuai @liancheng

…-codegen

# Conflicts:
#	sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala
@SparkQA
Copy link

SparkQA commented Dec 12, 2016

Test build #70027 has finished for PR 16161 at commit e0b3e76.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@asfgit asfgit closed this in 46d30ac Dec 13, 2016
robert3005 pushed a commit to palantir/spark that referenced this pull request Dec 15, 2016
…able.Map also

## What changes were proposed in this pull request?

Fixes compile errors in generated code when user has case class with a `scala.collections.immutable.Map` instead of a `scala.collections.Map`. Since ArrayBasedMapData.toScalaMap returns the immutable version we can make it work with both.

## How was this patch tested?

Additional unit tests.

Author: Andrew Ray <[email protected]>

Closes apache#16161 from aray/fix-map-codegen.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…able.Map also

## What changes were proposed in this pull request?

Fixes compile errors in generated code when user has case class with a `scala.collections.immutable.Map` instead of a `scala.collections.Map`. Since ArrayBasedMapData.toScalaMap returns the immutable version we can make it work with both.

## How was this patch tested?

Additional unit tests.

Author: Andrew Ray <[email protected]>

Closes apache#16161 from aray/fix-map-codegen.
@liancheng
Copy link
Contributor

liancheng commented Feb 10, 2017

Shall we backport this to branch-2.1? I'd consider this as a bug because the following snippet fail in Spark 2.1:

case class Wrapper1(value: Option[Map[String, String]])
case class Wrapper2(value: Map[String, String])

val ds1 = Seq.empty[Wrapper1].toDS()  // Fail
val ds2 = Seq.empty[Wrapper2].toDS()  // Succeed

@cloud-fan
Copy link
Contributor

I'm fine to backport this

@liancheng
Copy link
Contributor

Thanks. Backported to branch-2.1.

asfgit pushed a commit that referenced this pull request Feb 11, 2017
…able.Map also

## What changes were proposed in this pull request?

Fixes compile errors in generated code when user has case class with a `scala.collections.immutable.Map` instead of a `scala.collections.Map`. Since ArrayBasedMapData.toScalaMap returns the immutable version we can make it work with both.

## How was this patch tested?

Additional unit tests.

Author: Andrew Ray <[email protected]>

Closes #16161 from aray/fix-map-codegen.

(cherry picked from commit 46d30ac)
Signed-off-by: Cheng Lian <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants