[SPARK-10327][SQL] Cache Table is not working while subquery has alias in its project list #8494
Closed
chenghao-intel wants to merge 2 commits into apache:master from chenghao-intel:weird_cache
Conversation
Contributor (Author)
cc @marmbrus

Test build #41720 has finished for PR 8494 at commit
Contributor
Why does this have ()? It also needs scala doc. However, since its only ever used once, I'd consider just inlining it.
Force-pushed from bfd40d9 to fc63b89
Test build #41812 has finished for PR 8494 at commit

Test build #41852 has finished for PR 8494 at commit
Contributor
Thanks, merging to master.
marmbrus added a commit to marmbrus/spark that referenced this pull request on Sep 8, 2015
asfgit pushed a commit that referenced this pull request on Sep 8, 2015
Author: Michael Armbrust <michael@databricks.com> Closes #8659 from marmbrus/testBuildBreak.
feynmanliang pushed a commit to feynmanliang/spark that referenced this pull request on Sep 10, 2015
* apache/master: (65 commits)
  [SPARK-10065] [SQL] avoid the extra copy when generate unsafe array
  [SPARK-10497] [BUILD] [TRIVIAL] Handle both locations for JIRAError with python-jira
  [MINOR] [MLLIB] [ML] [DOC] fixed typo: label for negative result should be 0.0 (original: 1.0)
  [SPARK-9772] [PYSPARK] [ML] Add Python API for ml.feature.VectorSlicer
  [SPARK-9730] [SQL] Add Full Outer Join support for SortMergeJoin
  [SPARK-10461] [SQL] make sure `input.primitive` is always variable name not code at `GenerateUnsafeProjection`
  [SPARK-10481] [YARN] SPARK_PREPEND_CLASSES make spark-yarn related jar could n…
  [SPARK-10117] [MLLIB] Implement SQL data source API for reading LIBSVM data
  [SPARK-10227] fatal warnings with sbt on Scala 2.11
  [SPARK-10249] [ML] [DOC] Add Python Code Example to StopWordsRemover User Guide
  [SPARK-9654] [ML] [PYSPARK] Add IndexToString to PySpark
  [SPARK-10094] Pyspark ML Feature transformers marked as experimental
  [SPARK-10373] [PYSPARK] move @SInCE into pyspark from sql
  [SPARK-10464] [MLLIB] Add WeibullGenerator for RandomDataGenerator
  [SPARK-9834] [MLLIB] implement weighted least squares via normal equation
  [SPARK-10071] [STREAMING] Output a warning when writing QueueInputDStream and throw a better exception when reading QueueInputDStream
  [RELEASE] Add more contributors & only show names in release notes.
  [HOTFIX] Fix build break caused by apache#8494
  [SPARK-10327] [SQL] Cache Table is not working while subquery has alias in its project list
  [SPARK-10492] [STREAMING] [DOCUMENTATION] Update Streaming documentation about rate limiting and backpressure
  ...
mbautin pushed a commit to mbautin/spark that referenced this pull request on Oct 27, 2015
[SPARK-10327] [SQL] Cache Table is not working while subquery has alias in its project list
```scala
import org.apache.spark.sql.columnar.InMemoryColumnarTableScan
import org.apache.spark.sql.hive.execution.HiveTableScan

sql("select key, value, key + 1 from src").registerTempTable("abc")
cacheTable("abc")
val sparkPlan = sql(
  """select a.key, b.key, c.key from
    |abc a join abc b on a.key = b.key
    |join abc c on a.key = c.key""".stripMargin).queryExecution.sparkPlan
assert(sparkPlan.collect { case e: InMemoryColumnarTableScan => e }.size === 3) // fails: only 1 in-memory scan
assert(sparkPlan.collect { case e: HiveTableScan => e }.size === 0)             // fails: 2 Hive scans remain
```
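The repro above caches `abc`, yet only the first occurrence is read from the in-memory relation. A minimal self-contained sketch of the underlying mismatch: after analysis, each reference to the temp table is wrapped in `Subquery` alias nodes, so a naive structural comparison against the cached plan fails. All names here (`Plan`, `Subquery`, `stripSubqueries`) are hypothetical toy stand-ins; the real matching logic lives in Catalyst's `sameResult`.

```scala
// Toy model of Catalyst plan matching, not the actual Spark classes.
sealed trait Plan
case class Relation(table: String) extends Plan
case class Project(exprs: Seq[String], child: Plan) extends Plan
case class Subquery(alias: String, child: Plan) extends Plan

// Drop alias wrappers before comparing, mirroring the idea behind the fix.
def stripSubqueries(p: Plan): Plan = p match {
  case Subquery(_, child)  => stripSubqueries(child)
  case Project(es, child)  => Project(es, stripSubqueries(child))
  case other               => other
}

val cached = Project(Seq("key", "value", "(key + 1) AS _c2"), Relation("src"))
// What the analyzer produces for each occurrence of `abc a` in the query:
val query = Subquery("a", Subquery("abc",
  Project(Seq("key", "value", "(key + 1) AS _c2"), Relation("src"))))

assert(cached != query)                  // naive comparison misses the cache
assert(cached == stripSubqueries(query)) // matches once aliases are stripped
```

Once the wrappers are ignored, all three scans of `abc` can resolve to the same `InMemoryRelation`.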
The actual plan is:
```
== Parsed Logical Plan ==
'Project [unresolvedalias('a.key),unresolvedalias('b.key),unresolvedalias('c.key)]
 'Join Inner, Some(('a.key = 'c.key))
  'Join Inner, Some(('a.key = 'b.key))
   'UnresolvedRelation [abc], Some(a)
   'UnresolvedRelation [abc], Some(b)
  'UnresolvedRelation [abc], Some(c)

== Analyzed Logical Plan ==
key: int, key: int, key: int
Project [key#14,key#61,key#66]
 Join Inner, Some((key#14 = key#66))
  Join Inner, Some((key#14 = key#61))
   Subquery a
    Subquery abc
     Project [key#14,value#15,(key#14 + 1) AS _c2#16]
      MetastoreRelation default, src, None
   Subquery b
    Subquery abc
     Project [key#61,value#62,(key#61 + 1) AS _c2#58]
      MetastoreRelation default, src, None
  Subquery c
   Subquery abc
    Project [key#66,value#67,(key#66 + 1) AS _c2#63]
     MetastoreRelation default, src, None

== Optimized Logical Plan ==
Project [key#14,key#61,key#66]
 Join Inner, Some((key#14 = key#66))
  Project [key#14,key#61]
   Join Inner, Some((key#14 = key#61))
    Project [key#14]
     InMemoryRelation [key#14,value#15,_c2#16], true, 10000, StorageLevel(true, true, false, true, 1), (Project [key#14,value#15,(key#14 + 1) AS _c2#16]), Some(abc)
    Project [key#61]
     MetastoreRelation default, src, None
  Project [key#66]
   MetastoreRelation default, src, None

== Physical Plan ==
TungstenProject [key#14,key#61,key#66]
 BroadcastHashJoin [key#14], [key#66], BuildRight
  TungstenProject [key#14,key#61]
   BroadcastHashJoin [key#14], [key#61], BuildRight
    ConvertToUnsafe
     InMemoryColumnarTableScan [key#14], (InMemoryRelation [key#14,value#15,_c2#16], true, 10000, StorageLevel(true, true, false, true, 1), (Project [key#14,value#15,(key#14 + 1) AS _c2#16]), Some(abc))
    ConvertToUnsafe
     HiveTableScan [key#61], (MetastoreRelation default, src, None)
    ConvertToUnsafe
     HiveTableScan [key#66], (MetastoreRelation default, src, None)
```
Author: Cheng Hao <hao.cheng@intel.com>
Closes apache#8494 from chenghao-intel/weird_cache.
Contributor
Would it make sense to use transformAllExpressions here instead of enumerating all possible ways of how expressions may occur in the logical plan, or would that produce a different result?
Contributor
Hmmm, yeah that would probably work.
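To illustrate the reviewer's suggestion, here is a hypothetical self-contained sketch of the `transformAllExpressions` idea: rather than enumerating every plan node type that can hold expressions, walk the whole plan once and apply a rewrite to every expression it contains. The toy `Expr`/`Plan` types and the traversal are stand-ins, not Catalyst's actual classes.

```scala
// Toy expression and plan trees (hypothetical, modeled on Catalyst).
sealed trait Expr
case class Attr(name: String, id: Int) extends Expr

sealed trait Plan
case object Leaf extends Plan
case class Project(exprs: Seq[Expr], child: Plan) extends Plan
case class Filter(cond: Expr, child: Plan) extends Plan

// One generic traversal instead of per-node special cases.
def transformAllExpressions(p: Plan)(f: Expr => Expr): Plan = p match {
  case Project(es, c) => Project(es.map(f), transformAllExpressions(c)(f))
  case Filter(e, c)   => Filter(f(e), transformAllExpressions(c)(f))
  case Leaf           => Leaf
}

// Example rewrite: canonicalize expression ids everywhere in the plan.
val plan  = Project(Seq(Attr("key", 14)), Filter(Attr("key", 14), Leaf))
val canon = transformAllExpressions(plan) { case Attr(n, _) => Attr(n, 0) }
assert(canon == Project(Seq(Attr("key", 0)), Filter(Attr("key", 0), Leaf)))
```

The appeal is that new plan node types only need to participate in the generic traversal, instead of each rewrite rule listing them explicitly.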
markhamstra pushed a commit to markhamstra/spark that referenced this pull request on Nov 3, 2015