Skip to content

Conversation

@dongjoon-hyun
Copy link
Member

What changes were proposed in this pull request?

This PR implements inline table generating function.

How was this patch tested?

Pass the Jenkins tests with new testcase.

@SparkQA
Copy link

SparkQA commented Jun 29, 2016

Test build #61479 has finished for PR 13976 at commit 1add31a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Inline(child: Expression) extends UnaryExpression with Generator with CodegenFallback

@dongjoon-hyun
Copy link
Member Author

The failure seems to be irrelevant to this PR.

Exception encountered when attempting to run a suite with class name: org.apache.spark.sql.jdbc.JDBCWriteSuite *** ABORTED *** (886 milliseconds)
[info]   org.h2.jdbc.JdbcSQLException: Schema "TEST" already exists; SQL statement:

@dongjoon-hyun
Copy link
Member Author

Retest this please.

@SparkQA
Copy link

SparkQA commented Jun 29, 2016

Test build #61494 has finished for PR 13976 at commit 1add31a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Inline(child: Expression) extends UnaryExpression with Generator with CodegenFallback

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

give an example in extended?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

@SparkQA
Copy link

SparkQA commented Jun 30, 2016

Test build #61542 has finished for PR 13976 at commit e8d284c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Hi, @rxin .

I rebased and updated according to the previous PosExplode PR.
By the way, I didn't register into sql/Dataset.scala and python/R since I was confused at map_keys/map_values in #13967.

If you give an advice, I will proceed this PR (inline) and #13967 (map_keys/map_values) further.

@rxin
Copy link
Contributor

rxin commented Jun 30, 2016

looks pretty good. I will let @cloud-fan review and do the merge.

@dongjoon-hyun
Copy link
Member Author

Yep. Thank you, @rxin .

@SparkQA
Copy link

SparkQA commented Jun 30, 2016

Test build #61562 has finished for PR 13976 at commit be408ac.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about

StructType(et.fields.zipWithIndex.map {
  case (field, index) => StructField(s"col{$index + 1}", field.dataType, nullable = field.nullable)
})

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually that's the best !

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Thanks, @cloud-fan !

@dongjoon-hyun
Copy link
Member Author

Thank you, @cloud-fan . :) I've learn a lot in this PR again.
The followings are applied.

  • Use inputArray.getStruct(i, ncol)
  • Keep the original field name
  • Fix elementSchema generation style
  • Add a column-based test, Array() expression-level test, add empty row test.

@SparkQA
Copy link

SparkQA commented Jul 1, 2016

Test build #61615 has finished for PR 13976 at commit 9382f64.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Inline(child: Expression) extends UnaryExpression with Generator with CodegenFallback


override def elementSchema: StructType = child.dataType match {
case ArrayType(et : StructType, _) =>
StructType(et.fields.zipWithIndex.map {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, so it's just et now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Currently, our type checker ensures that homogeneous StructType array.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not return et directly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, my god. I was too naive, here.
Thank you!

@dongjoon-hyun
Copy link
Member Author

Thank you, @cloud-fan .
I renovated ugly testcases with create_row()!

@SparkQA
Copy link

SparkQA commented Jul 1, 2016

Test build #61622 has started for PR 13976 at commit c43a187.

@shaneknapp
Copy link
Contributor

jenkins, test this please

@dongjoon-hyun
Copy link
Member Author

Thank you, @shaneknapp ! :)

@shaneknapp
Copy link
Contributor

no problem... sorry i had to kill it, but i wanted to get a quick jenkins restart in. :)

@SparkQA
Copy link

SparkQA commented Jul 1, 2016

Test build #61624 has finished for PR 13976 at commit c43a187.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Row(3) :: Nil)
}

test("inline raises exception on empty array") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on array of null type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. That's more clear.

@dongjoon-hyun
Copy link
Member Author

Thank you, @cloud-fan .
elementSchema is simplified and the testcase name is updated.

@SparkQA
Copy link

SparkQA commented Jul 2, 2016

Test build #61647 has finished for PR 13976 at commit ebd9cfa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Oh, I found a bug and am working on this.

@dongjoon-hyun
Copy link
Member Author

Actually, that is not a bug, but I found that there exists a little difference between Spark and Hive with the following query.

SELECT inline(array(struct(a), struct(b))) FROM (SELECT 1 a, 2 b) T

In short, Spark does more strict type-checking, e.g., [struct<a:int>, struct<b:int>] is considered heterogeneous due to name difference.

I only add more tests to clarify the cases. We cannot touch that because it depends on many things.

The following query is a workaround which both Spark/Hive work.

SELECT inline(array(struct(a), named_struct('a', b))) FROM (SELECT 1 a, 2 b) T

@SparkQA
Copy link

SparkQA commented Jul 3, 2016

Test build #61669 has finished for PR 13976 at commit 31ffa75.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 3, 2016

Test build #61670 has finished for PR 13976 at commit fed3ba2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 3, 2016

Test build #61676 has finished for PR 13976 at commit e260359.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Nil
} else {
for (i <- 0 until inputArray.numElements())
yield inputArray.getStruct(i, numFields)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how is the performance of for-yield, maybe it's safe to create an array manually and use while loop here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @cloud-fan . By the way, for about this, @rxin gave me an advice at the first commit of this PR.

we don't need to materialize the array, do we? We can create an iterator to return the results.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah i see, for-yield returns an iterator.

@cloud-fan
Copy link
Contributor

merging to master, thanks!

@asfgit asfgit closed this in 88134e7 Jul 3, 2016
@dongjoon-hyun
Copy link
Member Author

Thank you, @cloud-fan and @rxin ! :)

asfgit pushed a commit that referenced this pull request Jul 8, 2016
This PR implements `inline` table generating function.

Pass the Jenkins tests with new testcase.

Author: Dongjoon Hyun <[email protected]>

Closes #13976 from dongjoon-hyun/SPARK-16288.

(cherry picked from commit 88134e7)
Signed-off-by: Reynold Xin <[email protected]>
@dongjoon-hyun dongjoon-hyun deleted the SPARK-16288 branch July 20, 2016 07:42
LuciferYang pushed a commit that referenced this pull request May 28, 2024
### What changes were proposed in this pull request?
The pr aims to upgrade `netty` from `4.1.109.Final` to `4.1.110.Final`.

### Why are the changes needed?
- https://netty.io/news/2024/05/22/4-1-110-Final.html
  This version has brought some bug fixes and improvements, such as:
  Fix Zstd throws Exception on read-only volumes (netty/netty#13982)
  Add unix domain socket transport in netty 4.x via JDK16+ ([#13965](netty/netty#13965))
  Backport #13075: Add the AdaptivePoolingAllocator ([#13976](netty/netty#13976))
  Add no-value key handling only for form body ([#13998](netty/netty#13998))
  Add support for specifying SecureRandom in SSLContext initialization ([#14058](netty/netty#14058))

- https://github.com/netty/netty/issues?q=milestone%3A4.1.110.Final+is%3Aclosed

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #46744 from panbingkun/SPARK-48420.

Authored-by: panbingkun <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants