@wuxianxingkong

What changes were proposed in this pull request?

This PR implements the SELECT INTO statement.

The SELECT INTO statement selects data from one table and inserts it into a new table as follows.

SELECT column_name(s)
INTO newtable
FROM table1;

This statement is commonly used in SQL but is not currently supported in Spark SQL.
We investigated Catalyst and found that it can be implemented by extending the grammar and reusing the logical plan of CTAS (CREATE TABLE AS SELECT).
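
To make the relationship with CTAS concrete, here is a minimal sketch that rewrites a simple SELECT INTO statement into its CTAS text form. It is only an illustration: the object name `SelectIntoSketch` and the method `toCtas` are hypothetical, the regex handles only the simplest statement shape, and the PR itself works on the ANTLR grammar and the logical plan rather than on SQL strings.

```scala
// Toy illustration only (hypothetical names, not part of the PR):
// rewrites "SELECT <cols> INTO <table> [rest]" into the equivalent
// "CREATE TABLE <table> AS SELECT <cols> [rest]" text.
object SelectIntoSketch {
  // (?is): case-insensitive, and '.' also matches newlines.
  private val SelectInto = """(?is)\s*SELECT\s+(.+?)\s+INTO\s+(\w+)(.*)""".r

  /** Returns the CTAS form of a SELECT INTO statement, or None if the input is not one. */
  def toCtas(sql: String): Option[String] = sql match {
    case SelectInto(columns, table, rest) =>
      // Drop surrounding whitespace and a trailing semicolon from the remainder.
      val tail = rest.trim.stripSuffix(";").trim
      val select = if (tail.isEmpty) s"SELECT $columns" else s"SELECT $columns $tail"
      Some(s"CREATE TABLE $table AS $select")
    case _ => None
  }
}
```

For example, `SelectIntoSketch.toCtas("SELECT a, b INTO newtable FROM table1;")` yields the text `CREATE TABLE newtable AS SELECT a, b FROM table1`, while a plain `SELECT a FROM t` is left alone (`None`).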

The related JIRA is https://issues.apache.org/jira/browse/SPARK-16217

How was this patch tested?

SQLQuerySuite.

@AmplabJenkins

Can one of the admins verify this patch?

  fromClause?
  (WHERE where=booleanExpression)?)
- | ((kind=SELECT setQuantifier? namedExpressionSeq fromClause?
+ | ((kind=SELECT setQuantifier? namedExpressionSeq (intoClause? fromClause)?
Member

Hi, @wuxianxingkong .
Currently, the following case does not seem to be handled. Could you modify the syntax to support it as well?

SELECT 1
INTO newtable

Author

@wuxianxingkong Jul 15, 2016

Hello, @dongjoon-hyun , thank you for your advice.

SELECT 1 
INTO newtable

This won't work because we need information from oldtable to create newtable. So the SQL should be:

SELECT 1
INTO newtable 
FROM oldtable

In my test, this created a new table called newtable with a single column named 1. The column has as many rows as oldtable, and every value is 1.
Did you mean the case where there is no FROM clause?

Member

In the Spark shell, please run the following.

sql("select 1")

Author

@wuxianxingkong Jul 16, 2016

@dongjoon-hyun
At first, I modified the grammar:
[image: wrong select into]
But this affects the multiInsertQueryBody rule, e.g.:

FROM OLD_TABLE
INSERT INTO T1
SELECT C1
INSERT INTO T2
SELECT C2

The syntax tree before adding intoClause is:
[image: right tree structure]
After adding intoClause, the tree becomes:
[image: wrong tree structure]
This happens because INSERT is a non-reserved keyword and because of ANTLR's matching strategy.
One approach I can think of is to change the grammar like this:
[image: one way]
This solves the problem because the ANTLR parser chooses the alternative specified first.
The grammar now supports "SELECT 1 INTO newtable".
However, the duplication makes the querySpecification rule confusing. Is there a way to make the syntax less verbose? Thanks.
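
The effect of alternative order can be sketched with a toy model. This assumes the ambiguity arises because a non-reserved INSERT can also be read as a column alias, so that `SELECT C1 INSERT INTO T2` could mean "C1 aliased as INSERT, then INTO T2"; the grammar screenshots are not reproduced here, so that reading is an assumption. All names (`OrderedChoiceSketch`, `asNextInsert`, `asAliasAndInto`) are hypothetical, and the code only mimics "first listed alternative wins"; it is not ANTLR's actual prediction algorithm.

```scala
// Toy model of "the alternative specified first wins" (hypothetical names, not ANTLR itself).
object OrderedChoiceSketch {
  sealed trait Parse
  // "INSERT INTO <t>" read as the start of the next multi-insert branch.
  final case class MultiInsertBoundary(target: String) extends Parse
  // The same tokens read as "<alias> INTO <t>", where the alias happens to be INSERT.
  final case class AliasThenInto(alias: String, target: String) extends Parse

  type Alternative = List[String] => Option[Parse]

  val asNextInsert: Alternative = {
    case "INSERT" :: "INTO" :: t :: _ => Some(MultiInsertBoundary(t))
    case _ => None
  }

  val asAliasAndInto: Alternative = {
    case alias :: "INTO" :: t :: _ => Some(AliasThenInto(alias, t))
    case _ => None
  }

  /** Tries the alternatives in the given order and returns the first successful parse. */
  def parse(alternatives: List[Alternative], tokens: List[String]): Option[Parse] =
    alternatives.iterator.map(alt => alt(tokens)).collectFirst { case Some(p) => p }
}
```

With the tokens that follow `SELECT C1` in the multi-insert example, `List("INSERT", "INTO", "T2")`, listing `asNextInsert` first keeps the multi-insert reading, while listing `asAliasAndInto` first turns INSERT into an alias.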

@dongjoon-hyun
Member

Hi, @wuxianxingkong .
Although I'm just a contributor like you, I left a few comments for you because I like your PR.
I hope your PR will be merged soon.

// Add organization statements.
optionalMap(ctx.queryOrganization)(withQueryResultClauses).
// Add insert.
optionalMap(ctx.insertInto())(withInsertInto)
Contributor

This allows for the following syntax:

INSERT INTO tbl_a
SELECT *
INTO tbl_a
FROM tbl_b

Make sure that we cannot have both.

Contributor

We also need to check what this does with multi-insert syntax, i.e.:

FROM tbl_a
INSERT INTO tbl_b
SELECT *
INSERT INTO tbl_c
SELECT *
INTO tbl_c
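
The two cases raised in these comments (INSERT INTO combined with SELECT INTO, and multi-insert combined with SELECT INTO) could be rejected by a check along the following lines. This is a sketch over a simplified, hypothetical model (`ParsedQuery`, `IntoClauseCheck`); it does not use Spark's AstBuilder or parser context types, and the error wording is made up for illustration.

```scala
// Hypothetical, simplified model of a parsed statement; not Spark's AstBuilder types.
final case class ParsedQuery(
    insertTargets: Seq[String],       // tables named by INSERT INTO clauses
    selectIntoTarget: Option[String]  // table named by SELECT ... INTO, if any
)

object IntoClauseCheck {
  /** Returns Left(message) when INSERT INTO (single or multi) is combined with SELECT INTO. */
  def validate(q: ParsedQuery): Either[String, ParsedQuery] =
    if (q.insertTargets.nonEmpty && q.selectIntoTarget.isDefined) {
      val kind = if (q.insertTargets.size > 1) "multi-insert" else "INSERT INTO"
      Left(s"$kind and SELECT INTO cannot be used in the same statement")
    } else {
      Right(q)
    }
}
```

Under this model, the first example above corresponds to `ParsedQuery(Seq("tbl_a"), Some("tbl_a"))` and the multi-insert example to `ParsedQuery(Seq("tbl_b", "tbl_c"), Some("tbl_c"))`; both are rejected, while a statement with only one kind of INTO passes.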

2. Add a check in the multi-insert query syntax: multi-insert and SELECT INTO must not appear in the same statement.
3. Add a check in the single-insert query: INSERT INTO and SELECT INTO must not appear in the same statement.
*/
protected def withSelectInto(
ctx: IntoClauseContext,
query: LogicalPlan): LogicalPlan = withOrigin(ctx) {
Member

@gatorsmile Jun 13, 2017

Why throw a ParseException?

@gatorsmile
Member

@wuxianxingkong Are you still working on this? Thanks!

@gatorsmile
Member

We are closing it due to inactivity. Please do reopen it if you want to push it forward. Thanks!

@asfgit asfgit closed this in b32bd00 Jun 27, 2017
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
## What changes were proposed in this pull request?

This PR proposes to close stale PRs, mostly the same kind of instances as in apache#18017.

I believe the author in apache#14807 removed his account.

Closes apache#7075
Closes apache#8927
Closes apache#9202
Closes apache#9366
Closes apache#10861
Closes apache#11420
Closes apache#12356
Closes apache#13028
Closes apache#13506
Closes apache#14191
Closes apache#14198
Closes apache#14330
Closes apache#14807
Closes apache#15839
Closes apache#16225
Closes apache#16685
Closes apache#16692
Closes apache#16995
Closes apache#17181
Closes apache#17211
Closes apache#17235
Closes apache#17237
Closes apache#17248
Closes apache#17341
Closes apache#17708
Closes apache#17716
Closes apache#17721
Closes apache#17937

Added:
Closes apache#14739
Closes apache#17139
Closes apache#17445
Closes apache#18042
Closes apache#18359

Added:
Closes apache#16450
Closes apache#16525
Closes apache#17738

Added:
Closes apache#16458
Closes apache#16508
Closes apache#17714

Added:
Closes apache#17830
Closes apache#14742

## How was this patch tested?

N/A

Author: hyukjinkwon

Closes apache#18417 from HyukjinKwon/close-stale-pr.