@jongyoul
Member

What is this PR for?

Changes the scheduler in JdbcInterpreter from FIFO to parallel. Parallel scheduling is the default behaviour of HiveInterpreter, so when we merge all JDBC-like interpreters into JDBC, the default behaviour of JdbcInterpreter needs to change to match.

What type of PR is it?

[Feature]

Todos

  • [x] - Changed scheduler

What is the Jira issue?

  • https://issues.apache.org/jira/browse/ZEPPELIN-995

How should this be tested?

Run multiple queries at the same time and confirm that they execute simultaneously rather than one after another.
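The difference between FIFO and parallel execution can be illustrated outside Zeppelin with a minimal sketch. It assumes plain `java.util.concurrent` executors as stand-ins for the two schedulers (a single-thread executor for FIFO, a two-thread pool for parallel); the class and method names are hypothetical and none of this is Zeppelin code. Two stand-in "queries" wait for each other at a barrier, which can only succeed if both are running at once.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.Callable;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Zeppelin.
class ParallelQueryCheck {

  // Returns true only if two submitted tasks were running at the same time.
  static boolean overlaps(ExecutorService executor) {
    CyclicBarrier barrier = new CyclicBarrier(2);
    // Each stand-in query waits up to one second for the other one to arrive.
    Callable<Boolean> query = () -> {
      try {
        barrier.await(1, TimeUnit.SECONDS);
        return true;
      } catch (TimeoutException | BrokenBarrierException e) {
        return false; // the other query never ran alongside this one
      }
    };
    try {
      Future<Boolean> first = executor.submit(query);
      Future<Boolean> second = executor.submit(query);
      return first.get() && second.get();
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("parallel overlaps: " + overlaps(Executors.newFixedThreadPool(2)));
    System.out.println("fifo overlaps: " + overlaps(Executors.newSingleThreadExecutor()));
  }
}
```

With the single-thread executor the first task times out at the barrier before the second one starts, so the check reports no overlap; with two threads both tasks meet at the barrier.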

Screenshots (if appropriate)

Questions:

  • Do the license files need an update? No
  • Are there breaking changes for older versions? No
  • Does this need documentation? No

@jongyoul jongyoul closed this Jun 13, 2016
@jongyoul jongyoul reopened this Jun 13, 2016
@prabhjyotsingh
Contributor

@jongyoul Thank you for taking care of this. I agree this should be ParallelScheduler. 👍
LGTM.

@Leemoonsoo
Member

Leemoonsoo commented Jun 13, 2016

How about making it configurable, with the parallel scheduler as the default?

Some users might want to run queries in parallel, while others might want to avoid running queries in parallel.
For example, the scheduler of the SparkSql interpreter is configurable through 'zeppelin.spark.concurrentSQL'.

@jongyoul
Member Author

@Leemoonsoo I also agree with making this configurable. I'll follow up on this with another PR. I also think we need to replace the use of getScheduler with another approach.

@bzz
Member

bzz commented Jun 14, 2016

Looks great, thank you for the prompt update!

@jongyoul is there a reason to make it configurable in another PR instead of this one? Just curious.

@jongyoul
Member Author

@bzz This is because it is related to zeppelin-server. AFAIK, getScheduler is used by zeppelin-server, not zeppelin-interpreter, so we cannot easily make it configurable. For now, if we make it configurable, we would have to restart zeppelin-daemon whenever we change the value. I don't think that's a good approach.

@Leemoonsoo
Member

I was thinking of just creating multiple interpreter settings if a user wants to use different schedulers. Then simply returning the appropriate scheduler from getScheduler() based on configuration would be enough, wouldn't it?
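As a rough illustration of that idea, here is a minimal sketch of a getScheduler() that picks its scheduler from the interpreter setting's properties. It is not the Zeppelin implementation: plain `java.util.concurrent` executors stand in for Zeppelin's scheduler classes, and the class name, the property names (`example.jdbc.concurrent.use`, `example.jdbc.concurrent.max`) and the default of 10 are assumptions made up for this example.

```java
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for an interpreter; not a Zeppelin class.
class ConfigurableSchedulerSketch {
  // Hypothetical property names, chosen for this example only.
  static final String CONCURRENT_USE = "example.jdbc.concurrent.use";
  static final String MAX_CONCURRENCY = "example.jdbc.concurrent.max";

  private final Properties settings;
  private ExecutorService scheduler;

  ConfigurableSchedulerSketch(Properties settings) {
    this.settings = settings;
  }

  // 1 means FIFO (one query at a time); larger values mean parallel execution.
  int maxConcurrency() {
    boolean parallel = Boolean.parseBoolean(settings.getProperty(CONCURRENT_USE, "true"));
    if (!parallel) {
      return 1;
    }
    return Integer.parseInt(settings.getProperty(MAX_CONCURRENCY, "10"));
  }

  // Created once per instance, so a changed property only takes effect on a new instance.
  synchronized ExecutorService getScheduler() {
    if (scheduler == null) {
      scheduler = Executors.newFixedThreadPool(maxConcurrency());
    }
    return scheduler;
  }

  public static void main(String[] args) {
    Properties hive = new Properties(); // defaults: parallel
    Properties mysql = new Properties();
    mysql.setProperty(CONCURRENT_USE, "false"); // this setting opts out of parallel runs

    ConfigurableSchedulerSketch hiveSetting = new ConfigurableSchedulerSketch(hive);
    ConfigurableSchedulerSketch mysqlSetting = new ConfigurableSchedulerSketch(mysql);
    System.out.println("hive max concurrency: " + hiveSetting.maxConcurrency());
    System.out.println("mysql max concurrency: " + mysqlSetting.maxConcurrency());
    hiveSetting.getScheduler().shutdown();
    mysqlSetting.getScheduler().shutdown();
  }
}
```

Each instance plays the role of one interpreter setting, so two settings with different properties end up with different schedulers without any change to the caller of getScheduler().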

@jongyoul
Member Author

@Leemoonsoo I missed something. My understanding was that it is not dynamically configurable. Let me check.

@Leemoonsoo
Member

Leemoonsoo commented Jun 14, 2016

@jongyoul Right, it's not dynamically configurable. The interpreter needs to be restarted to be reconfigured.

I think it's possibly related to ZEPPELIN-999 and its long-term plan. Let's say a user wants to use the jdbc interpreter for hive and mysql. Currently:

A. The user can create a single interpreter setting with two configurations, one for the hive connection and one for the mysql connection, and select the connection via %jdbc(hive) or %jdbc(mysql).
B. Or the user can create two interpreter settings, one configured for the hive connection and one for the mysql connection. In this case, selecting the jdbc connection is limited and annoying, because a notebook cannot use two or more interpreter settings of the same type at the same time. So the user has to constantly bind/unbind interpreter settings to switch hive <-> mysql.
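To make approach A concrete, here is a minimal sketch of how a paragraph prefix such as %jdbc(hive) could be mapped to a connection name. It is an illustration only, not the JdbcInterpreter code: the class name, the regular expression and the fallback name "default" are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Zeppelin.
class ConnectionPrefixSketch {
  // Matches a leading "%jdbc(name)" and captures the name.
  private static final Pattern PREFIX = Pattern.compile("^\\s*%jdbc\\((\\w+)\\)");

  // Returns the connection name from the prefix, or "default" (an assumed fallback).
  static String connectionKey(String paragraph) {
    Matcher matcher = PREFIX.matcher(paragraph);
    return matcher.find() ? matcher.group(1) : "default";
  }

  public static void main(String[] args) {
    System.out.println(connectionKey("%jdbc(hive) select 1"));
    System.out.println(connectionKey("%jdbc(mysql)\nshow tables"));
    System.out.println(connectionKey("%jdbc select 1"));
  }
}
```

With one interpreter setting holding both connections, the scheduler choice applies to the setting as a whole, which is why approach B (one connection per setting) fits per-connection scheduler configuration more naturally.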

If our long-term plan is to generalize ZEPPELIN-999 and allow users to use an interpreter setting alias for interpreter selection, like %hive or %mysql, then all the disadvantages of approach B will be eliminated.

Managing only a single jdbc connection per interpreter setting, with ZEPPELIN-999 generalized, will bring advantages such as leveraging interpreter authorization. In that case, I was thinking that simply returning the appropriate scheduler from getScheduler() based on configuration would be enough.

@jongyoul
Member Author

@Leemoonsoo I fully understand your idea and agree with you. I'll patch it and push again.

@jongyoul
Member Author

Merging it into master and branch-0.6

@asfgit asfgit closed this in 5a4aace Jun 20, 2016
asfgit pushed a commit that referenced this pull request Jun 20, 2016
…execution

### What is this PR for?
Changes the scheduler in JdbcInterpreter from FIFO to parallel. Parallel scheduling is the default behaviour of HiveInterpreter, so when we merge all JDBC-like interpreters into JDBC, the default behaviour of JdbcInterpreter needs to change to match.

### What type of PR is it?
[Feature]

### Todos
* [x] - Changed scheduler

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-995

### How should this be tested?
You can run multiple queries simultaneously.

### Screenshots (if appropriate)

### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No

Author: Jongyoul Lee <[email protected]>

Closes #1005 from jongyoul/ZEPPELIN-995 and squashes the following commits:

af360fa [Jongyoul Lee] Added option to choose which scheduler we use
3bda988 [Jongyoul Lee] Changed scheduler from FIFO to Parallels in JdbcInterpreter

(cherry picked from commit 5a4aace)
Signed-off-by: Jongyoul Lee <[email protected]>