Skip to content

Conversation

@zjffdu
Copy link
Contributor

@zjffdu zjffdu commented Nov 4, 2016

What is this PR for?

Not sure why we put interpreter open in jobRun, that means the thrift service method open is never used. I think open method in remote interpreter process should be called when RemoteInterpreter call open method in client side.

What type of PR is it?

[Improvement]

Todos

  • - Task

What is the Jira issue?

How should this be tested?

Outline the steps to test the PR here.

Screenshots (if appropriate)

Questions:

  • Does the licenses files need update? No
  • Is there breaking changes for older versions? No
  • Does this needs documentation? No

@zjffdu
Copy link
Contributor Author

zjffdu commented Nov 4, 2016

@Leemoonsoo Let me know if this is correct, or I may miss something.

@Leemoonsoo
Copy link
Member

Leemoonsoo commented Nov 4, 2016

Interpreter open in jobRun introduced in #1470 by @agoodm, to enable #1534 and will not work with this patch applied.

@agoodm @zjffdu what do you think? Shell we keep open in jobRun or find other way to do?

@agoodm
Copy link
Member

agoodm commented Nov 4, 2016

@Leemoonsoo @zjffdu I placed the extra call to open() in jobRun because I wanted to have a convenient place to make calls to registerHook(). We can't do this in the Interpreter constructor since the InterpreterGroup (which stores the hook registry) will not necessarily be set yet. The only other way I can think of doing this without another rewrite of the hook system would be to instead override the setInterpreterGroup() method to call registerHook() after setting the InterpreterGroup. Not very elegant, but it will probably get the job done (pun not intended). Actually, it might be the better approach to what we are doing now since we would no longer have to worry about checking for null pointers.

That being said, could you explain why this change is in itself necessary? IIRC, the client.open() call you have added to RemoteInterpreter.java was not there before any of the changes I made.

@zjffdu
Copy link
Contributor Author

zjffdu commented Nov 4, 2016

Thanks @agoodm for the explanation. Actually this PR would call open method in RemoteInterpreterServer.open rather than in constructor, 2 reasons

  • In semantics perspective, it should be called in RemoteInterpreterServer.open rather than in jobRun, and we don't need to check whether interperer is open in jobRun every time, open should be triggered by client rather than server.
  • For another PR, I'd like to pass extra info through open method from client to server.

IIRC, you just require InterpreterGroup been setup before registerHook, right ? Then I think this PR would not affect that. After the PR, the sequences will be

  • createInterpreter (InterpreterGroup will also be created here)
  • open interpreter
  • interpret paragraph (jobRun)

@agoodm
Copy link
Member

agoodm commented Nov 4, 2016

Well to be precise, the other reason I made the call to open() was to make sure the hook registry was in effect before the first call to interpret() from within jobRun(), and therefore, the first attempt at running a paragraph with that interpreter. Before I made that change, open() was not being called until after interpret(). I think it should be fine based on what you said and if it doesn't work I am fairly sure the other solution I just mentioned would be a good alternative. I can test it on my PR branch tomorrow.

@zjffdu
Copy link
Contributor Author

zjffdu commented Nov 4, 2016

open() will be called before the first call to interpreter(), so it won't affect. Will wait for your test result tomorrow.

@agoodm
Copy link
Member

agoodm commented Nov 4, 2016

@zjffdu Please see the most recent comments @Leemoonsoo and I made in #1534. With the vanilla python interpreter, I can indeed confirm that the call to open() is being made at the right time. However, we now can't execute %pyspark paragraphs without running %spark paragraphs first. If we do so, we get this stacktrace:

org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_open(RemoteInterpreterService.java:217)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.open(RemoteInterpreterService.java:203)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:230)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:387)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:105)
    at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:313)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
    at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

In addition to what @Leemoonsoo had already shown from the log file.

@zjffdu
Copy link
Contributor Author

zjffdu commented Nov 7, 2016

@agoodm @Leemoonsoo I found the issue, this is due that to PySparkInterpreter depends on SparkInterpreter. So before opening PySparkInterpreter we should creating SparkInterpreter. But seems there's no mechanism to specify the dependency relationship between interpreters, so In RemoteInterpreter, we did a hack IMO, we would create all the interpreters of one session together even we may only use one interpreter. I update RemoteInterpreter to make it work although it is still a hack. In the long term, I think we can specify the dependency relationship between interpreters, so that we can control which interpreter to create/open first, and we don't need to create unnecessary interpreters, because for now IIUC, we would create all the interpreter in one session even we use just one of them.

@zjffdu
Copy link
Contributor Author

zjffdu commented Nov 7, 2016

Think about it more, I don't have strong preference on this PR. As name LazyOpenInterpreter seems to imply it would call open method when interpret method is called. My purpose is to try to pass extra info from client to server. I can also do it by changing signature of open() to open(InterpreterContext) although it is a little weird to pass InterpreterContext to open(), @Leemoonsoo What do you think ?

@Leemoonsoo
Copy link
Member

Leemoonsoo commented Nov 7, 2016

I think cleaner way to remove open() inside of jobRun() could be call open() inside of LazyOpenInterpreter.getHook() like LazyOpenInterpreter.interpret(), LazyOpenInterpreter.cancel() and other methods do.

Regarding create all interpreters once in the same session,
Like you already mentioned, it'll make unnecessary interpreter creation.
Some of interpreters in interpreter group may not work by many reasons (configuration, system env, external commands, etc).

I would say Interpreter.getInterpreterInTheSameSessionByClassName() is a sort of API that manages dependencies of Interpreter. The api returns LazyInterpreter of another interpreter in the same group. So, interpreter supposed to simply get reference to the other interpreter and use it without explicitly defining dependency.

@zjffdu
Copy link
Contributor Author

zjffdu commented Nov 23, 2016

@Leemoonsoo I think putting open() in jobRun or hook is the same. Because hooks are also executed in jobRun(). Actually my intention is to pass AuthenticationInfo to open(), so that livy can create this user's session.

@zjffdu
Copy link
Contributor Author

zjffdu commented Mar 10, 2017

close this

@zjffdu zjffdu closed this Mar 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants