ZEPPELIN-1423 - Allow users to specify pre/post-execute hooks for interpreters #1470

agoodm · 2016-09-29T00:52:04Z

What is this PR for?

See #1439. This is a second attempt based on prior feedback, particularly from @Leemoonsoo who advised that it would be better to contain all of the code inside of the Interpreter Process. Aside from this, the major changes since the previous PR are:

The terminology "hooks" is used instead of "callbacks". Since the information being passed to the interpreter isn't actually a "callback", strictly speaking, the more general term "hook" is probably a better choice. All variable / class names were updated correspondingly.
The registry is now keyed by class name rather than the replName specified by the magic on top of the paragraph. The interface provided by ZeppelinContext allows for a replName to be converted to className through an internal mapping.
Two new event codes, PRE_EXEC_DEV and POST_EXEC_DEV have been added. This is primarily intended to separate use cases for the interpreter maintainers (via subclasses of Interpreter) and the notebook users (via the ZeppelinContext class), as otherwise user attempts at registering a hook would overwrite that set by the interpreter maintainer.
Global scope for hook registration is supported for the developer use cases.

What type of PR is it?

New Feature

What is the Jira issue?

ZEPPELIN-1423

How should this be tested?

In a new note, add the following lines of code to a paragraph:

%pyspark
z.registerHook("post_exec", "print 'This code should be executed before the paragraph code!'")
z.registerHook("pre_exec", "print 'This code should be executed after the paragraph code!'")

Then run any other paragraph in the note containing some other code, eg

%pyspark
print "This code should be entered into the paragraph by the user!"

The output should be:

This code should be executed before the paragraph code!
This code should be entered into the paragraph by the user!
This code should be executed after the paragraph code!

You should also test out the other two methods (getCallback() and unregisterCallback()) specified in ZeppelinContext.java.

One final caveat that should be mentioned: If there are errors in the code you specify for a pre-execute event, it will render the interpreter useless since the current implementation prepends the the code specified in pre_exec directly to the paragraph entered code before calling interpret(). The current workaround for this would be to either restart the interpreter group or call unregisterHook() via a different REPL within the interpreter group (eg, z.unregisterHook("pre_exec", "pyspark") from the spark interpreter). I would appreciate if anyone here would be willing to share any better approaches here.

Questions:

Does the licenses files need update? No
Is there breaking changes for older versions? No
Does this needs documentation? Yes

…preters

agoodm · 2016-09-29T00:55:23Z

@Leemoonsoo please review, thanks.

Leemoonsoo

@agoodm Thanks for the contribution. I've left some comments. Please take a look and let me know how you think.

Leemoonsoo · 2016-10-03T03:33:57Z

zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/Interpreter.java

+   * @param cmd The code to be executed by the interpreter on given event
+   */
+  @Experimental
+  public void registerHook(String noteId, String event, String cmd) {


InterpreterHookRegistry is directly passed to ZeppelinContext on it's constructor. And ZeppelinContext is exposed to user directly. So currently Interpreter.registerHook(),Interpreter.unregiserHook(),Interpreter.getHook() are accessed from only RemoteInterpreterServer. But RemoteInterpreterServer also can get directly access InterpreterHookRegistry from InterpreterGroup. In this way, i think the feature can be implemented without modification of Interpreter.java. What do you think?

I kind of see your point, moving the convenience methods from Interpreter.java to InterpreterGroup.java would save the need for defining wrappers in LazyOpenInterpreter.java. However this would also mean that the interpreter className must always be specified. This means for example that this line:

String cmdDev = interpreter.getHook(noteId, HookType.PRE_EXEC_DEV);

Would be changed to

String className = interpreter.getClassName(); String cmdDev = interpreter.getInterpreterGroup().getHook(noteId, className, HookType.PRE_EXEC_DEV);

Additionally while these methods are only currently called through RemoteInterpreterServer, I will have use cases for directly calling them in the Python Interpreter subclasses (eg, PySparkInterpreter and PythonInterpreter) for my upcoming matplotlib integration work. Given this, I felt that the current syntax would be clearer.

Leemoonsoo · 2016-10-03T03:38:09Z

zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/InterpreterContext.java

    return runners;
  }

+  public String getRequiredReplName() {


Since Zeppelin allows use alias to select interpreter (see, https://issues.apache.org/jira/browse/ZEPPELIN-1012), this method will not only return "spark", "pyspark", etc but also any arbitrary alias. So, this value can be the key not exists in interpreterClassMap.

I forgot about this, this definitely sounds like it would be a problem. I was struggling to figure out ways to automatically detect the current interpreter class through ZeppelinContext since I thought that would be rather convenient for users, but I couldn't come up with better ideas at the time.

I think this problem can be partially addressed by passing in the interpreter class name through the current InterpreterContext instance as that could solve the problem for automatically detecting the current interpreter that's in use. However this still wouldn't allow users to manually specify aliases for another interpreter that isn't keyed into the internal mapping. Let me know what you think of this solution and if you have any other ideas.

agoodm · 2016-10-03T06:47:36Z

@Leemoonsoo Thanks for the review, I have provided some responses.

agoodm · 2016-10-06T22:58:16Z

@Leemoonsoo I have addressed your second comment, let me know if this looks good to merge into master.

Leemoonsoo · 2016-10-15T22:55:08Z

Apologies for late response. Changes LGTM.
Last CI failure is irrelevant to this change.
Merge if there're no more discussion.

Thanks @agoodm for making step forward to ZEPPELIN-1344!

…erpreters ### What is this PR for? See apache#1439. This is a second attempt based on prior feedback, particularly from Leemoonsoo who advised that it would be better to contain all of the code inside of the Interpreter Process. Aside from this, the major changes since the previous PR are: - The terminology "hooks" is used instead of "callbacks". Since the information being passed to the interpreter isn't actually a "callback", strictly speaking, the more general term "hook" is probably a better choice. All variable / class names were updated correspondingly. - The registry is now keyed by class name rather than the `replName` specified by the magic on top of the paragraph. The interface provided by `ZeppelinContext` allows for a `replName` to be converted to `className` through an internal mapping. - Two new event codes, `PRE_EXEC_DEV` and `POST_EXEC_DEV` have been added. This is primarily intended to separate use cases for the interpreter maintainers (via subclasses of `Interpreter`) and the notebook users (via the `ZeppelinContext` class), as otherwise user attempts at registering a hook would overwrite that set by the interpreter maintainer. - Global scope for hook registration is supported for the developer use cases. ### What type of PR is it? New Feature ### What is the Jira issue? [ZEPPELIN-1423](https://issues.apache.org/jira/browse/ZEPPELIN-1423) ### How should this be tested? In a new note, add the following lines of code to a paragraph: ```python %pyspark z.registerHook("post_exec", "print 'This code should be executed before the paragraph code!'") z.registerHook("pre_exec", "print 'This code should be executed after the paragraph code!'") ``` Then run any other paragraph in the note containing some other code, eg ```python %pyspark print "This code should be entered into the paragraph by the user!" ``` The output should be: ``` This code should be executed before the paragraph code! This code should be entered into the paragraph by the user! This code should be executed after the paragraph code! ``` You should also test out the other two methods (`getCallback()` and `unregisterCallback()`) specified in `ZeppelinContext.java`. One final caveat that should be mentioned: If there are errors in the code you specify for a pre-execute event, it will render the interpreter useless since the current implementation prepends the the code specified in `pre_exec` directly to the paragraph entered code before calling `interpret()`. The current workaround for this would be to either restart the interpreter group or call `unregisterHook()` via a different REPL within the interpreter group (eg, `z.unregisterHook("pre_exec", "pyspark")` from the spark interpreter). I would appreciate if anyone here would be willing to share any better approaches here. ### Questions: * Does the licenses files need update? No * Is there breaking changes for older versions? No * Does this needs documentation? Yes Author: Alex Goodman <[email protected]> Closes apache#1470 from agoodm/ZEPPELIN-1423v2 and squashes the following commits: 56ede60 [Alex Goodman] Automatically detect default interpreter for registerHook() 044a99d [Alex Goodman] Ensure that registered hooks are applied after call to open() 1331fe1 [Alex Goodman] Update interpreters.md 07cac65 [Alex Goodman] Implemented user-defined hook registry system for spark/pyspark interpreters 8fad936 [Alex Goodman] Added Interpreter Hooks to Interpreter Process

…ter overview page ### What is this PR for? After #1470 merged, "(Experimental) Interpreter Execution Hooks" is added under https://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/manual/interpreters.html page. But i think we need to keep this page as simple as possible since it's explaining the basic concept of Zeppelin interpreters. So I separated "(Experimental) Interpreter Execution Hooks" section from `interpreters.md` and created new page `interpreterexechooks.md`. And also fixed some markdown rendering issues. ### What type of PR is it? Documentation ### What is the Jira issue? N/A ### How should this be tested? Please see below screenshots :) ### Screenshots (if appropriate) - Before (under [manual/interpreter.md](https://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/manual/interpreters.html#experimental-interpreter-execution-hooks) <img width="437" alt="screen shot 2016-12-15 at 5 05 49 pm]" src="https://cloud.githubusercontent.com/assets/10060731/21216362/a3bb89f0-c2e9-11e6-9678-8e6d8749229b.png"> - After <img width="300" alt="dropdown" src="https://cloud.githubusercontent.com/assets/10060731/21216570/dcca63f0-c2ea-11e6-90a9-969d363b423a.png"> <img width="437" alt="screen shot 2016-12-15 at 5 05 58 pm" src="https://cloud.githubusercontent.com/assets/10060731/21216363/a6c2d82e-c2e9-11e6-920c-a603e25e1699.png"> ### Questions: * Does the licenses files need update? no * Is there breaking changes for older versions? no * Does this needs documentation? no Author: AhyoungRyu <[email protected]> Closes #1768 from AhyoungRyu/separate-eventhook-section and squashes the following commits: ce19491 [AhyoungRyu] Separate 'interpreter exec hooks' from intp overview

agoodm added 3 commits September 28, 2016 12:27

Added Interpreter Hooks to Interpreter Process

8fad936

Implemented user-defined hook registry system for spark/pyspark inter…

07cac65

…preters

Update interpreters.md

1331fe1

Leemoonsoo reviewed Oct 3, 2016

View reviewed changes

agoodm added 2 commits October 4, 2016 17:49

Ensure that registered hooks are applied after call to open()

044a99d

Automatically detect default interpreter for registerHook()

56ede60

asfgit closed this in c67bd6d Oct 17, 2016

agoodm deleted the ZEPPELIN-1423v2 branch October 18, 2016 07:30

Leemoonsoo mentioned this pull request Nov 4, 2016

ZEPPELIN-1616. Interpreter open happens in jobRun #1596

Closed

1 task

AhyoungRyu mentioned this pull request Dec 15, 2016

[DOCS] Separate "interpreter exec hooks (experimental)" from interpreter overview page #1768

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ZEPPELIN-1423 - Allow users to specify pre/post-execute hooks for interpreters #1470

ZEPPELIN-1423 - Allow users to specify pre/post-execute hooks for interpreters #1470

Uh oh!

agoodm commented Sep 29, 2016 •

edited

Loading

Uh oh!

agoodm commented Sep 29, 2016

Uh oh!

Leemoonsoo left a comment

Uh oh!

Leemoonsoo Oct 3, 2016

Uh oh!

agoodm Oct 3, 2016

Uh oh!

Leemoonsoo Oct 3, 2016 •

edited

Loading

Uh oh!

agoodm Oct 3, 2016 •

edited

Loading

Uh oh!

agoodm commented Oct 3, 2016

Uh oh!

agoodm commented Oct 6, 2016

Uh oh!

Leemoonsoo commented Oct 15, 2016

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ZEPPELIN-1423 - Allow users to specify pre/post-execute hooks for interpreters #1470

ZEPPELIN-1423 - Allow users to specify pre/post-execute hooks for interpreters #1470

Uh oh!

Conversation

agoodm commented Sep 29, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What is this PR for?

What type of PR is it?

What is the Jira issue?

How should this be tested?

Questions:

Uh oh!

agoodm commented Sep 29, 2016

Uh oh!

Leemoonsoo left a comment

Choose a reason for hiding this comment

Uh oh!

Leemoonsoo Oct 3, 2016

Choose a reason for hiding this comment

Uh oh!

agoodm Oct 3, 2016

Choose a reason for hiding this comment

Uh oh!

Leemoonsoo Oct 3, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

agoodm Oct 3, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

agoodm commented Oct 3, 2016

Uh oh!

agoodm commented Oct 6, 2016

Uh oh!

Leemoonsoo commented Oct 15, 2016

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

agoodm commented Sep 29, 2016 •

edited

Loading

Leemoonsoo Oct 3, 2016 •

edited

Loading

agoodm Oct 3, 2016 •

edited

Loading