@zjffdu
Contributor

@zjffdu zjffdu commented Aug 28, 2017

What is this PR for?

I didn't intend to make such a large change at the beginning, but I found that many things are coupled together, so I had to. Here are several suggestions on how to review and read it.

  • I moved the interpreter package from zeppelin-zengine to zeppelin-interpreter; this is needed for the refactoring.
  • The overall change is the same as I described in the design doc. I would suggest reading the unit tests first. They are very readable and make it easy to understand what the code is doing now: InterpreterFactoryTest, InterpreterGroupTest, InterpreterSettingTest, InterpreterSettingManagerTest, RemoteInterpreterTest.
  • Removed the reference counting logic. Now the interpreter process is killed once all the sessions in the same interpreter group are closed. (I plan to add another kind of policy for interpreter process lifecycle control in ZEPPELIN-2197.)
  • The RemoteFunction I introduced reduces code duplication when we use RPC.
  • The changes in Job.java and RemoteScheduler fix a race condition. This bug causes the flaky test we often see in ZeppelinSparkClusterTest.pySparkTest.
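
To picture how a RemoteFunction-style wrapper can cut down RPC duplication, here is a minimal sketch. It is not the code in this PR: `FakeClient`, `RemoteProcessSketch`, `callRemoteFunction` and the in-memory client pool are hypothetical stand-ins chosen for illustration. The idea is that borrowing a client, handling errors and returning the client are written once, and each RPC becomes a one-line lambda.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for an RPC client; not Zeppelin's real class.
class FakeClient {
  String interpret(String code) { return "result of " + code; }
  int getProgress() { return 42; }
}

// One RPC call expressed as a function of the client (illustrative shape only).
interface RemoteFunction<T> {
  T call(FakeClient client) throws Exception;
}

class RemoteProcessSketch {
  // Assumed in-memory pool; a real implementation would manage connections.
  private final Deque<FakeClient> pool = new ArrayDeque<>();

  // Borrow a client, run the call, wrap failures, and always return the client.
  <T> T callRemoteFunction(RemoteFunction<T> func) {
    FakeClient client = pool.isEmpty() ? new FakeClient() : pool.pop();
    try {
      return func.call(client);
    } catch (Exception e) {
      throw new RuntimeException("Remote call failed", e);
    } finally {
      pool.push(client);
    }
  }

  int pooledClients() { return pool.size(); }

  public static void main(String[] args) {
    RemoteProcessSketch process = new RemoteProcessSketch();
    // Each RPC is a short lambda; the boilerplate lives in callRemoteFunction.
    String out = process.callRemoteFunction(c -> c.interpret("1 + 1"));
    Integer progress = process.callRemoteFunction(c -> c.getProgress());
    System.out.println(out + ", progress=" + progress);
  }
}
```

Without such a wrapper, every call site would repeat the same try/catch/finally around the client, which is the kind of duplication the bullet above refers to.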
What type of PR is it?

[Bug Fix | Refactoring]

Todos

  • Task

What is the Jira issue?

https://issues.apache.org/jira/browse/ZEPPELIN-2627
How should this be tested?

Unit test is added

Screenshots (if appropriate)

Questions:

Do the license files need an update? No
Are there breaking changes for older versions? No
Does this need documentation? No

@Leemoonsoo
Member

I can see that some test cases are removed or commented out, for example in zeppelin-zengine/src/test/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterTest.java and some other places.

Is this intended, or is it a work in progress?

@zjffdu zjffdu force-pushed the ZEPPELIN-2627-2 branch 9 times, most recently from 6601c80 to b6065d9 Compare August 30, 2017 05:27
tmpDir.mkdirs();
fileChanged = null;
numChanged = 0;
numChanged = new AtomicInteger(0);
Contributor Author

Fix flaky test
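
The diff above only shows `numChanged` becoming an `AtomicInteger`. Assuming the counter is updated from a different thread than the one asserting on it (the usual reason for this kind of change), a plain `int` can lose updates or be read stale. A minimal sketch of the safer pattern, with the hypothetical names `CounterSketch` and `countEvents`:

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterSketch {
  // Count events raised from several threads and return the final total.
  static int countEvents(int threads, int eventsPerThread) {
    AtomicInteger numChanged = new AtomicInteger(0);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int j = 0; j < eventsPerThread; j++) {
          numChanged.incrementAndGet(); // atomic read-modify-write
        }
      });
      workers[i].start();
    }
    for (Thread t : workers) {
      try {
        t.join(); // join also makes the workers' updates visible here
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting", e);
      }
    }
    return numChanged.get();
  }

  public static void main(String[] args) {
    System.out.println(countEvents(4, 10_000)); // prints 40000
  }
}
```

With a plain `int` and `numChanged++` in the worker threads, the total could come out lower than expected on some runs, which is how such a test turns flaky.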

INTERPRETER_SCRIPT, "nonexists", "fakeRepo", new HashMap<String, String>(),
10 * 1000, null, null,"fakeName");
assertFalse(rip.isRunning());
assertEquals(0, rip.referenceCount());
Contributor Author

@zjffdu zjffdu Aug 30, 2017

This kind of test is no longer needed because of the refactoring.

private DependencyResolver depResolver;

@Before
public void setUp() throws Exception {
Contributor Author

All the tests here are covered by InterpreterSettingTest & InterpreterSettingManagerTest.

@zjffdu
Contributor Author

zjffdu commented Aug 30, 2017

@Leemoonsoo I added more comments on the test cases. Some test cases are not needed any more.

@zjffdu
Contributor Author

zjffdu commented Aug 31, 2017

@Leemoonsoo Do you mind if I commit it first so that I can continue with other work?

@Leemoonsoo
Member

Let me test this branch a little more with some edge cases.

if (job.isAborted()) {
job.setStatus(Status.ABORT);
} else if (job.getException() != null) {
// logger.info("Job ABORT, " + job.getId());
Member

Shall we remove this or change it to logger.debug?

* @throws Exception
*/
@Test
public void testRestartInterpreterInScopedMode() throws Exception {
Member

@zjffdu Could you point to the new test case that covers this case?
InterpreterSettingManagerTest looks like it has a related test case, but the restart behavior in scoped mode seems to have changed. Try:

  1. Set an interpreter to scoped mode per note
  2. Create two notes and start a paragraph in each note
  3. Restart the interpreter from a note in the 'interpreter-binding' menu (not in the interpreter menu)

The interpreter is supposed to restart for that note only, but it restarts for both notes.
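
To make the expected versus observed behavior concrete, here is a small model of the steps above. It is only an illustration: `ScopedGroupSketch` and its methods are hypothetical and do not mirror Zeppelin's classes. Each note gets its own session, and a "generation" counter stands in for how many times that session has been restarted.

```java
import java.util.HashMap;
import java.util.Map;

class ScopedGroupSketch {
  // noteId -> restart count of that note's session (hypothetical model).
  private final Map<String, Integer> sessionGeneration = new HashMap<>();

  // Running a paragraph in a note opens a session for it if none exists.
  void open(String noteId) {
    sessionGeneration.putIfAbsent(noteId, 0);
  }

  // Expected behavior: only the session bound to this note is restarted.
  void restartForNote(String noteId) {
    sessionGeneration.computeIfPresent(noteId, (id, gen) -> gen + 1);
  }

  // Observed behavior: every session in the group is restarted.
  void restartAll() {
    sessionGeneration.replaceAll((id, gen) -> gen + 1);
  }

  int generation(String noteId) {
    return sessionGeneration.getOrDefault(noteId, -1);
  }

  public static void main(String[] args) {
    ScopedGroupSketch group = new ScopedGroupSketch();
    group.open("note1");
    group.open("note2");
    group.restartForNote("note1");
    // note2 should be untouched after restarting from note1.
    System.out.println(group.generation("note1") + " " + group.generation("note2"));
  }
}
```

A regression test for this case would assert that the second note's session is unchanged after restarting from the first note.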

Contributor Author

Member

Yes, those tests are supposed to ensure the restart behavior from the note page.
However, when I try this by hand, I get the different behavior explained above.

Could you take a look?

Contributor Author

@zjffdu zjffdu Sep 2, 2017

@Leemoonsoo I believe this is the current behavior of Zeppelin, not something caused by this PR. I'm not sure why we close all interpreters when the user is anonymous. See
https://github.com/apache/zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/InterpreterSettingManager.java#L971
If you are OK with it, I will change the behavior in this PR.

Member

You're right. I tested the master branch with Shiro configured and this branch without Shiro.
I'm also not sure why all interpreters are closed when the user is anonymous.

It's up to you whether to change the behavior in this PR, but maybe handle it in a separate PR for future reference?

* @throws Exception
*/
@Test
public void testRestartInterpreterInIsolatedMode() throws Exception {
Member

Just like testRestartInterpreterInScopedMode(), the restart behavior protected by this unit test has changed.
We'll need to bring this test case back.

Contributor Author

}

@Test
public void testMultiUser() throws IOException, RepositoryException {
Member

Do we have an equivalent test for this?

Contributor Author

}

@Test
public void registerCustomInterpreterRunner() throws IOException {
Member

Do we have an equivalent test for this?

Contributor Author

@Test
public void getEditorSetting() throws IOException, RepositoryException, SchedulerException {
Member

Do we have an equivalent test for this?

Contributor Author

@zjffdu zjffdu force-pushed the ZEPPELIN-2627-2 branch 2 times, most recently from 3a606ee to b837800 Compare September 1, 2017 11:35
@Leemoonsoo
Member

Thanks @zjffdu for the great work.
LGTM

@zjffdu
Contributor Author

zjffdu commented Sep 2, 2017

Thanks @Leemoonsoo for the review. I will merge it if there are no more comments, and I will make the change in a follow-up PR, as there is still work remaining for the refactoring.

@asfgit asfgit closed this in d6203c5 Sep 3, 2017
asfgit pushed a commit that referenced this pull request Oct 18, 2017
### What is this PR for?

Fixed the bug mentioned in #2554 (comment)

### What type of PR is it?
[Bug Fix]

### Todos
* [ ] - Task

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-2998

### How should this be tested?
* Unit test is added

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Jeff Zhang <zjffdu@apache.org>

Closes #2626 from zjffdu/ZEPPELIN-2998 and squashes the following commits:

cc11fb6 [Jeff Zhang] ZEPPELIN-2998. Fix bug in restarting interpreter in scoped mode
@prabhjyotsingh
Contributor

@zjffdu I just found an issue that is a side effect of this PR: I can no longer delete an interpreter.
What I mean is that if I delete the angular interpreter from the interpreter setting page and restart Zeppelin, it gets recreated.

@zjffdu
Contributor Author

zjffdu commented Nov 3, 2017

Hmm, I see. That's because every time I merge interpreter-setting.json into interpreter-setting.json. Would you mind creating a ticket? I will fix it.
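
A rough model of how a merge on every startup can bring a deleted interpreter back. This is a sketch under assumptions, not Zeppelin's actual loading code: `SettingMergeSketch`, `mergeOnStartup` and the string-valued maps are hypothetical, and the merge rule (add any default that is missing from the saved settings) is a guess at the mechanism described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SettingMergeSketch {
  // Assumed merge rule: keep saved entries, add every default that is missing.
  static Map<String, String> mergeOnStartup(Map<String, String> defaults,
                                            Map<String, String> saved) {
    Map<String, String> merged = new LinkedHashMap<>(saved);
    for (Map.Entry<String, String> e : defaults.entrySet()) {
      // A deleted interpreter looks the same as a newly added default here.
      merged.putIfAbsent(e.getKey(), e.getValue());
    }
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> defaults = new LinkedHashMap<>();
    defaults.put("angular", "default");
    defaults.put("spark", "default");

    // The user deleted "angular" and customized "spark" before restarting.
    Map<String, String> saved = new LinkedHashMap<>();
    saved.put("spark", "custom");

    // "angular" reappears because the merge cannot tell it was deleted.
    System.out.println(mergeOnStartup(defaults, saved).keySet()); // prints [spark, angular]
  }
}
```

Under this model, a fix would need some record of the deletion, so that the merge can distinguish "removed by the user" from "not seen before".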

@prabhjyotsingh
Contributor

Sure, here you go: ZEPPELIN-3029
