Skip to content

Conversation

@xunliu
Copy link
Member

@xunliu xunliu commented Jun 17, 2019

What is this PR for?

In cluster mode, The user creates, modifies, and deletes the note on any of the zeppelin servers.
All need to be notified to all the zeppelin servers in the cluster to synchronize the update of Notebook. Failure to do so will result in the user not being able to continue while switching to another server.

  1. Listen for note create/delete/update events
  2. Listen for note Notebook Authorization events
  3. Broadcast note update event

What type of PR is it?

[Feature]

What is the Jira issue?

How should this be tested?

Screenshots (if appropriate)

Sync note event

syncNote

Sync note Authorization

sync-auth

Questions:

  • Does the licenses files need update?
  • Is there breaking changes for older versions?
  • Does this needs documentation?

@xunliu

This comment has been minimized.

@xunliu
Copy link
Member Author

xunliu commented Jun 21, 2019

@zjffdu , @felixcheung , Please help me review code, Thanks!

Copy link
Member

@felixcheung felixcheung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can it have race condition?

  1. server #1 renames a note
  2. server #2 deletes the same note

private BiFunction<Address, byte[], byte[]> subscribeClusterEvent = (address, data) -> {
private BiFunction<Address, byte[], byte[]> subscribeClusterIntpEvent = (address, data) -> {
String message = new String(data);
if (LOGGER.isDebugEnabled()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is if (LOGGER.isDebugEnabled()) { needed? LOGGER.debug should be enough?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because need to synchronize the create, rename, delete, set read permissions, write permissions, execute permissions, so need to synchronize the more information, there will be a lot of synchronization information in the log file.
So need add if (LOGGER.isDebugEnabled())

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand. you are saying logging is useful for debugging, and there could be a lot of information.
isn't debug log will be filtered out if debug log is not enabled?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the development phase, the debug information is very useful for developers, but after the development is completed, the debug information will be very much. In the log log file, there is a lot of synchronization information, which affects the system administrator to check whether the system is normal.
So, I think these debug messages don't need to be printed to the log file after development. :-)

String jsonSet = message.get("set");
Gson gson = new Gson();
Set<String> set = gson.fromJson(jsonSet, new TypeToken<Set<String>>() {
}.getType());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be joined to the line before? formatting is a bit strange.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified to this?

Set<String> set = gson.fromJson(jsonSet, new TypeToken<Set<String>>() {}.getType());

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seem that way. new TypeToken<Set<String>>() {} (together) seems to be how it is used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I am going to change it.

@xunliu
Copy link
Member Author

xunliu commented Jun 22, 2019

Hi, @felixcheung , This is a good question. #3387 (review)

  1. server [ZEPPELIN-4] Sync the version of dependencies with Spark #1 renames a note
  2. server minor doc update for running on YARN #2 deletes the same note

If rename is effective, the deletion will fail.
If you delete the gentleman, then rename will fail.

@felixcheung
Copy link
Member

sure, that's assuming rename or delete is atomic? it could be in the middle of one and both operations failing and leading to data loss etc?

for example, it does look like if there are two move or rename concurrently there is a chance the note will be duplicated

https://github.com/apache/zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java#L210

server1

    NoteNode noteNode = getNoteNode(notePath);

server2

    NoteNode noteNode = getNoteNode(notePath);

then both execute these concurrently

    noteNode.getParent().removeNote(getNoteName(notePath));  // not sure removeNote will fail
    noteNode.setNotePath(newNotePath);
    String newParent = getFolderName(newNotePath);
    Folder newFolder = getOrCreateFolder(newParent);
    newFolder.addNoteNode(noteNode);

@xunliu
Copy link
Member Author

xunliu commented Jun 23, 2019

@felixcheung , Thank you for asking this question. :-)
I think that implementing atomicity in a distributed system is a big issue.
I think, after the distributed features of zeppelin go online, After zeppelin has more user groups, Think about it again. Are you saying ok?
Zeppelin's distributed clustering model is used by a large number of analysts in our company every day. There have been no cases where different analysts are working on the same note.
Because notes are generally not only shared with members of the group. There won't be many users working on the same note at the same time.

@xunliu
Copy link
Member Author

xunliu commented Jun 24, 2019

Will merge if no more comments

@asfgit asfgit closed this in 0845d35 Jun 24, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants