Update jupyter_ydoc and pycrdt_websocket dependencies #367

Open: wants to merge 7 commits into base: main
58 changes: 31 additions & 27 deletions projects/jupyter-server-ydoc/jupyter_server_ydoc/handlers.py
@@ -8,14 +8,14 @@
 import time
 import uuid
 from logging import Logger
-from typing import Any
+from typing import Any, Literal
 from uuid import uuid4

 from jupyter_server.auth import authorized
 from jupyter_server.base.handlers import APIHandler, JupyterHandler
 from jupyter_server.utils import ensure_async
 from jupyter_ydoc import ydocs as YDOCS
-from pycrdt import Doc, UndoManager, YMessageType, write_var_uint
+from pycrdt import Doc, UndoManager, write_var_uint
 from pycrdt_websocket.websocket_server import YRoom
 from pycrdt_websocket.ystore import BaseYStore
 from tornado import web
@@ -137,6 +137,10 @@ def exception_logger(exception: Exception, log: Logger) -> bool:
                     exception_handler=exception_logger,
                 )

+            if self._room_id == "JupyterLab:globalAwareness":
+                # Listen for the changes in GlobalAwareness to update users
+                self.room.awareness.observe(self._on_global_awareness_event)
+
             try:
                 await self._websocket_server.start_room(self.room)
             except Exception as e:
@@ -286,31 +290,6 @@ async def on_message(self, message):
         """
         message_type = message[0]

-        if message_type == YMessageType.AWARENESS:
-            # awareness
-            skip = False
-            changes = self.room.awareness.get_changes(message[1:])
-            added_users = changes["added"]
-            removed_users = changes["removed"]
-            for i, user in enumerate(added_users):
-                u = changes["states"][i]
-                if "user" in u:
-                    name = u["user"]["name"]
-                    self._websocket_server.connected_users[user] = name
-                    self.log.debug("Y user joined: %s", name)
-            for user in removed_users:
-                if user in self._websocket_server.connected_users:
-                    name = self._websocket_server.connected_users[user]
-                    del self._websocket_server.connected_users[user]
-                    self.log.debug("Y user left: %s", name)
-            # filter out message depending on changes
-            if skip:
-                self.log.debug(
-                    "Filtered out Y message of type: %s",
-                    YMessageType(message_type).name,
-                )
-                return skip
-
         if message_type == MessageType.CHAT:
             msg = message[2:].decode("utf-8")

@@ -405,6 +384,31 @@ async def _clean_room(self) -> None:
                 self._emit(LogLevel.INFO, "clean", "Loader deleted.")
             del self._room_locks[self._room_id]

+    def _on_global_awareness_event(
+        self, topic: Literal["change", "update"], changes: tuple[dict[str, Any], Any]
+    ) -> None:
+        """
+        Update the users when the global awareness changes.
+
+        Parameters:
+            topic (str): `"update"` or `"change"` (`"change"` is triggered only if the states are modified).
+            changes (tuple[dict[str, Any], Any]): The changes and the origin of the changes.
+        """
+        if topic != "change":
+            return
+        added_users = changes[0]["added"]
+        removed_users = changes[0]["removed"]
Collaborator

What about "updated" users?

Contributor Author

Actually, I wonder whether `connected_users` of the `_websocket_server` is used at all.
I copied that code from the previous message handler, but I don't know if we still need it.
If we keep it, we should indeed handle the "updated" users (removing the former name and adding the new one).

Contributor Author

> If we keep it, we should indeed handle the "updated" users (removing the former name and adding the new one).

Thinking about it again, we don't have the previous name: the changes contain only the client ids.
We should rebuild the full list from the global awareness whenever a user is updated, because we can't tell whether the user name has changed.
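
A minimal sketch of that rebuild, assuming the awareness states are a plain mapping from client id to a state dict whose optional `user` entry carries a `name` (the shape the handler in this PR reads). The helper name `rebuild_connected_users` and the sample states are hypothetical; nothing here is an existing pycrdt or pycrdt-websocket function.

```python
from typing import Any


def rebuild_connected_users(states: dict[int, dict[str, Any]]) -> dict[int, str]:
    """Return a fresh client id -> user name mapping from the awareness states."""
    users: dict[int, str] = {}
    for client_id, state in states.items():
        user = state.get("user")
        # Clients that have not published a user entry yet are skipped.
        if isinstance(user, dict) and "name" in user:
            users[client_id] = user["name"]
    return users


# Hypothetical states: client 2 has no user entry yet.
states = {1: {"user": {"name": "alice"}}, 2: {}, 3: {"user": {"name": "bob"}}}
connected_users = rebuild_connected_users(states)
```

In the observer, this could replace the incremental bookkeeping whenever the change reports "updated" clients, at the cost of walking all states on each such change.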

Collaborator

Right, the goal of this code was to check in the backend that the awareness information coming from the frontend is correct. For instance, we don't want a student to take the user name of the teacher. That's why there is this `skip` variable, which was supposed to filter out an awareness message but was never actually put to use.
How can we filter out a message with your changes, though?

Contributor Author

I don't know if we can filter out a message with the information we currently have.
For example, when you reload the page, a new client id is added with the same user information. The old client id is only removed in a second step (probably once no more updates are received from it?).
For a period of time, the same user is therefore duplicated under 2 different client IDs. If we allow this behavior, I don't know how we can filter out someone trying to cheat.

The following image shows some logs when a remote client reloads the page. When a change is received, the change and the current users in the awareness are printed.
[Image: Screenshot from 2024-10-15 16-10-45]
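
To make the duplication concrete, here is a small sketch with made-up client ids and user names (illustrative values, not taken from the logs above). It groups awareness states by user name; the helper `clients_by_user` is hypothetical.

```python
from collections import defaultdict
from typing import Any


def clients_by_user(states: dict[int, dict[str, Any]]) -> dict[str, list[int]]:
    """Group client ids by the user name found in each awareness state."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for client_id, state in states.items():
        name = state.get("user", {}).get("name")
        if name is not None:
            grouped[name].append(client_id)
    return dict(grouped)


# Right after a page reload: the old client id (101) is still present
# while the new one (202) has already been added for the same user.
after_reload = {
    101: {"user": {"name": "alice"}},
    202: {"user": {"name": "alice"}},
    303: {"user": {"name": "bob"}},
}
duplicates = {
    name: ids for name, ids in clients_by_user(after_reload).items() if len(ids) > 1
}
```

A legitimate reload and an impersonation attempt both appear as one name under two client ids, so the name alone cannot tell them apart.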

Collaborator

The backend is the authority for user identities, and the frontend can get its identity at /api/me, so we should be able to check that they match.
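
A rough sketch of such a check, assuming the handler already knows the authenticated user's `username` (the same identity the frontend would read from /api/me) and that the awareness state carries a `username` under `user`. Both the field name and the helper `awareness_matches_identity` are assumptions for illustration, not an existing API.

```python
from typing import Any


def awareness_matches_identity(state: dict[str, Any], authenticated_username: str) -> bool:
    """Return True unless the state claims a user other than the authenticated one."""
    user = state.get("user")
    if user is None:
        # Nothing claimed yet, so there is nothing to reject.
        return True
    return user.get("username") == authenticated_username


# Hypothetical states for a connection authenticated as "student1".
ok = awareness_matches_identity({"user": {"username": "student1"}}, "student1")
spoofed = awareness_matches_identity({"user": {"username": "teacher"}}, "student1")
```

Because the comparison is keyed on the connection's authenticated identity rather than on the client id, a reload that creates a second client id for the same user would still pass.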

Collaborator

I agree that it's out of scope for this PR, but I'm wondering whether we still have a way to discard an awareness update.

+        for user in added_users:
+            u = self.room.awareness.states[user]
+            if "user" in u:
+                name = u["user"]["name"]
+                self._websocket_server.connected_users[user] = name
+                self.log.debug("Y user joined: %s", name)
+        for user in removed_users:
+            if user in self._websocket_server.connected_users:
+                name = self._websocket_server.connected_users.pop(user)
+                self.log.debug("Y user left: %s", name)
+
     def check_origin(self, origin):
         """
         Check origin
2 changes: 1 addition & 1 deletion projects/jupyter-server-ydoc/jupyter_server_ydoc/rooms.py
@@ -41,7 +41,7 @@ def __init__(
         self._file_format: str = file_format
         self._file_type: str = file_type
         self._file: FileLoader = file
-        self._document = YDOCS.get(self._file_type, YFILE)(self.ydoc)
+        self._document = YDOCS.get(self._file_type, YFILE)(self.ydoc, self.awareness)
         self._document.path = self._file.path

         self._logger = logger
4 changes: 2 additions & 2 deletions projects/jupyter-server-ydoc/pyproject.toml
@@ -29,9 +29,9 @@ authors = [
 ]
 dependencies = [
     "jupyter_server>=2.11.1,<3.0.0",
-    "jupyter_ydoc>=2.0.0,<4.0.0",
+    "jupyter_ydoc>=2.1.2,<4.0.0",
     "pycrdt",
-    "pycrdt-websocket>=0.14.2,<0.15.0",
+    "pycrdt-websocket>=0.15.0,<0.16.0",
     "jupyter_events>=0.10.0",
     "jupyter_server_fileid>=0.7.0,<1",
     "jsonschema>=4.18.0"