-
Notifications
You must be signed in to change notification settings - Fork 5.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Jupyter Code Executor in v0.4 (alternative implementation) #4885
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! I agree, this feels like the better route to take. Simpler, more robust and offloads the complexity of processing the notebook to a library built for the task!
@jackgerrits Updated the code, I was missing some local changes that I had done in nbclient. But there is some memory leak somewhere. When I run my program, I can see memory grow extremely. After 40min it was multiple GB. I run 10 tasks in parallel, where each task is a dialog with up to 20 turns. While I work a lot with images, which could lead to spikes in individual dialogs, the memory consumption should not be increasing with time, but be rather constant or have (unpredictable) spikes. Some context:
This is my code |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4885 +/- ##
==========================================
+ Coverage 68.21% 68.44% +0.23%
==========================================
Files 158 160 +2
Lines 9960 10060 +100
==========================================
+ Hits 6794 6886 +92
- Misses 3166 3174 +8
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
@Leon0402 Can you show where your runtime is created? this might be due to the runtime is not removing references to created agents. To mitigate you might want to create new instances of runtime for each task. I think we should handle it in a separate PR. |
Haven't seen this in the main branch myself, would be curious to know what going on. I agree with Eric, sounds plausible |
Why are these changes needed?
The new v4 version has no support for jupter notebooks yet compared to the old version.
NOTE: This is an alternative implementation to #4795 which uses the
nbclient
library rather than implementing the protocol ourself. Whilenbclient
is probably not made for that use case, it works quite well with only small hacks. Additionally no explicit external jupyter gateway server is used anymore (which could be removed in the other implementation as well though).Compared to the other implementation it is much simpler and more importantly much more robust. I extensively tested both implementations in a real life scenario and the other one had problems with getting frequent timeouts and got stuck after running for a while. I debugged it quite a bit, but I had the feeling that providing a robust implementation is quite complicated and better left to experts in that domain (i.e. nblient).
Related issue number
Closes #4792
Checks