-
Notifications
You must be signed in to change notification settings - Fork 4.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: log retriever endpoint #2601
Conversation
Pull Request Validation ReportThis comment is automatically generated by Conventional PR Whitelist Report
Result Pull request does not satisfy any enabled whitelist criteria. Pull request will be validated. Validation Report
Result Pull request satisfies all enabled pull request rules. Last Modified at 09 Jul 24 04:02 UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this work.
My opinion is that the common use case is to follow the logs in "real time". The endpoint is not designed for that and I find it very hard to implement on client side and also not efficient. Pagination for real time logs isn't a common pattern.
Polling the server again and again results in adding overhead.
I think we plan to add this feature to the UI so it has to be lightweight as possible.
My suggestion is that we implement the endpoint as unbounded stream of text.
The client could decide to start from the beginning or tail to the end (getting only new logs).
Plus, in this way we get rid of:
- pagination handling (which is expensive in terms of cpu/mem - you're doing a lot of memory allocations)
- sessions management
Additionally, I think we should stream to file to not impact memory usage at all and make all the logs available for the client. (currently is limited to the buffer size)
return session_id | ||
|
||
def get_page(self, session_id: str, page: int) -> Dict[str, Any]: | ||
self.cleanup_sessions() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to setup a timer to run every x time to check for sessions to cleanup. get_page could not be called for a while
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
session and pagination is removed
if session_id not in self.sessions: | ||
return {"error": "Invalid or expired session ID"} | ||
|
||
session = self.sessions[session_id] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need a lock for accessing the sessions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed session
total_logs = len(logs) | ||
total_pages = ceil(total_logs / self.page_size) | ||
|
||
if page < 1 or page > total_pages: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does page start from 1 or 0 ?
if the page is not valid, we should return error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pagination removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* log retriever endpoint * disabled by default * realtime log stream via http/2 SSE * read and write lock * unit test (cherry picked from commit 81849d5)
Implemented a feature to allow all LangFlow logs to be retrieved through an API
/logs
/logs
endpoint with pagination support/logs
endpoint to support the last number linesLimitation, this endpoint does not filter out logs based on flow, nor user, therefore it's disabled by default. Once it's enabled the logs can be retrieved by any user.