-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Resize file in screen toolkit #81
Conversation
@zakiali any thought on a ride for this PR: make it screenshot all screens? |
0d5fa21
to
1804f72
Compare
Added options to take screenshots of diff screens instead of the main and it's a pretty easy extension to do all screens (pass in more file paths to the main command). I left it out though because it might be too much and maybe an extended toolkit that does that for power use cases? |
the parameter thing may be different, but I also saw this (thought I could try 2 screens): G❯ can you take a look at both my screens and tell me what you see
screencapture -x -D 1 /tmp/goose_screenshot_4cc6811760b04cc7984317fef33c02b2.png
screencapture -x -D 2 /tmp/goose_screenshot_aa42b9aabaf94fbb814259f669c2a5c8.png
Traceback (most recent call last):
File "/Users/micn/Documents/code/domestic-goose/.venv/lib/python3.12/site-packages/exchange/providers/utils.py", line 30, in raise_for_status
response.raise_for_status()
File "/Users/micn/Documents/code/domestic-goose/.venv/lib/python3.12/site-packages/httpx/_models.py", line 763, in raise_for_status
raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '400 Bad Request' for url 'https://api.openai.com/v1/chat/completions'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/400 it may be stuffing 2 images at once to openai - if you like I can take a look and try to push some fixes to this branch? |
src/goose/toolkit/screen.py
Outdated
""" | ||
Take a screenshot to assist the user in debugging or designing an app. They may need help with moving a screen element, or interacting in some way where you could do with seeing the screen. | ||
Optionally, resize the image using the `sips` cli tool. You may do this fs there is potential for error in the payload size. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think at the moment, the error around the payload size is not surfaced to the model but instead exits the loop? we'll need an update in exchange i think for it to see the need to do this.
We might instead consider always resizing to 728x on the low side, given the openai approach. We can also switch from png to jpg.
detail: high images are first scaled to fit within a 2048 x 2048 square, maintaining their aspect ratio. Then, they are scaled such that the shortest side of the image is 768px long.
That'd make for maybe 200kb uploads, so always well below the limit without needing the model to consider when it needs to resize
230ec67
to
77b2bf2
Compare
Co-authored-by: Bradley Axen <[email protected]>
Updating the screen toolkit to have the capability to resize the image.
This is a solution for errors in payload size when sending images, for example. We've been seeing this in some providers, but not all. The limits in OpenaAI for payload size is 20MB.
By tools default functionality doesn't change, only when prompted to resize the image if you run into errors.
edit: Additionally, wanted to note that the file size isn't the right number for the payload since the images are being base64 encoded, inflating the disk size by 37%.