-
-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Async code in aiobotocore calls synchronous functions of boto3 #940
Comments
They do recommend keeping a single persistent session for long-running processes like servers. We did this a month ago and saw slightly better performance. Just my 2 cents |
I'm not following, you want to make the |
@dacevedo12 - thanks for the feedback !!! i totally agree with you and we're working to update our system to use single Session object, I have a feeling that the problem is not with the Session but rather with creating the resource so i'm not sure that having long-lived session will solve the problem. @thehesiod - that exactly what i initially thought, however it bothers me a bit that a nonbocking (async) function actually calls a blocking function and it blocks the entire ioloop. I'm just not sure if this is indeed the intended/designed behavior . I also admit that i tried to fix it but the code is complex , it's even hard for me to understand what's going on... |
@zferentz that's totally normal, you're not going to have all your code async, only that which makes blocking calls, or eventually makes blocking calls. |
@thehesiod that's my point - i think that the creating the session (or maybe it's the resource) makes a blocking call to boto3 which leads to a blocking I/O call (which shouldn't happen in an async call). Either way, appreciate your effort and feedback !! |
@zferentz I don't see any blocking io calls, can you point that out? |
For example, i see that the call The top level call is async however all the other calls are non async and since we call os.path - we are basically calling a blocking function that access the filesystem which is a blocking call. Now, since the |
@zferentz ya but there are no python built-in async functions for file io, we'd have to swap to using aiofiles or manually via threading which is non ideal. There really needs to be built-in python async file APIs for that to work as intended, otherwise you're spinning up a bunch of heavy-weight, GIL'd python threads. Further most people should be on cached SSDs which should make this rather trivial. If you're blocked on listdir, should inspect why the directory its listing has so many files. |
@thehesiod Thank you for the immediate feedback. First, I'm happy that i was not completely wrong and we do have an async code that calls blocking I/O functions. I wasnt 100% sure and i am happy that you see it as well. Now, indeed Python doesnt provide a builtin mechanism for async filesystem operations but i wonder if we can optimize it by caching this data (as we can assume that the list of libraries/modules wont change throughout the lifetime of the application). Either way,i do realize that there's no easy solution here. |
@zferentz it is cached, at the session/client level :), and people really shouldn't be re-creating sessions/clients often as there's no need. In terms of async code calling blocking io yes and no. file IO really should not be a problem on modern systems unless your filesystem is on a network, in which case probably caching should be controlled at the driver level, or, the code is doing something dumb (like I'd imagine that listfiles probably should be a glob with a filepattern instead, and if there are a bagillion files in the dir something is probably badly designed for that folder). Also caching is out of scope for aiobotocore as it's just async'ifying botocore. Caching belongs in the botocore project. Also globals=bad in general. Are you bringing this up due to a real world problem or theoretical? Without a real world scenario I can't give a good recommendation. |
@thehesiod Thanks again for your quick reply !! For example, in my example - one of our older libraries is creating many resources/sessions and for each one the loader more than 200 blocking system calls to the filesystem (which is indeed local, standard python ). The net result (on my system) is that the entire ioloop is blocked and i can see how other async operations (standard aiothttp calls) are waiting for the session-creation. I guess if no one else raised this topic then it means that everyone indeed using single session for the entire app (which i hope to do in the future) or just people are not aware of this issue. Either way, if you believe that there's no easy way to fix it - feel free to close this ticket . thanks a lot. |
@zferentz tbh I'm glad you reported this as we were also seeing some performance shenanigans but weren't really sure whether event loop skews due to blocking calls played a part in the problem. For instance, I sometimes see aiobotocore+aiohttp requests to dynamodb taking seconds (for context, that's outrageously slow for non-scan dynamodb operations), whereas cloudwatch always reports times under 100ms. We're currently keeping a global long-lived session but still instancing the resource on every operation |
FYI to track slow sync calls you can use something like this: https://gist.github.com/thehesiod/0d2beed27c800e4f7286be6c3cd17a09 That's what we have in production to ensure sync calls aren't blocking the main thread for too long |
@zferentz I think it would be great to speed up botocore, actually I proposed some changes to how they parse timestamp years ago that sped up our scenario 20% and they have yet to merge my change, so unfortunately getting them to change things can be glacial. However I suggest the following, given |
and in either case can use your monkeypatch to speed up your use case :). Actually if this really does take a significant amount of time I'd like to use this monkeypatch at our company as well, or perhaps we could ship an optional set of speedups in aiobotocore. Please do report your test and findings here! |
ok going to close as this isn't really an issue with aiobotocore as 1) the performance issue is really an underlying botocore issue and 2) unfortunately we don't have batteries included async file support yet :( feel free to re-open if I missed something. |
Describe the bug
While running some performance test (simple loop, code attached below) we believe that we found a problem where creating a session/resource in aioboto3 is calling an async function in aiobotocore which is calling a blocking function in boto3.
Code
The following code demonstrates the problem - the more sessions/resources we create - the more time it takes which shows that it's not real async.
pip results
Environment:
Additional info/context
We opened an issue with aioboto3 (terricain/aioboto3#254) but the assignee was right to ask us to bring it up here.
We believe that the block is happening because aiobotocore calls synchronous functions of boto3.
As an example, think of the coroutine named _create_client (implemented in aiobotocoro/session) which calls
self._get_internal_component('endpoint_resolver')
.which eventually hits the get_component method in botocore.session which
executes the deffered (sync) function as can be seen here:
We wonder if it's possible to assure the asynchronicity of both the Session and the resource creation to eliminate the blocking.
Our backend code creates many sessions/resources and we're trying to optimize its performance.
Thank you in advance !!
The text was updated successfully, but these errors were encountered: