-
-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Race condition when using [] operator on map when accessing a key that is known to already exist #98
Comments
Thanks for reporting, I'll see what I can do. |
I did some testing and came up with a solution that is working. Here is the modified function (I'm posting it here rather than a pull request, because the recursion isn't necessarily ideal. However, during a million insertions, the recursive call only happens about a dozen times due to the rate at which the map fills up after resizing, so it's a virtually negligible difference...).
Notes: The first call to increase size at the top of the function is now conditional on the max capacity being zero, rather than is_full being true. This ensures our map is properly initialized if empty, but will never be called if the [] operator is used and the key already exists (it's impossible for a key to exist and the map to be empty). So that prevents the initial race condition. And finally, when (and only when) we determine that the key doesn't exist, that's when we check if the map is full and increase its size if necessary - finally returning with a recursive call to the function in order to get proper bucket_idx/dist_and_fingerprint values after the allocation in order to insert the key. I tested pretty extensively in my own codebase and these two changes seem to have done the trick. |
Thanks, I came up with very similar code - if possible I'd like to get rid of the initial |
|
@ktprime No, at() calls find again internally, so that's worse than just using the iterator from find() (which I mentioned in my original post):
|
@proc-sim can you try if this version works for you? https://raw.githubusercontent.com/martinus/unordered_dense/2023-12-dont-resize-in-emplace-when-not-necessary/include/ankerl/unordered_dense.h |
@martinus Tested your update in my codebase - works great, no issues detected. Thanks for jumping on this so quick! |
I've already created a new release 4.2.0 with this new version. |
Summary:
The [] operator's invocation of do_try_emplace makes a potentially unnecessary call to increase_size if a map is full, even if the requested key already exists in the map, leading to race conditions when maps are accessed by multiple threads.
Info:
I'm doing a drop-in replacement of std::unordered_map with ankerl::unordered_dense and have been running into lots of crashes in my codebase, which can be narrowed down to the following code pattern, called from multiple threads on the same map:
This is safe code when using std::unordered_map. However, after examining ankerl::unordered_dense::map, the [] operator invokes do_try_emplace, which will attempt to increase the size of a map if the map is full (with a call to increase_size, given the is_full condition), prior to any checks to see if the key in question already exists. This leads to a race condition if multiple threads are using the [] operator on a map that is full - even if all keys requested have already been inserted into the map (meaning there's no reason the map should require reallocations, in principle).
I recognize using the iterator returned by find to access the value would be a solution, but it still seems like the problem is ultimately a bug. The existence of this problem means the [] operator can never be used in a thread-safe manner unless the map is guaranteed not to be full - even if no new keys are being inserted.
The text was updated successfully, but these errors were encountered: