Fix incorrect query caching for recycled tracked structs #708
CodSpeed Performance Report: Merging #708 will not alter performance.
At a high level, one thing I'm not understanding from the PR description is why this only affects queries with multiple arguments and implicitly interned structs. If a reused tracked struct ID is the sole input to a query, why doesn't that cause the same problem?
Results for single-argument queries are stored directly on the tracked struct's memo table, and the recycling code already unsets all associated query results. Multi-argument queries are different because they use interning, and the query then only depends on that interned value. The interned value's identity is determined by whether the value compares and hashes equal, which is the case when IDs are reused.
I pulled the changes for the bad hash/hash collision down into this PR to make the next PR a memory improvement only. I hope that helps clarify when/how
I applied the suggestion and extended the comment to also account for the other cases in which `created_at` needs updating.
This PR fixes a bug that occurs when a tracked struct is recycled and the query that takes it as an argument doesn't read any of its tracked fields.
Salsa maintains a free-list with tracked structs that were created in a previous revision but are now no longer used. The storage and IDs of those tracked structs can be reused in the next revision. This helps reduce overall memory usage and allows the recycling of IDs.
However, the reuse of IDs combined with coarse-grained dependencies as implemented today can lead to incorrect cache reuse if a tracked struct ID gets reused and a query takes multiple arguments, which requires interning. The problem is that the interned argument struct compares equal because the recycled tracked struct has the same ID, making salsa believe that reusing the cached data is fine (when it isn't).
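The failure mode described above can be modeled with a small, self-contained toy. This is only a sketch and not salsa's actual implementation: `ToyDb`, `TrackedId`, `QueryArgs`, `describe`, and `stale_read_demo` are all hypothetical names, and the memo table here validates a cached result by key equality alone.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a tracked struct ID (a slot index).
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct TrackedId(u32);

/// Hypothetical interned argument tuple for a two-argument query.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct QueryArgs {
    tracked: TrackedId,
    extra: u32,
}

/// Toy database: slot storage, a free list, and a memo table keyed by
/// the interned arguments. None of this mirrors salsa's real layout.
struct ToyDb {
    slots: Vec<&'static str>,
    free_list: Vec<TrackedId>,
    memos: HashMap<QueryArgs, String>,
}

impl ToyDb {
    fn new() -> Self {
        ToyDb { slots: Vec::new(), free_list: Vec::new(), memos: HashMap::new() }
    }

    /// Allocates a tracked struct, preferring a recycled slot.
    fn alloc(&mut self, name: &'static str) -> TrackedId {
        if let Some(id) = self.free_list.pop() {
            self.slots[id.0 as usize] = name;
            id
        } else {
            self.slots.push(name);
            TrackedId(self.slots.len() as u32 - 1)
        }
    }

    /// Returns the slot to the free list so its ID can be reused.
    fn free(&mut self, id: TrackedId) {
        self.free_list.push(id);
    }

    /// Memoized query that trusts key equality alone (the flawed check).
    fn describe(&mut self, args: QueryArgs) -> String {
        if let Some(hit) = self.memos.get(&args) {
            return hit.clone();
        }
        let value = format!("{}#{}", self.slots[args.tracked.0 as usize], args.extra);
        self.memos.insert(args, value.clone());
        value
    }
}

/// Runs the query, recycles the slot for a different struct, and runs it again.
fn stale_read_demo() -> (String, String) {
    let mut db = ToyDb::new();
    let a = db.alloc("first");
    let before = db.describe(QueryArgs { tracked: a, extra: 7 });
    db.free(a);
    let b = db.alloc("second"); // reuses the same ID as `a`
    let after = db.describe(QueryArgs { tracked: b, extra: 7 });
    (before, after)
}
```

In this toy, the second call hits the memo for the first struct because the two argument keys are equal, so it returns the value computed for `"first"` even though the slot now holds `"second"`.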
This PR fixes this by:
- Adding a `created_at` field to tracked structs that stores the revision when the tracked struct was first created. This field gets updated to the current revision when a tracked struct is recycled.
- Adding a `maybe_changed_after` implementation to `TrackedStruct` that returns `true` if the tracked struct was created after the query was last run. This suggests that this is no longer the same tracked struct.
- Calling `maybe_changed_after` when validating if the query has changed.

The new field is similar to the interned struct's `reset_at` field. It's also the same as what we have for tracked fields on tracked structs, where each field carries its own revision.

I added a new test and verified that it failed on master before implementing this fix.
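The check described above can be sketched in isolation. This is a minimal model under stated assumptions, not the code from this PR: `Revision`, `TrackedSlot`, `recycle`, and `recycle_demo` are hypothetical names, and only the `created_at` field and the `maybe_changed_after` name come from the description.

```rust
/// Hypothetical revision counter; a larger value means a later revision.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct Revision(u64);

/// Hypothetical per-slot metadata for a tracked struct.
struct TrackedSlot {
    /// Revision in which the struct currently occupying this slot was created.
    created_at: Revision,
}

impl TrackedSlot {
    fn new(current: Revision) -> Self {
        TrackedSlot { created_at: current }
    }

    /// Reusing the slot for a new struct bumps `created_at` to the current revision.
    fn recycle(&mut self, current: Revision) {
        self.created_at = current;
    }

    /// Returns `true` if the struct in this slot was created after
    /// `last_run`, the revision in which the dependent query last ran.
    fn maybe_changed_after(&self, last_run: Revision) -> bool {
        self.created_at > last_run
    }
}

/// Checks the slot before and after it is recycled in a later revision.
fn recycle_demo() -> (bool, bool) {
    let mut slot = TrackedSlot::new(Revision(1));
    let last_run = Revision(1); // the query ran in revision 1
    let before = slot.maybe_changed_after(last_run);
    slot.recycle(Revision(2)); // the ID is reused in revision 2
    let after = slot.maybe_changed_after(last_run);
    (before, after)
}
```

Before recycling, the check reports no change and the cached result may be reused. After recycling, `created_at` is later than the query's last run, so the query is treated as possibly changed and re-executed.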
Note:
Why not use `updated_at`?: `updated_at` gets bumped in every revision in which any field of the tracked struct was read. This is different from what we need here, where the main purpose is to identify when the ID was reused.