Bulk-invalidate e2e cached queries after claiming keys #16613
Conversation
Missed in #7436
The order of the keys must match the order of each values iterable, but collections have no ordering guarantees.
gold star please
for keys in key_tuples:
    txn.call_after(cache_func.invalidate, keys)
@DMRobertson postulated whether it would be better to replace:

for keys in key_tuples:
    txn.call_after(cache_func.invalidate, keys)

with:

def invalidate():
    for keys in key_tuples:
        cache_func.invalidate(keys)
txn.call_after(invalidate)
But I don't think it matters much, since txn.call_after just calls them in a loop anyway? Maybe one way would use less memory. 🤷
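The two styles above can be compared with a minimal sketch. The `Txn` and `Cache` classes below are hypothetical stand-ins (not Synapse's real `LoggingTransaction` or cache classes) that only reproduce the "queue callbacks, run them after commit" behaviour being discussed:

```python
# Hypothetical stand-ins to compare the two callback styles:
# N small callbacks vs. one callback that loops over all keys.

class Txn:
    def __init__(self):
        self._after = []

    def call_after(self, fn, *args):
        # Queue (fn, args) pairs; run them in a loop after "commit",
        # mirroring the behaviour described in the comment above.
        self._after.append((fn, args))

    def commit(self):
        for fn, args in self._after:
            fn(*args)


class Cache:
    def __init__(self):
        self.invalidated = []

    def invalidate(self, key):
        self.invalidated.append(key)


key_tuples = [("alice", "dev1"), ("bob", "dev2")]

# Style 1: queue one callback per key.
txn1, cache1 = Txn(), Cache()
for keys in key_tuples:
    txn1.call_after(cache1.invalidate, keys)
txn1.commit()

# Style 2: queue a single callback that loops over all keys.
txn2, cache2 = Txn(), Cache()

def invalidate():
    for keys in key_tuples:
        cache2.invalidate(keys)

txn2.call_after(invalidate)
txn2.commit()

# Both styles invalidate the same keys in the same order.
assert cache1.invalidated == cache2.invalidated == key_tuples
```

Style 1 queues N `(fn, args)` tuples while style 2 queues a single closure, which is where the (minor) memory difference would come from.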
Should we update the non-bulk versions of these to actually require tuples?
I was wondering if we should actually be requiring List[str], since the keys column is []text. It'd save a conversion from tuple to list here and there? 🤷
I think tuples are used because they're immutable, which is why they're what's used as cache keys?
ahh, that makes sense!
I'd be in favour of specifying Tuple[str, ...], but in another PR.
Do they have to be strings? I'd be kind of surprised if we don't have other types in there?
Anyway, yes sounds like a different PR.
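The immutability point above is concrete: Python dict (and therefore cache) keys must be hashable, which tuples are and lists are not. A minimal illustration (the key contents here are made up for the example):

```python
# Tuples are hashable and can serve as cache keys; lists cannot.
cache = {}
key = ("user_id", "device_id")  # hypothetical key shape
cache[key] = "cached-value"
assert cache[key] == "cached-value"

# A list with the same contents raises TypeError when used as a key.
unhashable = False
try:
    cache[["user_id", "device_id"]] = "nope"
except TypeError:
    unhashable = True
assert unhashable
```

This is why switching the key type to List[str] would require converting back to tuples at every cache boundary anyway.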
# TODO Can we call this for just the last position or somehow batch
# _add_persisted_position.
I think this is a question for @erikjohnston -- _add_persisted_position seems fairly heavy, and if we could optimize what we call it with (or have it take multiple IDs at once) I think that'd be a win.
I think we certainly could make it take multiple IDs? It does expect to be called with every position (though I don't think anything would break; it would just be less efficient).
I guess if it has to be called w/ every position maybe it isn't a huge deal...
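The "take multiple IDs at once" idea above can be sketched abstractly. This is a hypothetical shape, not Synapse's actual _add_persisted_position (the class and method names below are invented for illustration): the per-position helper still sees every position, but callers make one call instead of N:

```python
from typing import Iterable, List


class PositionTracker:
    """Toy stand-in for the stream-position bookkeeping under discussion."""

    def __init__(self) -> None:
        self._persisted: List[int] = []

    def _add_persisted_position(self, new_id: int) -> None:
        # Stand-in for the "fairly heavy" per-position work.
        self._persisted.append(new_id)

    def add_persisted_positions(self, new_ids: Iterable[int]) -> None:
        # Batch entry point: the helper is still called with every
        # position, so correctness is preserved, but per-call overhead
        # (and any shared setup) is paid once per batch.
        for new_id in new_ids:
            self._add_persisted_position(new_id)


tracker = PositionTracker()
tracker.add_persisted_positions([5, 6, 7])
assert tracker._persisted == [5, 6, 7]
```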
Co-authored-by: Patrick Cloke <[email protected]>
@@ -671,14 +671,50 @@ def get_next_txn(self, txn: LoggingTransaction) -> int:

        return self._return_factor * next_id

    def _mark_id_as_finished(self, next_id: int) -> None:
        """The ID has finished being processed so we should advance the
    def get_next_mult_txn(self, txn: LoggingTransaction, n: int) -> List[int]:
I wonder if get_next_txn should call this instead of duplicating some of the logic? (The same goes for the cache & stream functions above, TBH)
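The suggested de-duplication would look roughly like the following. This is a hypothetical sketch, not Synapse's actual ID generator (the allocation logic here is a toy counter), showing only the delegation shape: the single-ID method becomes the n=1 case of the bulk method:

```python
from typing import Any, List


class IdGenerator:
    """Toy ID generator illustrating get_next_txn delegating to the
    bulk get_next_mult_txn instead of duplicating allocation logic."""

    def __init__(self) -> None:
        self._current = 0
        self._return_factor = 1

    def get_next_mult_txn(self, txn: Any, n: int) -> List[int]:
        # Allocate n consecutive IDs in one go.
        start = self._current + 1
        self._current += n
        return [self._return_factor * i for i in range(start, start + n)]

    def get_next_txn(self, txn: Any) -> int:
        # Single-ID case is just the bulk case with n=1.
        return self.get_next_mult_txn(txn, 1)[0]


gen = IdGenerator()
assert gen.get_next_txn(None) == 1
assert gen.get_next_mult_txn(None, 3) == [2, 3, 4]
```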
Looks sane from my side, without having looked too closely
As threatened in #16554 (comment)
History: /keys/claim is surprisingly slow #16554

I wrote a dirty test for the new bulk invalidation that I'm not particularly happy with. Completely untested otherwise.
Nominating Erik because a) streams and b) PC helped me write this.