Commit

Attempt to explain better why delImplIdx is the way it is. Maybe this can (nim-lang#15108)

avoid future implementation mischief. (Maybe not. Sometimes, general distrust
of theory leads people to distrust simple reasoning over timings from CPUs
trying as hard as possible to mask DRAM latency via pre-fetch.)
c-blake authored and mildred committed Jan 11, 2021
1 parent 899a7b6 commit 575a79b
Showing 1 changed file with 30 additions and 0 deletions.
lib/pure/collections/tableimpl.nim
@@ -78,6 +78,36 @@ template hasKeyOrPutImpl(enlarge) {.dirty.} =
    maybeRehashPutImpl(enlarge)
  else: result = true

# delImplIdx is KnuthV3 Algo6.4R adapted to i=i+1 (from i=i-1) which has come to
# be called "back shift delete". It shifts elements in the collision cluster of
# a victim backward to make things as-if the victim were never inserted in the
# first place. This is desirable to keep things "ageless" after many deletes.
# It is trickier than you might guess since initial probe (aka "home") locations
# of keys in a cluster may collide and since table addresses wrap around.
#
# A before-after diagram might look like ('.' means empty):
# slot: 0 1 2 3 4 5 6 7
# before(1)
# hash1: 6 7 . 3 . 5 5 6 ; Really hash() and msk
# data1: E F . A . B C D ; About to delete C @index 6
# after(2)
# hash2: 7 . . 3 . 5 6 6 ; Really hash() and msk
# data2: F . . A . B D E ; After deletion of C
#
# This lowers total search depth over the whole table from 1+1+2+2+3+3=12 to
# 1+1+1+2+2=7 (the standalone sketch after the diff reproduces these totals).
# Had the victim been B@5, C would need back shifting to slot 5. Total depth is
# always lowered by at least 1, e.g. victim A@3. This is all quite fast when
# empty slots are frequent (also needed to keep insert/miss searches fast) and
# hash() is either fast or avoided (via `.hcode`). It need not compare keys.
#
# delImplIdx realizes the above transformation, but only works for dense Linear
# Probing, nextTry(h)=h+1. This is not an important limitation since that's the
# fastest sequence on any CPU made since the 1980s. { Performance analysis often
# overweights "key cmp", neglecting cache behavior, and so gives bad ideas about
# how big/slow tables behave (when perf matters most!). Comparing hcode first
# means usually only 1 key cmp is needed for *any* probe sequence. Timing only
# predictable activity, small tables, and/or integer keys often perpetuates such
# bad ideas (see the probe-loop sketch below). }

template delImplIdx(t, i, makeEmpty, cellEmpty, cellHash) =
  let msk = maxHash(t)
  if i >= 0:
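
The diagram and the depth arithmetic in the comment above are easy to check
mechanically. The following standalone Nim sketch is not the stdlib's
delImplIdx; the names (home, data, depth, inCyclicRange, backShiftDelete) and
the use of -1 for an empty slot are inventions for illustration only. It
applies the same back-shift rule with nextTry(h)=h+1 to the 8-slot example and
prints the before/after tables with their total search depths.

# Toy model of the 8-slot example above: home[i] stands in for hash() and msk;
# -1 marks an empty slot ('.'). Assumes the table always has empty slots.
const N = 8
const msk = N - 1

var home = [6, 7, -1, 3, -1, 5, 5, 6]                 # the hash1 row
var data = ["E", "F", ".", "A", ".", "B", "C", "D"]   # the data1 row

proc depth(i: int): int =
  ## 1-based probe count to reach slot i from its home slot via nextTry(h)=h+1.
  result = (i - home[i] + N) mod N + 1

proc totalDepth(): int =
  for i in 0 ..< N:
    if home[i] >= 0: result += depth(i)

proc inCyclicRange(r, i, j: int): bool =
  ## True if slot r lies cyclically in the half-open interval (i, j].
  if i < j: result = i < r and r <= j
  else: result = r > i or r <= j

proc backShiftDelete(victim: int) =
  ## Empty the victim slot, then walk its cluster forward, shifting each member
  ## back into the gap unless its home slot lies strictly after the gap (moving
  ## such a member would strand it before its home and make it unfindable).
  var i = victim
  home[i] = -1; data[i] = "."
  var j = (i + 1) and msk              # linear probing: nextTry(h) = h + 1
  while home[j] >= 0:                  # the first empty slot ends the cluster
    if not inCyclicRange(home[j], i, j):
      home[i] = home[j]; data[i] = data[j]
      home[j] = -1; data[j] = "."
      i = j                            # the gap has moved to j
    j = (j + 1) and msk

echo "before: ", data, "  depth: ", totalDepth()   # depth 1+1+2+2+3+3 = 12
backShiftDelete(6)                                 # delete C @index 6
echo "after:  ", data, "  depth: ", totalDepth()   # depth 1+1+1+2+2 = 7

The shifting test relies on probes visiting consecutive slots; with any other
probe sequence a moved element could become unreachable, which is the "only
works for dense Linear Probing" caveat in the comment.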

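The bracketed performance note presumes that lookups store each key's hash code
and compare it before the key itself. Here is a hedged sketch of such a probe
loop, not the stdlib's lookup routine: the Slot type, the find proc, and the
convention that an hcode of 0 marks an empty slot are assumptions made only for
this illustration.

import std/hashes

type Slot = object
  hcode: Hash          # cached hash(key); 0 means "empty" in this toy sketch
  key, val: string

proc find(slots: openArray[Slot], key: string): int =
  ## Linear-probe lookup that gates the possibly costly key comparison behind a
  ## cheap integer compare of cached hash codes; returns -1 on a miss.
  ## slots.len is assumed to be a power of two.
  let msk = slots.len - 1
  let hc = hash(key)
  var i = hc and msk                   # the "home" slot
  while slots[i].hcode != 0:           # an empty slot ends the cluster: a miss
    if slots[i].hcode == hc and slots[i].key == key:
      return i                         # usually the only full key comparison
    i = (i + 1) and msk                # nextTry(h) = h + 1
  result = -1

Because the cheap integer compare filters out nearly all non-matching cluster
members, the full key comparison usually runs once per lookup regardless of
probe-sequence length, which is the "only 1 key cmp" point in the comment.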