Strash index was always calculated at 0, leading to 100% collisions #17
ecordell wants to merge 2 commits into go-air:master
offset by 1 avoids the problem and distributes keys nicely
Eric, thanks much for your PR. Indeed the hash was colliding. However, the sudoku test isn't really affected, nor is any solving inside this project (or it would have been noticed earlier). logic.C is used for creating formulas before sending them to the solver. Since the solving time dominates on interesting problems, this went unnoticed. But, a small variation on your change: gives a huge speedup on constructing logic formulas. I added this bench: to test for it and got this: Would you like to submit something along these lines?
(I think this will help https://github.com/go-air/reach a lot)
scott-cotton left a comment:
Upon reflection, the lits are 32bit so overflow happens a lot on bigger problems, and where there is overflow the hash is distributed. But a huge improvement for the smaller problems.
scott-cotton left a comment:
sudoku_test.go, line 13 at r1 (raw file):
        for i := 0; i < b.N; i++ {
            Example_sudoku()
        }
    }
Please see alternate test in the discussion
logic/c.go, line 379 at r1 (raw file):
    func strashCode(a, b z.Lit) uint32 {
        return uint32((a << 13) * b)
    }
I think
    return uint32(^(a<<13) * b)
is preferable because the change is in one place, preserving the idea of the strashCode, and it uses all the capacity.
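To make the comparison concrete, here is a small Go sketch that counts how many table slots each variant reaches. `Lit` is a stand-in for gini's `z.Lit`, and `slotsUsed`, the capacity of 128, and the literal range are hypothetical choices for illustration; only the two hash expressions come from the review.

```go
package main

import "fmt"

// Lit is a stand-in for gini's z.Lit (assumed to be a 32-bit unsigned type).
type Lit uint32

// oldCode is the hash expression quoted from logic/c.go.
func oldCode(a, b Lit) uint32 { return uint32((a << 13) * b) }

// newCode is the variant proposed in the review: complement before multiplying.
func newCode(a, b Lit) uint32 { return uint32(^(a << 13) * b) }

// slotsUsed is a hypothetical helper. It counts the distinct slots hit
// when every pair a < b below maxLit is reduced modulo capacity.
func slotsUsed(hash func(a, b Lit) uint32, capacity uint32, maxLit Lit) int {
	seen := make(map[uint32]struct{})
	for a := Lit(2); a < maxLit; a++ {
		for b := a + 1; b < maxLit; b++ {
			seen[hash(a, b)%capacity] = struct{}{}
		}
	}
	return len(seen)
}

func main() {
	const capacity = 128 // illustrative power-of-two table size
	fmt.Println("original slots used:", slotsUsed(oldCode, capacity, 200))
	fmt.Println("proposed slots used:", slotsUsed(newCode, capacity, 200))
}
```

With a small power-of-two capacity the original expression only ever reaches one slot, while the complemented form reaches many, because the complement sets the low 13 bits instead of clearing them.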
scott-cotton left a comment:
well not quite "overflow", but more evenly distributed for larger problems.
Thanks to Evan Cordell who spotted a problem with the strash code colliding. Interesting, as it has a strong impact on the logic/c performance (but none on solving). See #17
Evan, Thanks a lot for the PR. In the interest of time, I went ahead and submitted the alternative in the review and cut a new release. (contributor agreement and not sure of your level of interest). Please feel free to follow up in any way.
Thanks @scott-cotton! Your reasoning makes sense
new version has logic.C performance improvements that should help see: - go-air/gini#17 - go-air/gini@3a1a4d9 - go-air/gini@8dd6805 - go-air/gini#18
new version has logic.C performance improvements that should help see: - go-air/gini#17 - go-air/gini@3a1a4d9 - go-air/gini@8dd6805 - go-air/gini#18 Signed-off-by: Evan <cordell.evan@gmail.com>
new version has logic.C performance improvements that should help see: - go-air/gini#17 - go-air/gini@3a1a4d9 - go-air/gini@8dd6805 - go-air/gini#18 Signed-off-by: Evan <cordell.evan@gmail.com> Upstream-repository: operator-lifecycle-manager Upstream-commit: c20784d3e2a372c2a6a03dbcfedf512ca84b1eca
This is how the structural hash in gini is computed:
gini/logic/c.go, line 379 in a1a5030
But when it’s used, it’s used “mod capacity” of the nodes, and the capacity is always a power of 2. The index into the strash array is always
a*2^13*b % 2^n
which is 0 for n<13:
gini/logic/c.go, lines 268 to 271 in a1a5030
gini/logic/c.go, lines 285 to 287 in a1a5030
and leads to 100% collisions on the table: it will always try to store in index 0 and have to walk back through the node links.
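The arithmetic above can be checked with a short Go sketch. `Lit` is a stand-in for gini's `z.Lit`, and `slot` is a hypothetical helper rather than the code in `logic/c.go`; only the hash expression is taken from the source.

```go
package main

import "fmt"

// Lit is a stand-in for gini's z.Lit (assumed to be a 32-bit unsigned type).
type Lit uint32

// strashCode copies the hash expression quoted from logic/c.go.
func strashCode(a, b Lit) uint32 {
	return uint32((a << 13) * b)
}

// slot is a hypothetical helper: it reduces the hash modulo a
// power-of-two capacity 2^n, as described for the strash table.
func slot(a, b Lit, n uint) uint32 {
	return strashCode(a, b) % (uint32(1) << n)
}

func main() {
	// The shift clears the low 13 bits and the multiplication keeps them
	// clear, so reducing modulo a small power of two always yields 0.
	for _, n := range []uint{4, 10, 13} {
		fmt.Printf("n=%d: slot(3,5)=%d slot(7,9)=%d slot(100,250)=%d\n",
			n, slot(3, 5, n), slot(7, 9, n), slot(100, 250, n))
	}
}
```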
To fix this, I changed the index calculation to
strash % (cap - 1)
I added a small benchmark that uses the existing sudoku example as a sanity check:
which is a ~15% speedup.
(I'm sure there are much better ways to check this; it seems like the bench command would work)
It's also entirely possible I'm not understanding what is supposed to be happening, please enlighten me if that's the case!
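As a rough illustration of the offset-by-one index described above, the Go sketch below reduces the same hash by `cap` and by `cap - 1`. `Lit`, `indexPow2`, `indexOffset`, and the capacity of 1024 are assumptions made for the example, not names or values from gini.

```go
package main

import "fmt"

// Lit is a stand-in for gini's z.Lit (assumed to be a 32-bit unsigned type).
type Lit uint32

// strashCode copies the hash expression quoted from logic/c.go.
func strashCode(a, b Lit) uint32 { return uint32((a << 13) * b) }

// indexPow2 reduces modulo the capacity itself, the colliding behaviour.
func indexPow2(code, capacity uint32) uint32 { return code % capacity }

// indexOffset reduces modulo capacity-1, the offset-by-one idea in this PR.
func indexOffset(code, capacity uint32) uint32 { return code % (capacity - 1) }

func main() {
	const capacity = 1024 // illustrative power-of-two table size
	for _, p := range [][2]Lit{{3, 5}, {7, 9}, {100, 250}} {
		c := strashCode(p[0], p[1])
		fmt.Printf("(%d,%d): mod cap = %d, mod cap-1 = %d\n",
			p[0], p[1], indexPow2(c, capacity), indexOffset(c, capacity))
	}
}
```

Because `cap - 1` is odd, the factor of 2^13 no longer cancels the whole hash. One side effect of this form is that the last slot, index `cap - 1`, is never selected.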