std/hashes: hash(ref|ptr|pointer) + other improvements #17731

timotheecour · 2021-04-15T20:38:22Z

hash(ref) now supported
refs https://forum.nim-lang.org/t/7765#49360

> Yeah, I agree. It's a strange omission that ref doesn't have a default hash implementation

changes a behavior that was introduced all the way back in 405b860

function hashPtr(p: Pointer): THash;
begin
  result := ({@cast}THash(p)) shr 3; // skip the alignment
end;

and doesn't make sense to me (1 of the tests in this PR would fail with this rule)

make hash(pointer) use the same scrambling as hash(int) (i.e., it honors nimIntHash1), so that code expecting scrambled hashes also works with pointer (and ref|ptr) without a performance drop; all types that can be cast to int are now treated identically. Note that I still consider #13440 ("hashing collision: improve performance using linear/pseudorandom probing mix") the superior, more performant alternative (see benchmarks), but that's a separate issue that can be addressed in future work when I revisit it.
improve hash(closure): it was using hash(rawProc(x)) !& hash(rawEnv(x)), which violates the hashes API by not finalizing the result with !$ (i.e., it can give bad distributions); it now uses hash((rawProc(x), rawEnv(x))), which "does the right thing"
work around the issue mentioned in #17722 ("add std/syntaxes to hint syntax highlighters about string literals", see the comment there), which makes the rendering of std/hashes terrible because of asm """p=Data;"""
test ttables.nim in js + cpp (in addition to c)
refactor disableVm
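The two hashing points in the list above can be illustrated with a minimal sketch. It assumes a Nim version that includes this PR; the `Pair` type and the three `hash...` helper names are hypothetical and exist only for illustration. It contrasts mixing with `!&` alone against mixing followed by the `!$` finalizer, and shows that delegating to the tuple overload avoids having to remember the finalizer at all.

```nim
import std/hashes

type
  Pair = object        # hypothetical type, for illustration only
    a, b: int

proc hashUnfinalized(p: Pair): Hash =
  ## Mixes the fields with `!&` but never calls `!$`;
  ## this is the pattern the old hash(closure) used.
  hash(p.a) !& hash(p.b)

proc hashFinalized(p: Pair): Hash =
  ## Mixes with `!&`, then finalizes with `!$`.
  var h: Hash = 0
  h = h !& hash(p.a)
  h = h !& hash(p.b)
  result = !$h

proc hashViaTuple(p: Pair): Hash =
  ## Delegates to the tuple overload, which mixes and finalizes internally.
  hash((p.a, p.b))

when isMainModule:
  let p = Pair(a: 1, b: 2)
  echo hashUnfinalized(p)
  echo hashFinalized(p)
  echo hashViaTuple(p)

  # With this PR, a pointer is expected to hash like the integer
  # it casts to (C backend assumed).
  var x = 42
  let address = cast[pointer](addr x)
  echo hash(address) == hash(cast[int](address))
```

Going through the tuple overload keeps the finalization step in one place inside std/hashes, which is the design choice the PR makes for closures.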

future work

support VM for hash(ref|ptr|pointer)

arnetheduck · 2021-04-19T20:26:10Z

how does this not break all code that defines a hash(ref T) on their own?

timotheecour · 2021-04-19T20:38:41Z

important_packages was green, and you can overload hash(ref T) in your code as shown in runnableExamples in this PR.
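As a rough sketch of what a user-side overload can look like (this is not the runnableExamples code from the PR; the `Node` type is hypothetical), the following defines `hash` for one specific ref type and uses it as a table key. It hashes by address, which matches the default identity-based `==` on refs.

```nim
import std/[hashes, tables]

type
  Node = ref object    # hypothetical user type, for illustration only
    id: int

proc hash(n: Node): Hash =
  ## User-defined overload: hash by the object's address (identity),
  ## consistent with the default `==` on refs.
  hash(cast[pointer](n))

when isMainModule:
  let a = Node(id: 1)
  let b = Node(id: 1)  # same contents, but a different object
  var seen = initTable[Node, string]()
  seen[a] = "a"
  seen[b] = "b"
  echo seen.len        # number of distinct keys stored
  echo seen[a]
```

Because the overload is non-generic and names `Node` directly, it should be preferred over a generic `hash` for refs when both are visible, though which overload a given module actually sees still depends on its imports.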

Note that any change (including pure bugfixes) is a potential breaking change for someone under some circumstance, but this shouldn't be a reason to freeze compiler/stdlib improvements.

In future work (pending #12076 + another PR), you'll be able to selectively disable an overload

arnetheduck · 2021-04-19T20:54:30Z

> important_packages was green, and you can overload hash(ref T) in your code as shown in runnableExamples in this PR.

this is hardly a signal, ie the nim language has many nooks and crannies in which bugs hide, and symbol resolution is an unusually common one because of the lookup rules. We've been here before, all too often: in the best case, this breaks something at compile time; in the worst case, it will break access to hash tables depending on which imports have been made and on the mutual compatibility of the std lib declared hash function and the user-declared one.

> potential breaking change

this is why, instead of taking a flippant approach to breaking user code, it's worth asking oneself how a change in such a central part as hash tables might break code, or why it will not, and, if it potentially will break things, notifying users how they should work around the issue. It's also fine if you don't know or understand the language well enough to determine these mechanics, but in that case it's even more important to ask the question before pulling the trigger. We have a large codebase that depends on these features, and finding out about breaking changes after the fact has been a real problem for numerous releases and bugfix releases. If this were a fringe library, it wouldn't matter, but it isn't, really.

> selectively disable

this does not help previously working code. It also doesn't help with footguns such as slightly incompatible definitions of hash being used in different modules due to a lack of imports or, worse, the order of imports, both of which have presented problems in the past.

Araq · 2021-04-20T05:26:35Z

> this is hardly a signal, ie the nim language has many nooks and crannies in which bugs hide and symbol resolution is an unusually common one because the lookup rules

Yeah, it's the worst part of Nim IMHO. Time to bring up an RFC and attack the problem. We need concepts and we need to attach procs to types.

Araq · 2021-04-21T13:32:28Z

@timotheecour since the RFC that would sort out these problems has not even been written yet (sorry), this extension must be opt-in first.

timotheecour force-pushed the pr_hashes_ref branch 2 times, most recently from e1f99c7 to 9a715dc, April 15, 2021 20:56

timotheecour mentioned this pull request Apr 15, 2021

some typeclasses don't work with typed, macros.symBodyHash crashes with ptr, ref #17733

Closed

timotheecour added 3 commits April 15, 2021 17:04

std/hashes: hash(ref|ptr)

bd8b5cc

add tests

6357daa

fix tests

54b343c

timotheecour force-pushed the pr_hashes_ref branch from 9a715dc to 54b343c, April 16, 2021 00:07

improve tests

77becd7

timotheecour added the "TODO: followup needed" label (remove tag once fixed or tracked elsewhere) Apr 16, 2021

fix tests

39db6a3

timotheecour marked this pull request as ready for review April 16, 2021 07:45

Araq merged commit d19e431 into nim-lang:devel Apr 16, 2021

timotheecour deleted the pr_hashes_ref branch April 16, 2021 17:00

timotheecour mentioned this pull request Apr 19, 2021

changelog: document hash changes #17792

Merged

timotheecour mentioned this pull request Apr 26, 2021

add -d:nimLegacyNoHashRef for a transition period which avoids defining hash(ref) #17858

Merged

timotheecour mentioned this pull request May 25, 2021

hashes for refs should be an opt-in feature #18098

Merged

timotheecour mentioned this pull request Jul 10, 2021

WIP timotheecour/Nim#781

Closed

PMunch pushed a commit to PMunch/Nim that referenced this pull request Mar 28, 2022

std/hashes: hash(ref|ptr|pointer) + other improvements (nim-lang#17731)

c44d564
