new: typetraits.getTypeId #13305

timotheecour · 2020-01-31T11:41:55Z

new: typetraits.getTypeId: get a unique id (int) for a type (note: static, not dynamic), can be used for type equality, reliable type hashing (more reliable than via $T), + many other use cases that would be really hard without this

this is much simpler and more efficient than something like getTypeId from https://github.com/yglukhov/variant/blob/master/variant.nim which relies on stringification

cooldome · 2020-01-31T12:44:45Z

There is already signatureHash and symBodyHash in macros module to solve the same problem.

timotheecour · 2020-01-31T22:20:22Z

@cooldome

There is already signatureHash and symBodyHash in macros module to solve the same problem.

could you please elaborate, and show how signatureHash (or symBodyHash) can be used in place of getTypeid?
getTypeid returns a stable id (a simple int) indicating the type; not sure how I can do that with symBodyHash:

  import std/macros
  macro getTypeid2(T: typed): untyped =
    newLit signatureHash(T)

  template getTypeid3(T: typed): untyped =
    type T2 = T
    getTypeid2(T2)
  type MyInt = int
  echo getTypeid2(MyInt)
  echo getTypeid2(int)
  echo getTypeid2(int)
  # echo getTypeid2(type(1)) # Error: node is not a symbol
  echo getTypeid3(type(1))
  echo getTypeid3(type(1))
  echo getTypeid3(int)
  echo getTypeid3(int)

prints:

__9aprACdypoaazK8UMIsdo4Q
__GkpzIdbuOm8RKb9cCiLpthA
__GkpzIdbuOm8RKb9cCiLpthA
__3OdjBnrClJioDGW47qXstw
__6ZKf3oBRvMMUY1riPSCGcQ
__wO8y53sSYaadjVh1yHbxFg
__xOE39bc63vaZr1kKLr5If2g

furthermore, even if it did work with signatureHash, it's a lot more efficient as it simply returns what the compile already knows (the id field) instead of re-inventing a hash (which, alghough unlikely, could have collisions etc)

Araq · 2020-02-03T01:38:01Z

The type IDs are not stable though and will end up in some .compileTime cache if we expose them and this can produce terrible bugs once IC arrives. Whatever we expose must be as stable as possible. Git also exposes commit hashes, not simple int32 values. For good reasons.

timotheecour · 2020-02-03T04:46:37Z

I know, and I can improve the docs to make that clear, and maybe rename to unsafeGetTypeid or hide it behind an experimental flag. Note that there should still be a way to make sure compiler has stable id's even with IC anyways, eg:

We cache generic instantiations and need to ensure this caching works well with the incremental compilation feature.

Regardless, this feature solves real problems that are hard/impossible to solve without it.

Here are just 2 examples:

passing a callback to a proc that handles multiple types:

this defines a generic pretty that works with any type and can be customized by callers via ca callback function (simplified for brevity)

## prettys.nim:
  type Callback* = proc(result: var string, a: pointer, id: int): bool

  import std/typetraits

  proc pretty[T](result: var string, a: T, callback: Callback) =
    when T is object:
      result.add "("
      for k,v in fieldPairs(a):
        result.add $k & ": "
        pretty(result, v, callback)
        result.add ", "
      result.add ")"
    elif T is ref|ptr:
      if callback(result, cast[pointer](a), getTypeid(T)):
        discard
      elif a == nil:
        result.add "nil"
      else:
        pretty(result, a[], callback)
    else:
      result.add $a

  proc pretty*[T](a: T, callback: Callback): string = pretty(result, a, callback)

## main.nim:
  import prettys, std/typeid
  proc main()=
    type Foo = ref object
      x: int
    type Bar = object
      b1: Foo
      b2: string

    let f = Bar(b1: Foo(x: 12), b2: "abc")

    proc callback(ret: var string, a: pointer, id: int): bool =
      case id
      of Foo.getTypeid:
        ret.add $("custom:", cast[Foo](a).x)
        return true
      else:
        discard

    echo pretty(f, callback)

    proc callback2(ret: var string, a: pointer, id: int): bool =
      case id
      of Foo.getTypeid:
        ret.add $("custom2:", cast[Foo](a).x)
        return true
      else:
        discard

    echo pretty(f, callback2)

  main()

prints:

(b1: ("custom:", 12), b2: abc, )
(b1: ("custom2:", 12), b2: abc, )

I'm using getTypeId in my compiler patch to pretty-print (or generate html with dynamically expandable nodes) for things like PNode, PSym etc, while applying custom rules depending on the type, eg call typeToString for PType, so that it avoids showing things I don't care about, or that would cause infinite recursion, etc.

exportc callback that handles multiple types

# module a:
import ./types
use callbackFun(...)

# module types:
define types ...
proc callbackFun(a: pointer, id: int) {.importc.}
import std/typetraits
proc callbackFun*[T](a: T) = callbackFun(cast[pointer](a), getTypeid(T))

implementation of callbackFun: say, this carries a lot of dependencies that we don't want to add to all modules that use callbackFun, because it would cause cyclic imports etc

# module impl:
proc callbackFun(a: pointer, id: int) {.exportc.} =
  case id
  of getTypeid(Foo1): echo cast[Foo1](a)
  of getTypeid(Foo2): echo cast[Foo2](a)
  else: doAssert false, $id

I'm using this to customize the compiler using callbacks and it requires with very little changes to compiler sources, unlike other solutions which would require a lot of refactorings for this to work.

Varriount · 2020-02-03T09:50:33Z

Wouldn't a more natural implementation use typedescs and compare them?

timotheecour · 2020-02-03T18:59:16Z

Wouldn't a more natural implementation use typedescs and compare them?

I don't see how this could work, see the 2 examples I gave in #13305 (comment) (nor does it help with type hashing); you'd be forced to use a generic in order to use typedescs, which can't work in those use cases. If you disagree, please post an answer with more details (or working code, ideally)

Araq · 2020-02-04T10:46:49Z

Your use cases do not depend on the fact that a type ID is an int. I'm arguing it should be a string.

Varriount · 2020-02-05T21:31:02Z

What about making type descriptions values at runtime, as well as compile time? Transparently represent as a pointer (or string, or int or whatever)? After all, that's what getTypeID is doing anyway - retrieving a representation of a typedesc that can be used at runtime.

timotheecour · 2020-02-05T22:33:05Z

What about making type descriptions values at runtime

I don't understand; if it's available at compile time it's available at runtime too, both these works:

const c1 = getTypeid(type(12))
let c2 = getTypeid(type(12))
doAssert c1 == c2

Unless you'd talking about dynamic class id in the context of object inheritance, in which case that's out of scope for this PR.

Your use cases do not depend on the fact that a type ID is an int. I'm arguing it should be a string.

I've now changed it to string instead of int (simply by stringifying id); the id itself can be changed in future implementations, and is (obviously) not assumed stable across compiler releases

Araq · 2020-02-05T23:07:20Z

I've now changed it to string instead of int (simply by stringifying id);

What's wrong with using the type's hash value though?

timotheecour · 2020-02-05T23:35:12Z

What's wrong with using the type's hash value though?

more background on this: see http://cpptruths.blogspot.com/2018/11/non-colliding-efficient.html and https://stackoverflow.com/questions/40357924/c11-typeid-uniqueness; c++ has both type_info::hash_code and std::type_index (and others...); it's a complex issue but basically hash_code isn't good enough for guaranteed uniqueness:

type_info::hash_code. As hashes for two different types can collide, it is useless for determining type equality [...]

std::type_index or std::type_info are better for guaranteeing uniqueness; if we're also considering shared libraries maybe the picture is a bit different, but here i'm concerned about use cases given above and id fits well

is there an API for that? or at least a not-yet-exposed internal compiler API ? not aware of it
looking for this specifically:

type Foo = object
const id: string = typeHash(Foo)

the advantage of id is that the compiler already did that work for us, and in theory it should work:

  TType* {.acyclic.} = object of TIdObj # \
                              # types are identical iff they have the
                              # same id; there may be multiple copies of a type
                              # in memory!

note that's it's still "in theory", because that comment is a lie right now; which this PR over-comes to some extent (at least for all primitive types; not yet for things like tuple or seq); nothing that can't be fixed though IMO

Varriount · 2020-02-06T03:28:52Z

@timotheecour As far as I know, you currently cannot create a Table[typedesc, string] at runtime - you must either use getTypeId (this proposal) or get the name of the type.

What I'm trying to suggest is allowing the above by essentially calling getTypeId behind the scenes (or representing a typedesc as some other opaque value).

timotheecour · 2020-02-06T04:24:12Z

@Varriount that's because an instance of a typedesc is not a value (eg something with a fixed sizeof) so Table[typedesc, string] can't work.
However since #13064 you can do this:

import std/typetraits
type Foo = object
  x1: string
type t = (int, Foo)
var names = ["myint", "myfoo"]
echo ($t.get(0), names[0])

anyhow what you're describing is out of scope for this PR

Araq · 2020-02-06T10:06:50Z

Hash collisions are very rare and we can always change the implementation to compress(canonicalTypeRepresentationAsAString) later on. we can only do that though if the type ID is a string or a distinct string.

timotheecour · 2020-02-09T23:15:42Z

PTAL now that the pending PR got merged.

we can only do that though if the type ID is a string or a distinct string

it now is a string; implementation simply uses $id, but is un-specified and allowed to change in future PRs

compress(canonicalTypeRepresentationAsAString)

that doesn't seem robust, depending on what canonicalTypeRepresentationAsAString is; eg "Foo" could mean different types (bar.Foo or baz.Foo); furthermore it looks like a roundabout way to do it and less efficient than simply accessing id which the compiler already knows about

Araq · 2020-02-10T08:32:53Z

Well it has to be package.module.Foo but even that isn't correct since nowadays you can have multiple modules of the same name in the same package and nobody can ever accept useful restrictions. So now we need $relativePathWeHopeWeGotRight/Foo as the canonical name but that's a story for another day.

Varriount · 2020-02-10T10:00:48Z

Would it be better to make the string distinct?

timotheecour · 2020-02-10T20:00:19Z

done, PTAL

timotheecour · 2020-02-22T23:22:27Z

ping @Araq

Araq · 2020-02-24T10:30:16Z

it can, but that's really something that can be done independently of this PR for reasons already mentioned.

Yeah well, but the integer ID (even if exposed as a distinct string) is fragile, who will read its documentation and ensure they are not serialized? This might be one foundation for an entity-component-system. One where every recompile is a gamble -- a recipe for instabilities.

Araq · 2020-04-20T10:34:28Z

Can't we instead make signatureHash(getTypeImpl(t)) work? For me it's the same thing.

Varriount · 2020-04-20T19:21:16Z

Yeah well, but the integer ID (even if exposed as a distinct string) is fragile, who will read its documentation and ensure they are not serialized?

Yeah, that's one thing that would make me much more comfortable with this PR - disclaimers stating what API/ABI promises are being made with regards to the value being returned.

stale · 2021-04-20T23:06:34Z

This pull request has been automatically marked as stale because it has not had recent activity. If you think it is still a valid PR, please rebase it on the latest devel; otherwise it will be closed. Thank you for your contributions.

timotheecour · 2021-06-08T07:44:51Z

ping @Araq , I've fixed bitrot again. This PR is useful, solves real problems that are hard/impossible to do without this feature, as mention in description as well as illustrated in the tests and I'd like this feature to get merged.

I've made the return type a distinct string as was requested.

The doc comment in getTypeId clearly mentions that ids are not intended to be stable across re-compilation if sources change, at some point users just have to read the docs.

This feature is present in lots of languages, for good reasons; eg:

C++: std::type_index(typeid(A)), refs https://en.cppreference.com/w/cpp/types/type_index,
D: typeid(A), refs https://dlang.org/spec/expression.html#TypeidExpression
rust: fn type_id<T>() -> u64, refs https://doc.rust-lang.org/std/intrinsics/fn.type_id.html pub extern "rust-intrinsic" (and also https://doc.rust-lang.org/std/any/struct.TypeId.html#method.of)

and even these:

C#: typeof (returns a value of type Type)
java: getClass() (RT) and A.class (CT) https://docs.oracle.com/javase/tutorial/reflect/class/classNew.html
python3: type(str).__hash__(str)

signatureHash(getTypeImpl(t)) has its own issues (no uniqueness guarantee, however rare but more importantly it's not in sync with what the compiler uses to assess type equality, unlike this PR which uses id and is guaranteed to be in sync); The premise that it would offer stability or be superior because of it is questionable; comparing typeids across re-compilation after sources change seems error-prone no matter what (eg is the hash structural or nominal for objects/tuples; does it allow renames/reorderings; more crucially, is it consistent with compiler's notion of type equality).

In any case, if someone needs signatureHash(getTypeImpl(t)) later, it can always be added in future work.

disruptek · 2021-06-08T14:50:36Z

This makes pooling continuation allocators easier.

timotheecour · 2021-06-10T23:02:06Z

signatureHash(getTypeImpl(t)) work? For me it's the same thing.

@Araq this cannot work.

you'd have to solve the inverse problem of finding a hash that satisfies the equivalence relationship defined by sameTypeAux(x, y: PType, c: var TSameTypeClosure) (high likelyhood of being buggy and not representative of it, it's a complex inverse problem that'd have to duplicate a lot of logic)
you'd have to live with fact that collisions could happen (see Collisions in type_id rust-lang/rust#10389, type_id collisions in rust, unsolved problem since 2013)
you'd still have to rely on compiler id's eg for symbols (which are not stable across re-compilation + source changes), which defeats the whole purpose of ignoring type ids in the first place, e.g:

A the the 1st and 2nd block are structurally equivalent, so there's no way for a signature based hash to tell them apart unless you look at symbol id's

when true:
  import std/typetraits
  proc fn[T](a: T): auto =
    static: echo "compiling fn" # this will print twice, both A's are distinct types
  block:
    type A = object
    fn(A.default)
    echo A.getTypeId
  block:
    type A = object
    fn(A.default)
    echo A.getTypeId

with this PR, I get 2 distinct ids (754974726, 754974746), which is correct. With a signature based hash, you'd either get a collision or you'd have to look at the symbol id (defeating whole purpose of ignoring the typeid, as you'd still loose any guarantees wrt stability against unrelated changes after re-compilation).

just for kicks I tried to see what hashType(t) would give, it fails even for simple cases like this:

import std/typetraits
type Foo[T] = object
echo Foo[int].getTypeIdViaHashType # __SpGTocwyb9chaUYWzXQMGSg
echo Foo[float].getTypeIdViaHashType # __SpGTocwyb9chaUYWzXQMGSg

n0bra1n3r · 2021-10-14T05:11:08Z

This is useful for generating type IDs for an ECS framework for game engines, especially ones with hot code reloading. The best C++ solution I've seen seems quite hacky: https://skypjack.github.io/2020-03-14-ecs-baf-part-8/ (uses __PRETTY_FUNCTION__).

stale · 2022-10-14T06:22:06Z

This pull request has been automatically marked as stale because it has not had recent activity. If you think it is still a valid PR, please rebase it on the latest devel; otherwise it will be closed. Thank you for your contributions.

timotheecour changed the title ~~new: typetraits.getTypeId~~ [pending #13303] new: typetraits.getTypeId Jan 31, 2020

timotheecour mentioned this pull request Jan 31, 2020

variant.getTypeid clashes with stdlib PR for typetraits.getTypeid yglukhov/variant#5

Closed

timotheecour closed this Feb 3, 2020

timotheecour reopened this Feb 3, 2020

timotheecour closed this Feb 3, 2020

timotheecour reopened this Feb 3, 2020

timotheecour force-pushed the pr_typeid_getTypeId branch from aa18e1f to 7ce3925 Compare February 9, 2020 21:44

timotheecour changed the title ~~[pending #13303] new: typetraits.getTypeId~~ new: typetraits.getTypeId Feb 9, 2020

timotheecour marked this pull request as ready for review February 9, 2020 23:13

timotheecour mentioned this pull request Feb 11, 2020

followup for #389 yglukhov/nimx#390

Merged

timotheecour force-pushed the pr_typeid_getTypeId branch from 81b1027 to 3591954 Compare February 11, 2020 06:28

timotheecour reopened this Feb 24, 2020

timotheecour closed this Feb 24, 2020

timotheecour reopened this Feb 24, 2020

timotheecour mentioned this pull request Oct 14, 2020

explicit ID generation for easier IC #15559

Merged

stale bot added the stale Staled PR/issues; remove the label after fixing them label Apr 20, 2021

timotheecour force-pushed the pr_typeid_getTypeId branch from baaee47 to b2af1f1 Compare April 21, 2021 00:06

stale bot removed the stale Staled PR/issues; remove the label after fixing them label Apr 21, 2021

timotheecour added 8 commits June 7, 2021 21:01

* fix nim-lang#9855, fix nim-lang#9855, new: getTypeId, fix genericHead

7bdd1a7

int => string

ea456ca

fixup

68aab2a

fixup

69a23d3

use distinct string

0d58b06

add example; fix return type

4c11c3b

add detailed use case examples

9e70ecf

fixup

d3b0665

timotheecour force-pushed the pr_typeid_getTypeId branch from b2af1f1 to d3b0665 Compare June 8, 2021 04:02

update PR after rebase

36e5328

timotheecour added the Ready For Review (please take another look): ready for next review round label Jun 8, 2021

timotheecour mentioned this pull request Jun 11, 2021

std/prettyprints: pretty print any expression (including with cycles), customizable nim-lang/RFCs#385

Open

stale bot added the stale Staled PR/issues; remove the label after fixing them label Oct 14, 2022

stale bot closed this Nov 13, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

new: typetraits.getTypeId #13305

new: typetraits.getTypeId #13305

timotheecour commented Jan 31, 2020 •

edited

Loading

cooldome commented Jan 31, 2020 •

edited

Loading

timotheecour commented Jan 31, 2020 •

edited

Loading

Araq commented Feb 3, 2020

timotheecour commented Feb 3, 2020 •

edited

Loading

Varriount commented Feb 3, 2020

timotheecour commented Feb 3, 2020 •

edited

Loading

Araq commented Feb 4, 2020

Varriount commented Feb 5, 2020 •

edited

Loading

timotheecour commented Feb 5, 2020

Araq commented Feb 5, 2020

timotheecour commented Feb 5, 2020 •

edited

Loading

Varriount commented Feb 6, 2020 •

edited

Loading

timotheecour commented Feb 6, 2020

Araq commented Feb 6, 2020 •

edited

Loading

timotheecour commented Feb 9, 2020 •

edited

Loading

Araq commented Feb 10, 2020

Varriount commented Feb 10, 2020

timotheecour commented Feb 10, 2020

timotheecour commented Feb 22, 2020

Araq commented Feb 24, 2020

Araq commented Apr 20, 2020

Varriount commented Apr 20, 2020

stale bot commented Apr 20, 2021

timotheecour commented Jun 8, 2021 •

edited

Loading

disruptek commented Jun 8, 2021

timotheecour commented Jun 10, 2021 •

edited

Loading

n0bra1n3r commented Oct 14, 2021 •

edited

Loading

stale bot commented Oct 14, 2022

new: typetraits.getTypeId #13305

new: typetraits.getTypeId #13305

Conversation

timotheecour commented Jan 31, 2020 • edited Loading

cooldome commented Jan 31, 2020 • edited Loading

timotheecour commented Jan 31, 2020 • edited Loading

Araq commented Feb 3, 2020

timotheecour commented Feb 3, 2020 • edited Loading

passing a callback to a proc that handles multiple types:

exportc callback that handles multiple types

Varriount commented Feb 3, 2020

timotheecour commented Feb 3, 2020 • edited Loading

Araq commented Feb 4, 2020

Varriount commented Feb 5, 2020 • edited Loading

timotheecour commented Feb 5, 2020

Araq commented Feb 5, 2020

timotheecour commented Feb 5, 2020 • edited Loading

Varriount commented Feb 6, 2020 • edited Loading

timotheecour commented Feb 6, 2020

Araq commented Feb 6, 2020 • edited Loading

timotheecour commented Feb 9, 2020 • edited Loading

Araq commented Feb 10, 2020

Varriount commented Feb 10, 2020

timotheecour commented Feb 10, 2020

timotheecour commented Feb 22, 2020

Araq commented Feb 24, 2020

Araq commented Apr 20, 2020

Varriount commented Apr 20, 2020

stale bot commented Apr 20, 2021

timotheecour commented Jun 8, 2021 • edited Loading

disruptek commented Jun 8, 2021

timotheecour commented Jun 10, 2021 • edited Loading

n0bra1n3r commented Oct 14, 2021 • edited Loading

stale bot commented Oct 14, 2022

timotheecour commented Jan 31, 2020 •

edited

Loading

cooldome commented Jan 31, 2020 •

edited

Loading

timotheecour commented Jan 31, 2020 •

edited

Loading

timotheecour commented Feb 3, 2020 •

edited

Loading

timotheecour commented Feb 3, 2020 •

edited

Loading

Varriount commented Feb 5, 2020 •

edited

Loading

timotheecour commented Feb 5, 2020 •

edited

Loading

Varriount commented Feb 6, 2020 •

edited

Loading

Araq commented Feb 6, 2020 •

edited

Loading

timotheecour commented Feb 9, 2020 •

edited

Loading

timotheecour commented Jun 8, 2021 •

edited

Loading

timotheecour commented Jun 10, 2021 •

edited

Loading

n0bra1n3r commented Oct 14, 2021 •

edited

Loading