
Thinking about default "not nil" #6638

Open
kvinwang opened this issue Oct 31, 2017 · 30 comments

Comments

@kvinwang
Contributor

kvinwang commented Oct 31, 2017

I noticed that "default 'not nil'" is no longer in the 1.0 battle plan, but this feature is really the one I am looking forward to the most, because I think it would improve code quality across Nim's entire ecosystem.

Is that because this feature is hard to implement, or because it would introduce too many breaking changes?

We could introduce new types (e.g. String) in the stdlib that are not nil by default, encourage developers to use them, and deprecate the old ones.

My practice:
Currently, default not nil is used throughout my app. It feels good because I don't have to pay much attention to handling nil values here and there.

The way I am doing this:

type String = string not nil
type SomeThingOrNil = ref object
type SomeThing = SomeThingOrNil not nil

proc doSomeThing(self: SomeThing, param: String): String =
   prove "a" & "b" 

and define some generic prove procs:

proc expect*(v: string, msg: string not nil): string not nil =
  if v.isNil:
    raise newException(ValueError, msg)
  else:
    result = v
...

proc prove*(v: string): string not nil =
  v.expect("Unwrap a nil string")

proc prove*[T](v: ref T): ref T not nil =
  v.expect("Unwrap a nil value")

proc prove*[T](v: seq[T]): seq[T] not nil =
  v.expect("Unwrap a nil value")
@Araq
Member

Araq commented Oct 31, 2017

Will introduce too many breaking changes?

That's my fear yes. Also with the release of the "Nim in Action" book I'd like to keep version 1 compatible with the language that is described in the book. It's obvious that Nim version 2 will come sooner rather than later with breaking changes (I learned too many things during Nim's design ;-) ) and a new runtime implementation and so I thought we can do the "not nil" thing in the later versions properly.

Your code is a good idea but seems too much of a workaround to embrace it officially.

@kvinwang
Contributor Author

kvinwang commented Nov 1, 2017

@Araq
I can imagine that Nim 2 will be perfect.
But when will we reach 2.0? Nobody knows.

If there is an easy way to introduce default "not nil" without breaking changes, why not?
We only need to do something like:

  • Introduce an "or nil" beside "not nil" to turn things nullable.
  • Introduce String and Seq (and optionally nref), which are not nil by default.
  • Encourage people to use them in the docs.

There are 3 benefits:

  • It eases coding (taking care of nil values everywhere is a mental burden).
  • It improves code quality (especially in third-party code).
  • It makes the Nim 1-to-2 migration easier (think of the pain of Python's 2to3).

@kobi2187
Contributor

kobi2187 commented Nov 8, 2017

Some languages add '?' to the type to signify that it can accept nil.
In the Cobra language there is an operator 'to', used like this: x to ? and x to !.
If x is nil and you make it not nil (exclamation !), it is like a cast and can raise an exception at runtime, so it is assumed you know what you're doing.
But there is another advantage: after checking if x != nil, the following block assumes the type is already non-nil, and type checking allows you to omit such checks.
This is called type narrowing, or flow-sensitive typing.
Other languages, like Ceylon, have this too (but more verbosely).
The purpose is compile-time safety for nil values.
Also, a proc can have simpler code because it doesn't need to require that, for example, a string argument is not nil; the user cannot pass a string? without explicit (to !) or implicit narrowing (if not nil).
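The narrowing described above can be sketched in today's Nim with the experimental not nil annotation (a hedged sketch with illustrative names; how much of it the compiler's flow analysis accepts varies by version):

```nim
# Sketch of flow-sensitive nil narrowing using Nim's experimental
# `not nil` annotation. `Node` and `greet` are illustrative names.
{.experimental: "notnil".}

type Node = ref object
  name: string

proc greet(n: Node not nil) =
  echo n.name

proc tryGreet(n: Node) =
  if n != nil:
    # inside this branch the compiler can treat `n` as not nil,
    # so it may be passed where `Node not nil` is required
    greet(n)

tryGreet(Node(name: "hi"))  # prints "hi"
tryGreet(nil)               # prints nothing: the nil branch is skipped
```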

It could be that it complicates the language too much, to treat nil so specially.
I am just writing for reference and have no strong opinion here.

@jkaye2012

I realize that this is an old issue, but I have been looking into Nim recently and this issue is really one of the only problems I see with the language currently from my perspective. nil is such a dangerous and pervasive source of bugs in code that defaulting everything to not accept it feels like it would make the language so much safer in general.

I understand that it may be difficult and obtrusive to make a change like this, but I do feel that it's something that should be considered!

@foliot

foliot commented Apr 25, 2019

FYI as a source of inspiration: C# 8 is introducing non-nullable reference types after many years of not having that feature. To avoid breaking everything, you have to use a compiler flag to turn it on, and it can be disabled in regions of code as well.

I have no idea how hard this would be to do for Nim, but it could allow “default not nil” types to be introduced safely and gradually without needing a Nim 2.0.

@krux02
Contributor

krux02 commented Apr 25, 2019

I don't think this feature has the relevance it once had. string and seq are not nilable anymore; only for ref types would not nil still make sense. But objects are better passed around as non-ref types, which are already not nilable. I think this is a much better solution than sprinkling everything with not nil.

@kvinwang can this issue be closed? It really is outdated in its current form.

@bluenote10
Contributor

But objects are better passed around as non ref types which are already not nilable

That's not possible if you need runtime polymorphism. Also in my experience that's not what most existing Nim code does.

@alehander92
Contributor

I wouldn't close this: it's a valid issue, and the non-ref workaround is good only for some cases/situations. The fact is that Nim code in the wild is full of ref/recursive types (just look at case objects), and people will keep using them all the time if the language provides them.

@alehander92
Contributor

btw @jkaye2012 @bluenote10 I worked on an alternative notnil/ref nilcheck branch last winter: https://github.com/alehander42/Nim/tree/ref-nilcheck. It was almost mergeable and seemed to work, but it still had some gotchas (the plan was to use a pragma to limit the checking to sections of the codebase).
On the other hand, the approach was always going to miss some cases, as it was mostly based on annotations + local flow analysis, similar to https://github.com/isocpp/CppCoreGuidelines/blob/master/docs/Lifetime.pdf.

So I am just adding this here to document an additional approach to the problem. I am not sure it still makes sense given the new ref/runtime ideas, but if somebody finds something valuable in it, it can be improved/reused.

@krux02
Contributor

krux02 commented Apr 25, 2019

Well, I never used ref in any of the projects that I started. Since Nim no longer has the nil state for string and seq, we really are in a world where we never need to check for nil on anything. This not nil thing really seems to be a hacky solution to a problem that I simply don't have.

... the non-ref workaround is good only for some cases/situations.

The non-ref is not a workaround, it is the better default that works for most cases (all of the code bases that I started).

The fact is that Nim code in the wild is full of ref/recursive types(just look at case objects)

Well it is true that Nim code in the wild is full of ref types. I think that is mostly the fault of the bad documentation in the past that recommended ref types as a good default. I disagree on that, and I also updated the tutorial to not recommend ref types as a default anymore. It is still explained. It is also true that there are some cases of recursive types where a ref type is currently the best option, but would not nil really help here? Recursive types need to be nil-able. I don't see how not nil would improve here anything. And for case objects, they are completely orthogonal to ref types.

people will keep using them all the time if the language provides them

I hope people will use them more carefully, only when they really need to, as ref types come with a lot of disadvantages. It is not just that they are nilable and you need to think about the nil state everywhere; they also consume more memory, and they are slower because they have to be allocated and deallocated on the heap.
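A minimal sketch of the contrast being argued here, with illustrative names: a plain object value has no nil state at all, while a ref of the same shape defaults to nil and must be checked.

```nim
# Illustrative comparison of a value object vs. a ref object.
type
  PointVal = object
    x, y: int
  PointRef = ref object
    x, y: int

proc sumVal(p: PointVal): int =
  p.x + p.y                # no nil state to worry about

proc sumRef(p: PointRef): int =
  if p == nil: 0           # the nil case must be handled
  else: p.x + p.y

echo sumVal(PointVal(x: 1, y: 2))  # 3
var r: PointRef                    # default-initialized to nil
echo sumRef(r)                     # 0
```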

@jkaye2012

jkaye2012 commented Apr 25, 2019

At the end of the day, if people have an option to do something, they're going to do it. So, if ref types are an option, and ref types are nil-able by default, you're going to end up with a lot of people using nil-able ref types all over the place no matter what documentation or best practices say. This is unfortunate, but I believe it's also a proven reality.

I don't follow why anything has to be nil-able (you referenced recursive types above, but is that not an implementation detail that could be changed?). The conversation should really focus more on what value nil is providing (or, conversely, what problems it is causing).

I also think it makes sense that it's not a problem you personally have. As a contributor and very experienced Nim developer, I would posit you're actually much less likely to be bitten by the kind of issue that nil causes than the average programmer picking Nim up. IMO, the idea should be to make it as easy as possible for things to be correct in the average case rather than optimizing for those who really know what they're doing.

@bluenote10
Contributor

Well, I never used ref in any of the projects that I started.

What about NimNode?

# bad example but still...
import ast_pattern_matching
static:
  var n: NimNode
  matchAst(n)

@krux02
Contributor

krux02 commented Apr 25, 2019

At the end of the day, if people have an option to do something, they're going to do it. So, if ref types are an option, and ref types are nil-able by default, you're going to end up with a lot of people using nil-able ref types all over the place no matter what documentation or best practices say. This is unfortunate, but I believe it's also a proven reality.

Well, when the documentation explicitly states the alternatives to ref types and the disadvantages of ref types, and the programmer is going to ignore all of that advice and still use ref everywhere, then I am sorry, there is nothing that can be done about that.

I don't follow why anything has to be nil-able (you referenced recursive types above, but is that not an implementation detail that couldn't be changed?). The conversation should really focus more on what value nil is providing (or, conversely, what problems is it causing).

Things have to be nilable, simply because you have to construct the values somehow. And what value do ref types have when they are not yet assigned to?

I also think it makes sense that it's not a problem you personally have. As a contributor and very experienced Nim developer, I would posit you're actually much less likely to be bitten by the kind of issue that nil causes than the average programmer picking Nim up. IMO, the idea should be to make it as easy as possible for things to be correct in the average case rather than optimizing for those who really know what they're doing.

Well, what is the average programmer? Someone who learned python before? Someone who learned Java before? Someone who learned Javascript before? Someone who learned c++ before? People usually complain that they can't apply their way of thinking from the programming language they come from. People who come from languages where everything is a ref apply this way of programming to Nim. And then they want the solutions from those programming languages also in Nim, even though Nim might already have better solutions for it.

@krux02
Contributor

krux02 commented Apr 25, 2019

What about NimNode?

I did not introduce the type NimNode, I just use it, because that is what the compiler provides.

@jkaye2012

Well, when the documentation explicitly states the alternatives to ref types and the disadvantages of ref types, and the programmer is going to ignore all of that advice and still use ref everywhere, then I am sorry, there is nothing that can be done about that.

Sure there is. Make it more difficult to do the thing that shouldn't be done than the thing that should be done. The language defaults should enforce what you would want people to do; one should not have to rely on documentation for that.

Things have to be nilable, simply because you have to construct the values somehow. And what value do ref types have when they are not yet assigned to?

A non-nilable type can never be unassigned; that's actually one of the major points of the entire exercise.

I don't have enough experience with the language to know what you mean by "you have to construct the values somehow". What aspect of that requires nil?

Well, what is the average programmer? Someone who learned python before? Someone who learned Java before? Someone who learned Javascript before? Someone who learned c++ before? People usually complain that they can't apply their way of thinking from the programming language they come from. People who come from languages where everything is a ref apply this way of programming to Nim. And then they want the solutions from those programming languages also in Nim, even though Nim might already have better solutions for it.

It sounds like you might agree that most developers are going to tend towards what they already know from other languages. If that's the case, wouldn't you want Nim to instead guide them towards the "better" way of doing things? Generally, I think that developers want to be able to get things done and that their default way of trying to do that is going to be applying what they already know (thus why they would reach for ref). If, instead, the default semantic were more aligned with how you would "want" it to work, it acts as a big red flag to that individual that they may be doing something in a not-so-great way.

@bluenote10
Contributor

the programmer is going to ignore all of that advice and still use ref everywhere, then I am sorry, there is nothing that can be done about that.

Many things in Nim do enforce ref types:

  • recursive types
  • dynamic dispatch
  • heterogeneous collections

Or how would you solve

proc operateOn(entities: seq[Entity])

without refs?

I don't think it's fair to blame users for using ref, as it is often a necessity. Personally I hate using refs because of the mutability aspect, and still I end up using them a lot because there is often no alternative.

@krux02
Contributor

krux02 commented Apr 26, 2019

@bluenote10 You don't need ref types for dynamic dispatch; you can pass them as non-ref types. And for your operateOn, you can use either case objects (this will save a pointer indirection, and you will be more cache friendly because of linear memory) or a different collection for each Entity type (even better for performance, because it also saves you the padding cost of case objects).

So all that is left are recursive types.

@bluenote10
Contributor

You don't need ref types for dynamic dispatch.

In theory yes, but in practice that's not true. Even though you can define methods on non-ref types, it doesn't make much sense because you cannot upcast classes to their base type. For instance:

type
  Entity = object of RootObj
  A = object of Entity
    a: int
  B = object of Entity
    b: int

method doit(e: Entity) {.base.} =
  echo "base"
method doit(e: A) =
  echo "A"
method doit(e: B) =
  echo "B"

# yes we can define methods on them, but what's the point 
# if we cannot treat an `A` as an `Entity`?
let e: Entity = A()
# crashes at runtime with: 
# Error: unhandled exception: invalid object assignment [ObjectAssignmentError]

In my experience, dynamic dispatch is more or less useless without ref.

Case objects violate the open-closed principle and are not an option in cases that require unbounded polymorphism.
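For contrast, a hedged sketch of the same hierarchy using ref types, where the upcast is allowed and dispatch works as expected (at the price of reintroducing nil):

```nim
# The same shape of hierarchy with ref types: the upcast to the base
# type is fine, and method calls dispatch on the runtime type.
type
  Entity = ref object of RootObj
  A = ref object of Entity
    a: int
  B = ref object of Entity
    b: int

method doit(e: Entity): string {.base.} = "base"
method doit(e: A): string = "A"
method doit(e: B): string = "B"

let e: Entity = A(a: 1)   # allowed for ref types, unlike the value version
echo e.doit()             # prints "A" via dynamic dispatch

var e2: Entity            # but ref also brings back the nil default:
# e2.doit()               # would crash with a nil access at runtime
```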

@krux02
Contributor

krux02 commented Apr 26, 2019

require unbounded polymorphism.

Do you really require unbounded polymorphism? Or do you really want to use unbounded polymorphism no matter the costs?

@andreaferretti
Collaborator

The typical use case for me is combinators. Things that you can put together to form a tree of things. This happens a lot, for instance while constructing:

  • parser combinators
  • routing combinators (like I do in Rosencrantz)
  • neural network layers

All these things have in common that they can be built out of smaller similar things. This requires storing them in some heterogeneous sequence. It also requires that the user is able to define new building blocks, which requires unbounded polymorphism. And of course one needs dynamic dispatch to define recursive methods on these objects.

This use case is so important that sometimes one has to bite the bullet and use refs (or closures, or some equivalent technique)
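The combinator use case described above can be sketched roughly as follows (hypothetical names, not from any real library): heterogeneous children force a ref base type, and users can add new subtypes freely.

```nim
# Hypothetical sketch of the combinator pattern: a heterogeneous
# sequence of combinators requires a common ref base type.
type
  Parser = ref object of RootObj
  Lit = ref object of Parser
    ch: char
  Chain = ref object of Parser
    parts: seq[Parser]       # heterogeneous sequence of combinators

method matches(p: Parser, s: string, pos: var int): bool {.base.} =
  false

method matches(p: Lit, s: string, pos: var int): bool =
  if pos < s.len and s[pos] == p.ch:
    inc pos
    result = true

method matches(p: Chain, s: string, pos: var int): bool =
  result = true
  for part in p.parts:
    if not part.matches(s, pos):
      return false

let ab = Chain(parts: @[Lit(ch: 'a').Parser, Lit(ch: 'b').Parser])
var pos = 0
echo ab.matches("ab", pos)   # true
```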

@Araq
Member

Araq commented Apr 29, 2019

not sure if it still makes sense noting the new ref/runtime ideas, but if somebody finds something valuable in it, it can be improved/reused

Nothing changed for ref and nil wrt the new runtime, afaict. We need the nil state to disarm pointers, but that only means these can be nil and so would require a nil check before deref. Doesn't seem to be too hard nor too cumbersome.

@timotheecour
Member

Can this be closed? Only ptr/ref/pointer objects can be nil now that isNil gives a compile-time error for string, seq, etc. [1].
ptr, pointer, and cstring must be able to be nil by default (if anything, for FFI), and nil should be a valid state for ref objects as well, e.g. for building linked lists/trees without the overhead of using object variants.

[1] Note that the only exception is IntSet, which we can deprecate as well (we should use isDefault (#13526) or a == default(IntSet) for that).
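A small sketch of the current state described above (modern Nim; the Node type is illustrative):

```nim
# string and seq are value types now: default-initialized, no nil state.
var s: string        # starts as "", not nil
var xs: seq[int]     # starts as @[], not nil
doAssert s == ""
doAssert xs.len == 0
# echo s.isNil       # compile-time error in modern Nim

# ref types keep nil as their default, e.g. for linked structures:
type Node = ref object
  next: Node         # nil marks the end of the list
var n: Node
doAssert n == nil
```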

@zah
Member

zah commented Apr 1, 2020

@timotheecour, it's not strictly necessary to use object variants to represent nilable values. One particular optimisation I'd like to see is for the Option type to detect when it is being instantiated with a not nil type and then make the option value a regular pointer (where 0 means isNone).

You can do a similar trick for range types and perhaps through a general mechanism that's able to generate an "invalid value" of a particular type through some deterministic algorithm.

In the future, such an optimisation can become important for types such as Option[lent T] and Option[var T] that cannot be expressed in any other way.
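The optimisation described above can be sketched by hand (illustrative code, not the stdlib Option): for a not-nil ref payload, nil itself can encode the None case, so no discriminator field is needed and the wrapper stays pointer-sized.

```nim
# Hand-rolled "niche" option for ref payloads: nil encodes None,
# so the wrapper is exactly the size of a bare ref.
type
  NicheOpt[T] = object
    p: ref T                    # nil means "none"

proc noneRef[T](t: typedesc[T]): NicheOpt[T] =
  NicheOpt[T](p: nil)

proc someRef[T](v: ref T): NicheOpt[T] =
  assert v != nil               # payload must be "not nil"
  NicheOpt[T](p: v)

proc isSome[T](o: NicheOpt[T]): bool = o.p != nil

var x = new(int)
x[] = 42
doAssert someRef(x).isSome
doAssert not noneRef(int).isSome
doAssert sizeof(NicheOpt[int]) == sizeof(ref int)  # no extra flag byte
```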

@timotheecour
Member

timotheecour commented Apr 1, 2020

I don't see how you'd be able to express this efficiently without nil:

type Node[T] = ref object # a linked list
  next: Node[T]
  payload: T # eg: uint8

  • an object variant has overhead
  • Option[T] has overhead

You can do a similar trick for range types and perhaps through a general mechanism that's able to generate an "invalid value" of a particular type through some deterministic algorithm.

  • there is no invalid value in the payload, in the general case (eg: T: uint8); even if T is cstring, a nil payload can still be considered a valid value (in this example, you'd have a node that contains a nil cstring as payload)

@Araq
Member

Araq commented Apr 1, 2020

I'm leaving this issue open. I'm leaning towards a secondary "type system" via DrNim.

@zah
Member

zah commented Apr 1, 2020

@timotheecour, this example would be just

type Node[T] = ref object # a linked list
  next: Option[Node[T]]
  payload: T # eg: uint8

My point was that Option[ref T not nil] can be optimised to be the same thing as just ref T. A lot of people will dislike this proposal, because it comes with a syntactic penalty of using the more cumbersome Option type, but there is a way to address its issues as well. If the compiler can prove that Option[T] is not None in a certain if/case branch, it can allow automatic dereferencing that makes it syntactically equivalent to ref T:

let path = getOptionalFilePath()
if path.isSome:
  var f = openFile(path) # path.get (or path[]) is called behind the scenes here

If the compiler treats not nil as a pre-condition that needs to be proved, you can try to define this logic in a very generalized way that will allow user-defined types such as Option, Result/Either, SmartPtr, etc to plug into the system. All such types can be subjected to automatic dereferencing in the "not nil" case, which will make them a pleasure to use.

@alehander92
Contributor

Wouldn't it be more like: path is dereferenced once, and that new value is used in the branch?

@timotheecour
Member

timotheecour commented Apr 4, 2020

it can allow automatic dereferencing that makes it syntactically equivalent to ref T:
All such types can be subjected to automatic dereferencing in the "not nil" case, which will make them a pleasure to use.

Automatic dereferencing for Option[T], just like implicit conversions, sounds great at first, until you realize it's a source of endless bugs and corner cases; just look at #13790 (comment) with --experimental:implicitDeref.

and here's an example with your suggested automatic dereferencing for Option[T]:

import std/options
proc fun(opt: Option) =
  assert opt.isSome
  echo opt.type.T
type Foo[T] = object
  x: int
fun(some(Foo[int](x: 1)))

this would print Foo[system.int] or int depending on whether -d:release is used.

(I'm not talking about automatic dereferencing for accessing fields of ref objects, that's a much saner language feature and many languages do this too)

@zah
Member

zah commented Apr 4, 2020

@timotheecour, the idea is not to change the type of the symbol after the assert, but just to allow dereferencing for the purposes of signature matching (and potentially to allow that only when the not nil property has been proven). This means that a program similar to yours will compile in debug mode, but it will trigger a diagnostic in release mode, which sounds pretty correct to me.

With that said, I'm still a bit confused by your example. Why should the type operator trigger automatic dereferencing? It doesn't have any preference for regular values over option types.

FWIW, Rust has user-extensible automatic dereferencing and my understanding is that it helps them a lot in the jungle of pointer types there:

https://stackoverflow.com/questions/28519997/what-are-rusts-exact-auto-dereferencing-rules
https://doc.rust-lang.org/1.29.0/book/first-edition/deref-coercions.html

With ARC coming along, we are headed towards a similar jungle.

@iacore
Contributor

iacore commented Oct 19, 2021

+1 for not-nil ref by default, and Option[ref A] if you need nil. Probably the same for ptr?
