Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify clone, new_ref, duplicate and duplicate_deep semantics #712

Open
Bromeon opened this issue Mar 17, 2021 · 11 comments
Open

Clarify clone, new_ref, duplicate and duplicate_deep semantics #712

Bromeon opened this issue Mar 17, 2021 · 11 comments
Labels
breaking-change Issues and PRs that are breaking to fix/merge. c: core Component: core (mod core_types, object, log, init, ...) documentation quality-of-life No new functionality, but improves ergonomics/internals
Milestone

Comments

@Bromeon
Copy link
Member

Bromeon commented Mar 17, 2021

From #711:

@Bromeon:

Quick question regarding method names, we have

  • Variant::clone()
  • GodotString::clone()
  • Dictionary::duplicate()
  • VariantArray::duplicate()
  • VariantArray::duplicate_deep()

And for ref-increment we use new_ref(), if I'm not mistaken.

From a user perspective, maybe we should highlight this somewhere in the documentation?

Is there a semantic difference between clone and duplicate[_deep] or is the idea "when it supports both shallow and deep copy, use duplicate/duplicate_deep, otherwise use clone?

@toasteater:

I'm aware that it's a bit messy right now. For clone vs duplicate, my understanding is that Clone on a reference-counted type should be a reference increment (i.e. not even a shallow copy), to be consistent with standard types like Rc, Arc, even though the reference-counting is "internal" in the case of variant collections. That does make NewRef seem redundant, and I think that would warrant a separate issue for discussion.


I agree with the Rc consistency, but keep in mind that in the standard library, the boundary between value type and reference-counted type is much clearer than in Godot. Rc's only function is to reference-count a value, it's even in the name. For something like VariantArray or Dictionary, it's more an implementation detail due to how Godot works, and it wouldn't surprise me if some users expected them to behave like standard collection types (i.e. values).

So we make a choice by being consistent with either smart pointer types, or with collection types. Dictionary and friends seem to me closer to collections than to smart pointers.

In the Rc documentation, it is even recommended to use the associated function call syntax, to make it clearer that no deep copy is performed:
grafik

This indicates to me that the Rc::clone() method is indeed considered a special case, and the x.clone() syntax generally means "copy the whole contents of x".

@ghost ghost added c: core Component: core (mod core_types, object, log, init, ...) documentation feature Adds functionality to the library labels Apr 27, 2021
@ghost ghost added this to the 0.10 milestone Apr 27, 2021
@Waridley
Copy link
Contributor

Waridley commented Jun 9, 2021

There are some examples of internal reference-counting throughout the Rust ecosystem, e.g. wasmtime, though there aren't many (any?) in std.

I have thought recently that there kind of should have been 4 distinctions in Rust for copy/clone semantics: cheaply trivially copyable (register load), trivially mem-copyable (a memcpy results in a valid standalone object, even if it's big), and deep-clone vs reference-counting. But alas, we're stuck with what we've got.

The only function in std I know of that was designed to not take self specifically so that it doesn't shadow any function names of what it dereferences into is Box::leak. I don't know if this was a result of wishing they had done this for Rc::clone, or if they felt it was more important for leak for some reason.

@Waridley
Copy link
Contributor

Waridley commented Jun 9, 2021

Also, there are a lot of users that get started with godot-rust by trying to convert GDScript code to Rust, so keeping the API close to what it is in GDScript has a lot of value.

@Bromeon
Copy link
Member Author

Bromeon commented Jun 9, 2021

I have thought recently that there kind of should have been 4 distinctions in Rust for copy/clone semantics: cheaply trivially copyable (register load), trivially mem-copyable (a memcpy results in a valid standalone object, even if it's big), and deep-clone vs reference-counting. But alas, we're stuck with what we've got.

I don't see much benefit for register/memcpy differentiation, as this will also depend on the architecture and only differs in performance, not semantics. But for the others I agree, and additionally some types differ between deep- and shallow-clone (see below).

Also, there are a lot of users that get started with godot-rust by trying to convert GDScript code to Rust, so keeping the API close to what it is in GDScript has a lot of value.

The same argument can be made for people already familiar with Rust, and picking up Godot as an engine. Furthermore, staying idiomatic in the target language usually leads to less rough edges than trying to port one language's idioms to another. But of course there are different opinions, a compromise makes sense here to reduce interop friction on both ends of Godot <-> godot-rust <-> Rust.


My attempt at categorizing different copy semantics goes as follows:

1. Simple types (Vector3, Color, ...)

Bitwise copies are the way to go. Rust does this with the Copy trait, and thus implicitly Clone.

2. Ref<T> with Godot classes (Object, Reference, Node, ...)

These types are only ever referred to by reference, a user cannot hold values. Most of the time, the instances behind those references are not copied. For cases where deep-copying is needed, the custom function Node.duplicate() should probably suffice.

Some of those classes (namely the Reference derivates) are reference-counted in Ref<T>.
This leaves two options to explicitly increase the ref-count:

  • value.new_ref(), stating explicitly that nothing is cloned.
  • value.clone(), analogous to Rust's Rc.

While the 2nd can be written as Ref::clone(value) to clarify that there's no deep-copy, the 1st one new_ref() is more expressive from the start. The disadvantage of new_ref() however is that the standard library and third party libraries don't know this trait, so something like Vec<Ref<T>> would not be clonable.

3. Ref-counted collections (VariantArray, Dictionary)

Note:
Rust VariantArray == GDScript Array
Rust TypedArray<*> == GDScript Pool*Array

In GDScript, we have Array.duplicate(deep: bool=false) and Dictionary.duplicate(deep: bool=false). Additionally, other methods such as the constructors from pool arrays or append_array() suggest that when copying is involved, shallow copies are the default. For simple types as elements, there is no difference anyway.

Both VariantArray and Dictionary are reference-counted in Rust. They use the Access type-state to infer whether mutating operations are safe.

Both currently implement duplicate() (shallow-copy), but as of 0.9.3 not duplicate_deep() or clone() (there was a PR about this).

4. CoW collections (TypedArray)

The generic equivalent of GDScript's Pool*Array classes is a bit different:

  • clone() with deep-copy semantics (in fact CoW, but from a user perspective it behaves like a deep copy as soon as mutation occurs).
  • new_ref() for explicit ref-count increment.
  • Special scope-based write() and read() methods, allowing to batch multiple operations in one CoW copy

Since TypedArray doesn't hold complex types but rather Copy ones like Vector3, it doesn't need to differentiate between deep- and shallow-copy.


Rust methods that do the same thing, should ideally have the same name (and vice versa).
This is not always possible, so here my personal opinion:

  • 1. simple types:
    • r.clone() already does the right thing, nothing else needed.
  • 2. Ref<T> with Object hierarchy:
    • r.new_ref() to increment ref-count, no clone()
      If this is too restrictive as it makes Ref non-clonable, then r.clone() with the suggestion to use Ref::clone(r).
    • existing r.duplicate() as a special case for nodes (this is not really cloning, as it also takes extra arguments)
  • 3. ref-counted collections:
    • c.new_ref() to increment ref-count
    • c.clone() for shallow-copy
    • c.clone_deep() for deep-copy
  • 4. CoW collections:
    • c.new_ref() to increment ref-count
    • c.clone() for shallow-copy (= deep-copy)

@Waridley
Copy link
Contributor

Waridley commented Jun 9, 2021

clone() with deep-copy semantics (in fact CoW, but from a user perspective it behaves like a deep copy as soon as mutation occurs).

It's not quite deep-copy with StringArray, right? I'm assuming since GodotString is reference-counted, it would just increment the ref count on each string.

@Waridley
Copy link
Contributor

Waridley commented Jun 9, 2021

My personal opinion is that Ref::clone should stay. I think it's similar enough to Rc that users would intuitively expect that it's cheap to clone, and it's pretty rare that you would want to duplicate anything it references.

For collections, my opinion is a bit less clear. I'd almost want something like a collection-specific reference type, so CollectionRef<Dictionary> (with a better name) could be cloned to increment the ref count, while dereferencing it to get &Dictionary would allow duplicating it with .clone() (then we'd still have to debate whether this should be shallow or deep). This would be quite different from the GDScript API, but it seems like dictionaries and arrays want to behave like HashMaps and Vecs, and if you ever want to reference-count those you wrap them in an Rc or Arc.

@ghost
Copy link

ghost commented Jun 10, 2021

I'm assuming since GodotString is reference-counted, it would just increment the ref count on each string.

GodotString is also immutable, so there's no semantic difference between shallow- and deep-copying.

@Bromeon Bromeon mentioned this issue Aug 27, 2021
5 tasks
@Bromeon Bromeon added the breaking-change Issues and PRs that are breaking to fix/merge. label Aug 29, 2021
@Waridley
Copy link
Contributor

The more I think about this, the more I think something akin to Ref<Dictionary>, etc. would be beneficial. Ref is already used for internally reference-counted types. Even though that's due to the unfortunate possibility of any GodotObject reference being either ref-counted or manually-managed, I think it would feel more consistent to do the same for collections, rather than put the Access parameter on the collection type itself. I'm still undecided whether the collections themselves should implement Clone, or just have the duplicate method, though. Leaning towards duplicate to prevent accidental deep clones.

@Bromeon
Copy link
Member Author

Bromeon commented Sep 16, 2021

That's an interesting take. It would imply quite some refactoring of the type system though, as Ref & Co. are currently meant for GDNative classes (the Object hierarchy). They're quite a bit more than just glorified Rc smart pointers, a large part of the methods has to do with casting and accessing Object-specific functionality.

Might be doable with another type state, but definitely needs more thinking 💡

@Waridley
Copy link
Contributor

Waridley commented Sep 16, 2021

@Bromeon yeah, it's possible the refactoring effort alone would make it not worth the change. However, I am thinking a separate type that's similar to Ref could be used specifically for collections, without changing Ref itself. Coming up with a short name that makes sense but can't be confused with TRef is difficult, though 😆. Ref probably could be re-used if it was like this from the beginning, but I do think it would be too difficult to change now.

edit: to be clear, the new type would also have to have a name that reminded users of Ref in order for the consistency benefits to be meaningful at all. My goal is to remove one thing for users to learn (that collections have an Access parameter that defaults to Shared because they are internally reference-counted)

@Bromeon
Copy link
Member Author

Bromeon commented Sep 16, 2021

I'm not opposed to change if it makes the resulting type system simpler. The relation between Ref, TRef, Instance, RefInstance may not be the most intuitive, and the Access type state adds another dimension to this. But a lot of the complexity exists in godot-rust due to lenient C++ rules, and because the alternative would be pushing complexity onto the user's code.

It sometimes makes sense to reflect upon a design based on first principles, and see if components can be further decoupled. We should definitely try to reuse the knowledge that went into the current design. The current model is already great at isolating some behaviors:

  • Availability -- expressed through smart pointer type
    • "weak reference" (Ref, tracking the Godot object)
    • "strong reference" (TRef, temporarily accessing the object)
    • "direct access" (&T, basically dereferenced TRef but without the book-keeping)
  • Ownership -- expressed through Access type state
    • shared (multiple owners)
    • thread-local (multiple owners in 1 thread)
    • unique (exactly 1 owner)
  • Memory -- expressed through T type parameter
  • Copy semantics -- implied by base type (smart pointer, Dictionary, etc.)
    My comment above attempts to categorize types based on copy semantics.

Maybe there are more layers, or the existing ones can be arranged differently?
Note that these boundaries are a bit different than with Rust's own model -- but the latter could also be used as inspiration.

@Bromeon Bromeon added quality-of-life No new functionality, but improves ergonomics/internals and removed feature Adds functionality to the library labels Oct 30, 2021
@chitoyuu
Copy link
Contributor

I think the idea of using a wrapper type to indicate variant reference counting sounds interesting. The behavior of deep duplication wrt. objects is consistent with the cloning of Rust collections that contain Refs, which is also nice (and a good reason to keep Ref::clone).

The biggest issue with this approach seems to be the extra cognitive burden of learning "by heart" yet another *Ref type. Adding the wrapper in the type signature also makes usage of the variant types more verbose.

Another possible point of contention I see is whether we should allow users to obtain bare versions of the values. This needs to be allowed if we want to implement Clone for the bare types through duplicate(deep = true) as proposed here in #712 (comment), but doing so can be seen as inconsistent with what is done for the API objects. This also leads to the question of what an owned value should mean, since the variant types are always reference-counted internally. We could interpret them as values with a reference count of exactly 1, but that would result in further confusion with BikeshedRef<T, Unique>.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change Issues and PRs that are breaking to fix/merge. c: core Component: core (mod core_types, object, log, init, ...) documentation quality-of-life No new functionality, but improves ergonomics/internals
Projects
None yet
Development

No branches or pull requests

3 participants