Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Memory type #51319

Merged
merged 8 commits into from
Oct 27, 2023
Merged

Add Memory type #51319

merged 8 commits into from
Oct 27, 2023

Conversation

oscardssmith
Copy link
Member

@oscardssmith oscardssmith commented Sep 14, 2023

@brenhinkeller brenhinkeller added the arrays [a, r, r, a, y, s] label Sep 14, 2023
@vtjnash vtjnash marked this pull request as draft September 14, 2023 19:17
base/array.jl Show resolved Hide resolved
base/boot.jl Show resolved Hide resolved
base/boot.jl Outdated Show resolved Hide resolved
base/essentials.jl Outdated Show resolved Hide resolved
## genericmemory.jl: Managed Memory

"""
GenericMemory{isatomic, T} <: AbstractVector{T}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor note that the name GenericMemory doesn't really make it obvious that this subtypes AbstractArray. Perhaps that's intentional (i.e. we don't want people to treat these too strongly as "arrays").

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really think this shouldn't subtype AbstractVector. I doubt you want to use a "named" GenericMemory as input to all kinds of linear algebra functions etc. This should IMO be a very "basic" type with not many methods defined on it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The punning of "mathematical vector" vs. "generic store of a collection of objects" on AbstractVector is indeed a bit unfortunate. This could be a good opportunity to disentangle the two - the question is just, what to do with existing uses of AbstractVector? We may have to accept the punning, as GenericMemory really does fulfill the minimal required interface..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the question is just, what to do with existing uses of AbstractVector?

Nothing? All I am saying is that I don't think the existing generic methods written for AbstractArray will be very useful for GenericMemory. Even though you can implement size and getindex, it doesn't mean you have to make it an AbstractArray because you then opt into things that you might not want.

Many methods on AbstractArray return an Array unless you specialize them but it seems that is rarely something you want to happen if you have a GenericMemory. So people will start overloading these methods to return a GenericMemory etc and this just doesn't seem like something you want to deal with.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, yes. I'm on board with keeping this as its own thing, and having actual array semantics on different types on top of this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, it is much more descriptive of a mathematical vector than Vector was, since Vector adds lots of features as "general collection of objects" which are not mathematically-oriented and do not exist for Memory.

You want to have a basic memory type that is free from even being thought about in the same way as a mathematical vector. If you want you can wrap something on top of Memory that you consider an AbstractVector but the core memory type should not be passable into ForwardDiff.gradient and to .+ and a bunch of other stuff that doesn't make sense for it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the counterpoint is that given that it has a size, and getindex, you will get sensible results out of all of those.

Copy link
Member

@KristofferC KristofferC Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not really a counterpoint. A Char implements those method but is not a subtype of AbstractArray. The question is not what methods it implements, it is what this thing actually represents. And I think this should be free from any association with anything "mathy". ::Memory + ::Memory doesn't really make sense at a conceptual level.

Just to bring up some questions that need to be answered:

  • What should ::Memory .+ ::Memory return?
  • What parts should Vector implement that Memory does not; is the only difference between Vector and Memory that Vector has size modifying functions?
  • Should ForwardDiff.gradient now return Memory (but ForwardDiff.hessian still has to return a Matrix)? The argumentation above ("it is much more descriptive of a mathematical vector than Vector") seems to suggest it should.
  • Can you pass Memory to OpenBLAS?
  • Will almost everything that now takes an Vector start taking Union{Vector, Memory}? Should that apply to the ecosystem at large?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only to answer the last question, which may answer most of the others, they should be taking a DenseArray already, so no changes are needed for them. (Except that reminds me that this should really be a DenseVector, and not merely an AbstractVector. Usually I don't bother to express that level of granularity in array APIs, so it had slipped my mind.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which may answer most of the others

I don't really see how the other questions are answered from this.

@mikmoore
Copy link
Contributor

Sorry for the bikeshedding, but given that we already have RefArray and RefValue, RefMemory seems more consistent than the proposed MemoryRef. Although I'll admit that I like XyzRef better than RefXyz in a vacuum. In any case, very exciting!

@JeffBezanson
Copy link
Member

Sorry for the bikeshedding, but given that we already have RefArray and RefValue, RefMemory seems more consistent than the proposed MemoryRef. Although I'll admit that I like XyzRef better than RefXyz in a vacuum. In any case, very exciting!

That's a good point, but I reason that RefArray and RefValue are not used directly very often and MemoryRef serves a slightly different purpose, plus I agree is generally a better name. Although like RefArray a MemoryRef points to an element, its real purpose is to capture a bounds-checked index.

@Seelengrab
Copy link
Contributor

Some questions that came up while reading the design document - documenting here so they're not lost to the triage hole:

  • How is a :const memory going to be initialized by a user? In embedded, it's useful & common to have some global immutable constant that's already initialized at link time.
  • Will the memory model of atomics be documented together with the documentation of this atomic memory? See also Document memory model of Per-field atomics #46739
  • Could this support address spaces from LLVM? (I think we talked about this a bit in the context of GPU & embedded, where we'd otherwise have to reimplement a whole bunch of stuff)
  • Can I take a view of a GenericMemory? (Does that even make sense..?)

@oscardssmith
Copy link
Member Author

  1. this would probably require a special constructor that takes a function that takes an Int and produces a T (or we could make a copy constructor from a non-const Memory
  2. it really should be
  3. yes. A 1 element view of a GenericMemory is a MemoryRef. A multi-element view of a GenericMemory is another GenericMemory owned by the parent GenericMemory (see jl_genericmemory_slice)

@Tokazama
Copy link
Contributor

I'm trying to help where I can with review and documentation here, but it's a lot to sift through. Do we still need to bike-shed the resize/grow/extend stuff, or is there something else more helpful you'd like more eyes on?

@vtjnash
Copy link
Member

vtjnash commented Sep 19, 2023

We are slowly cranking through the todo list items listed in the commit comments (like making sure everything is documented and tested), and making sure all of the tests continue to pass. If people want to help, they are welcome to claim items via Slack for real-time collaboration on them

base/essentials.jl Outdated Show resolved Hide resolved
base/genericmemory.jl Outdated Show resolved Hide resolved
base/Base.jl Outdated
const DL_LOAD_PATH = String[]
let os = ccall(:jl_get_UNAME, Any, ())
if os === :Darwin || os === :Apple
if Base.DARWIN_FRAMEWORK

This comment was marked as resolved.

@aviatesk aviatesk self-assigned this Sep 21, 2023
base/Base.jl Outdated
const _included_files = Array{Tuple{Module,String},1}(Core.undef, 1)
# start this big so that we don't have to resize before we have defined how to grow an array
const _included_files = Array{Tuple{Module,String},1}(Core.undef, 50)
_included_files.size = (1,)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small thing — perhaps we should specialize getproperty to make Array look fully opaque from the REPL (and require get/set field for this kind of internals muckery). Given other languages, arr.size is a really tempting footgun.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah that's a good point. As nice as it made coding these, they should probably require a setfield

Copy link
Contributor

@Seelengrab Seelengrab Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a name other than size could serve an additional purpose, to further reduce "just access this"? Something like dims_tuple?

Maybe not a good idea..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does setfield has to do with REPL autocompletions?

And what's wrong with Matt's original suggestion again?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what I mean is that if we overload getproperty (and presumably setproperty) you would need setfield to modify the size. I was agreeing with the original suggestion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it enough to overload propertynames which the REPL use for autocomplete?

base/array.jl Show resolved Hide resolved
@vtjnash vtjnash force-pushed the jn/MemoryT branch 2 times, most recently from 3f2138d to 80a67bc Compare September 25, 2023 15:37
@aviatesk aviatesk removed their assignment Sep 25, 2023
@vtjnash vtjnash force-pushed the jn/MemoryT branch 2 times, most recently from b524983 to e89d249 Compare September 25, 2023 23:26
vchuravy pushed a commit that referenced this pull request Oct 28, 2023
This was recently refactored in #51319 and while trying to adapt
`libcxxwrap-julia` to the new changes I got:
```
ld: CMakeFiles/test_module.dir/test_module.cpp.o: in function `jl_datatype_layout':
test_module.cpp:(.text+0x2c2): undefined reference to `jl_unwrap_unionall'
```

The function `jl_datatype_layout` is `STATIC_INLINE` in `julia.h`:

https://github.com/JuliaLang/julia/blob/98542d748540e2390d6222282749c7dd534544da/src/julia.h#L1304-L1312
`jl_unwrap_unionall` is marked `JL_DLLEXPORT` here but it doesn't appear
in the exports of `libjulia`.
@fingolfin
Copy link
Contributor

Seems like a great improvement, but unfortunately it also broke the Julia C API. As a consequence CxxWrap (and probably a few other things) don't work anymore and need to be fixed. @benlorenz started to work on fixing it, see JuliaInterop/libcxxwrap-julia#137, but it's not yet working.

To a degree, I understand why API breakage keeps happening, but it'd be nice if we could slowly over time start to stabilize some APIs and make some promises about keeping them around a bit longer -- APIs for accessing arrays seem like a prime candidate for that.

Otherwise, certain packages which are not very actively maintained will just rot away over time, I am afraid.

@giordano giordano removed the merge me PR is reviewed. Merge when all tests are passing label Oct 30, 2023
yuyichao added a commit to JuliaPy/PyCall.jl that referenced this pull request Oct 30, 2023
`cconvert` does not return an array anymore and cannot be used with `reinterpret`.
Fix to use the underlying `transcode` function directly,
which is also consistent with the `Cstring` version.
stevengj pushed a commit to JuliaPy/PyCall.jl that referenced this pull request Oct 30, 2023
* Fix breakage due to JuliaLang/julia#51319

`cconvert` does not return an array anymore and cannot be used with `reinterpret`.
Fix to use the underlying `transcode` function directly,
which is also consistent with the `Cstring` version.

* Fix doc test
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Oct 31, 2023
Drvi added a commit to RelationalAI/julia that referenced this pull request Nov 1, 2023
maleadt pushed a commit that referenced this pull request Dec 4, 2023
@knuesel knuesel mentioned this pull request Feb 22, 2024
fatteneder added a commit to fatteneder/julia that referenced this pull request Mar 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

push!/pop! always does a ccall