Support matrix multiplication (Continue #93) #146

Open: wants to merge 2 commits into master

Conversation

@jishnub (Member) commented Sep 14, 2020

This PR continues #93, along the lines of strategy 2 mentioned there, i.e. the axes of the two terms have to be compatible. The following are now possible:

julia> a = [1 2; 3 4];

julia> oa = OffsetArray(a, (2, 2))
2×2 OffsetArray(::Array{Int64,2}, 3:4, 3:4) with eltype Int64 with indices 3:4×3:4:
 1  2
 3  4

julia> oa * oa
2×2 OffsetArray(::Array{Int64,2}, 3:4, 3:4) with eltype Int64 with indices 3:4×3:4:
  7  10
 15  22

julia> v = [5, 6];

julia> ov = OffsetVector(v, (2,));

julia> oa * ov
2-element OffsetArray(::Array{Int64,1}, 3:4) with eltype Int64 with indices 3:4:
 17
 39

julia> oa2 = OffsetArray(a, (0, 0));

julia> oa2 * a
2×2 Array{Int64,2}:
  7  10
 15  22

julia> oa2 * v
2-element OffsetArray(::Array{Int64,1}, 1:2) with eltype Int64 with indices 1:2:
 17
 39

julia> v' * oa2
1×2 Adjoint{Int64,Array{Int64,1}}:
 23  34

julia> ov' * ov
61

Still WIP, doesn't play well with Adjoint and Transpose yet.

To-do:

  • oa' * ov
  • oa' * oa

Solving these might be possible only with an explicit dependence on LinearAlgebra. I guess that's ok?

This is actually somewhat less ambitious than the original PR, as products with arbitrary matrices are not handled here and default to LinearAlgebra.
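
For illustration, a minimal sketch of the kind of method the Adjoint to-do items above would need once LinearAlgebra is an explicit dependency (not part of this PR; the method body and error message are assumptions):

using LinearAlgebra, OffsetArrays

# hypothetical sketch for oa' * ov: check that the axes are compatible,
# multiply the 1-based parents, and re-wrap with the adjoint's first axis
function Base.:*(At::Adjoint{<:Any,<:OffsetArray{<:Any,2}}, v::OffsetVector)
    axes(At, 2) == axes(v, 1) || throw(DimensionMismatch("axes of the two terms must be compatible"))
    OffsetArray(parent(parent(At))' * parent(v), axes(At, 1))
end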

@codecov (bot) commented Sep 14, 2020

Codecov Report

Merging #146 into master will increase coverage by 0.63%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #146      +/-   ##
==========================================
+ Coverage   93.60%   94.23%   +0.63%     
==========================================
  Files           2        2              
  Lines         250      260      +10     
==========================================
+ Hits          234      245      +11     
+ Misses         16       15       -1     
Impacted Files          Coverage Δ
src/OffsetArrays.jl     94.90% <100.00%> (+0.73%) ⬆️


@johnnychen94 (Member) left a comment

The additional overhead comes from #144; I think we may need a way to opt out of the overflow check.

> Solving these might be possible only with an explicit dependence on LinearAlgebra. I guess that's ok?

This looks okay to me.

(Two review comments on src/OffsetArrays.jl, now outdated and resolved.)
Ryan Bennink and others added 2 commits September 15, 2020 09:37

New methods for * for combinations of OffsetArray and AbstractArray
- Unfortunately this introduces method ambiguities with Base.
- As an alternative, tried defining a new promote_rule, but somehow it doesn't get invoked.

New convenience constructors
- Wrap any (non-offset) array as an OffsetArray with standard indices
- Remove layer of indirection when constructing with explicit offsets
- Cosmetic edits to constructor section for improved readability
@jishnub (Member, Author) commented Sep 15, 2020

I'm starting to think that we should not solve the matrix multiplications involving Adjoint and Transpose here. These should really be solved in LinearAlgebra; the reason some of these already work is that LinearAlgebra supports them.

In any case, I'm not too sure how to approach arbitrarily nested OffsetArrays with respect to dispatch.
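
To illustrate the nesting problem (purely illustrative, not from this PR): the same data can carry the offset wrapper inside or outside the lazy Adjoint wrapper, and each spelling has a different type.

using LinearAlgebra, OffsetArrays

A = [1 2; 3 4]
x = OffsetArray(A', 2, 2)   # offset wrapper outside the Adjoint wrapper
y = OffsetArray(A, 2, 2)'   # Adjoint wrapper outside the offset wrapper
# both describe offset adjoint-like data, but they need separate methods,
# and deeper nesting multiplies the number of cases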

@johnnychen94 (Member) commented Sep 15, 2020

I have the same feeling that this kind of support (including the methods you've added) should be done upstream, in Base, LinearAlgebra, or so. Supporting these operations in OffsetArrays is like whack-a-mole.

Would like to get @mcabbott involved for some advice

@mcabbott

Yes, it would be nice if this worked. My impression is that * and mul! are teetering right on the edge of what's possible with multiple dispatch without going crazy, so trying to make this work by adding methods might be too fragile. I haven't looked, but could it be done somewhere lower down? Keep everything generic, down to similar(..., axes(A,1), ...), and then check once somewhere inside mul! before passing on down to gemm! and friends?

@johnnychen94 (Member) commented Sep 15, 2020

I think what's missing here is a protocol in LinearAlgebra to generically deal with arrays with offsets. Currently it only disables that ability.

The one I just made is:

# LinearAlgebra
offsetted(A::AbstractArray, offsets) = A
unoffsetted(A::AbstractArray) = A, ntuple(_->0, ndims(A))

# OffsetArrays

# this isn't right because it overrides the generic definition above
offsetted(A::AbstractArray{T, N}, offsets) where {T, N} = OffsetArray{T, N, typeof(A)}(A, offsets)
unoffsetted(A::OffsetArray) = A.parent, A.offsets

The name is perhaps not good, but you get the idea.

As a simple experiment for 2×2 matrices:

# https://github.com/JuliaLang/julia/blob/8bdf5695e25e77d53f8934d9f135858d174e30d1/stdlib/LinearAlgebra/src/matmul.jl#L914-L950

+ offsetted(A::AbstractArray, offsets) = A
+ unoffsetted(A::AbstractArray) = A, ntuple(_->0, ndims(A))
+ 
 function matmul2x2!(C::AbstractMatrix, tA, tB, A::AbstractMatrix, B::AbstractMatrix,
                     _add::MulAddMul = MulAddMul())
+     A, Ao = unoffsetted(A)
+     B, Bo = unoffsetted(B)
+ 
+     if !(Ao[2] == Bo[1])
+         throw(ArgumentError("offset doesn't match"))
+     end
-  require_one_based_indexing(C, A, B)

 ...
-      C
+     offsetted(C, (Ao[1], Bo[2]))
end

# OffsetArrays
+ import LinearAlgebra
+ LinearAlgebra.offsetted(A::AbstractArray{T, N}, offsets) where {T, N} = OffsetArray{T, N, typeof(A)}(A, offsets)
+ LinearAlgebra.unoffsetted(A::OffsetArray) = A.parent, A.offsets

and it seems to work pretty well:

julia> ao = ones(0:1, 0:1)
2×2 OffsetArray(::Matrix{Float64}, 0:1, 0:1) with eltype Float64 with indices 0:1×0:1:
 1.0  1.0
 1.0  1.0

julia> ao * ao
2×2 OffsetArray(::Matrix{Float64}, 0:1, 0:1) with eltype Float64 with indices 0:1×0:1:
 2.0  2.0
 2.0  2.0

I'm not very sure whether this goes to the BLAS call for large sizes, so performance might be an issue. This is just a simple experiment; I haven't checked that yet.

Any ideas on how to properly dispatch the offsetted protocol?

@mcabbott

Can't this be simpler, though? mul! simply has to check that axes(A,2) == axes(B,1) etc., and then strip everything off (I guess with some unoffset function which OffsetArrays can extend), before proceeding as usual. No problem if only A is an OffsetArray, as long as OneTo(10) == 1:10 etc.

And * makes a new array before calling mul!, which it can do with something like C = similar(A, axes(A,1), axes(B,2)) etc., instead of size. This results in C::OffsetArray, which is sure to pass mul!'s checks. It never has to put a wrapper back on the result.

At least for ordinary Arrays; I guess StaticArrays's * doesn't go to mul!. Not sure about sparse.
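
A hedged sketch of that plan, written as a free-standing helper rather than actual LinearAlgebra code (the helper name offset_matmul is made up; OffsetArrays.no_offset_view stands in for the "strip everything off" step):

using LinearAlgebra, OffsetArrays

function offset_matmul(A::AbstractMatrix, B::AbstractMatrix)
    # check the inner axes, not just the sizes
    axes(A, 2) == axes(B, 1) || throw(DimensionMismatch("axes of A and B do not match"))
    # similar with axes instead of size: the result is an OffsetArray whenever the inputs are
    C = similar(A, promote_type(eltype(A), eltype(B)), (axes(A, 1), axes(B, 2)))
    # strip any offsets (a no-op for plain arrays) and let the usual 1-based kernels run
    mul!(OffsetArrays.no_offset_view(C),
         OffsetArrays.no_offset_view(A),
         OffsetArrays.no_offset_view(B))
    return C
end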

@johnnychen94 (Member) commented Sep 16, 2020

A lot of code in LinearAlgebra presumes 1-based indexing. Only stripping the offsets might not be sufficient, and making things correct could require a lot more effort.

For example,

# https://github.com/JuliaLang/julia/blob/8bdf5695e25e77d53f8934d9f135858d174e30d1/stdlib/LinearAlgebra/src/matmul.jl#L927-L928
        A11 = copy(A[1,1]'); A12 = copy(A[2,1]')
        A21 = copy(A[1,2]'); A22 = copy(A[2,2]')
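
For comparison, a hedged sketch of how that excerpt could be generalized, replacing the literal indices 1 and 2 with the endpoints of A's axes (illustration only, not code from the PR or from LinearAlgebra):

# destructure each two-element axis into its first and last index
i1, i2 = axes(A, 1)
j1, j2 = axes(A, 2)
A11 = copy(A[i1, j1]'); A12 = copy(A[i2, j1]')
A21 = copy(A[i1, j2]'); A22 = copy(A[i2, j2]')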

@timholy (Member) commented Sep 19, 2020

I really don't think this should be fixed here. It just needs someone to go through LinearAlgebra and start generalizing the 1-based assumptions. I'm not saying that's a small job (there's a reason it hasn't been done yet), but it's the only reasonable path to get to where we want to be.

@mcabbott

Here's a sketch of what I was trying to say above:

https://gist.github.com/mcabbott/03225572217660aa7694f46caaa6e15f

There aren't so many calls to similar which need to be changed from size to axes. Anything with offsets will miss the StridedMatrix etc. methods of mul!, and can be caught just before it goes to generic_matmatmul!. OffsetArrays would just need to overload one function, remove_offset_indexing, and of course similar. I presume the if Base.has_offset_axes(C, A, B) check will get compiled away.

Is there a flaw in this plan?
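
For concreteness, the OffsetArrays side of that sketch would presumably be a single method (hypothetical, following the gist; it assumes remove_offset_indexing lives in LinearAlgebra with an AbstractArray fallback that returns its argument):

# hypothetical glue in OffsetArrays
import LinearAlgebra: remove_offset_indexing

# return an equivalent 1-based view of the data; no_offset_view already does this
remove_offset_indexing(A::OffsetArray) = OffsetArrays.no_offset_view(A)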

@timholy (Member) commented Sep 19, 2020

That's not bad; it's certainly more generic than having each package that uses offset axes implement the same linear algebra functions over and over again. But fundamentally I don't see a need to "strip" unconventional axes; indeed, some day (perhaps when Array gets rewritten in pure Julia) maybe even Julia's own default arrays will support offset axes. In such circumstances there may not be an outer container to strip away: in that case https://gist.github.com/mcabbott/03225572217660aa7694f46caaa6e15f#file-offset_matmul-jl-L71 may return arrays of the same type as the inputs, and then how do you avoid a StackOverflowError?

To me it seems the sensible thing to do is start going through the multiplication routines and generalize them to use firstindex rather than 1, and lastindex instead of size(A, n). Then you're designing algorithms based on behavior rather than on types, which is usually more generalizable across changes in implementation elsewhere in the ecosystem.
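
As a hedged illustration of that style of generalization (a toy kernel, not LinearAlgebra code): iterate over axes instead of 1:size(A, n), and the same loop works for any offset convention.

function generic_matvec!(y::AbstractVector, A::AbstractMatrix, x::AbstractVector)
    # behavior-based checks: compare axes rather than sizes
    axes(A, 2) == axes(x, 1) || throw(DimensionMismatch("axes of A and x do not match"))
    axes(A, 1) == axes(y, 1) || throw(DimensionMismatch("axes of A and y do not match"))
    fill!(y, zero(eltype(y)))
    for j in axes(A, 2), i in axes(A, 1)
        y[i] += A[i, j] * x[j]
    end
    return y
end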

@mcabbott commented Sep 19, 2020

Thanks for having a look. I guess this is an attempt to change as little as possible. It wouldn't preclude doing a better matmul3x3! later... but wouldn't you still need something like this unwrapping story before handing pointers to gemm!, for the cases where you can?

The contract for remove_offset_indexing is that it has to return an array for which has_offset_axes is false. It can come with a warning sign... But as long as you don't define a method for this, it should fall back to the same test and the same error as you get now:

using AxisArrays # a wrapper with no remove_offset_indexing method
aa = AxisArray(rand(2,2), x='a':'b', y=10:11)
@which aa' * aa # AbstractArray, no special methods, no special axes

ao = AxisArray(zeros(0:1, 3:4), x='a':'b', y=10:11) # with indices 0:1×3:4
ao' * ao # ArgumentError: offset arrays are not supported

It's a bit awkward that there's no simple way here for the OffsetArrays method to be used when OffsetArray is not the outermost wrapper. Maybe there's a better design?

If mul! called something like remove_offsets(A, axes(A)), then perhaps OffsetArrays could instead overload remove_offsets(::Any, ::Tuple{IdOffsetRange}) and apply the inverse wrapper... that seems a bit crazy, and won't dispatch to ::StridedMatrix etc.

@timholy (Member) commented Sep 19, 2020

> unwrapping story before handing pointers to gemm!, for cases when you can?

Since gemm! only needs pointers, you still don't have to explicitly strip the axes, as long as there is an appropriate pointer method.

@mcabbott

I guess I was thinking of the pointer as being the maximally unwrapped form, just the memory (& the eltype). Before you pass that along, you'd need some logic to be sure they are all in the same convention. I suppose that is why pointer isn't defined for OffsetArrays, nor for Adjoints of complex arrays?

@timholy (Member) commented Sep 20, 2020

Well, mostly I think there hasn't been any reason to have it so far...because matrix multiplication hasn't been supported.

But again, "maximally unwrapped" is a property of our current implementations, not a generic property.

struct ZeroIndexedSVector{L,T} <: AbstractVector{T}
    data::RawBuffer{UInt8}                # RawBuffer needs to be implemented in Julia (gc management) but is coming some day...
end
Base.axes(::ZeroIndexedSVector{L}) where L = (ZeroRange(L),)  # https://github.com/JuliaArrays/CustomUnitRanges.jl/blob/master/src/ZeroRange.jl

does not have a wrapper you can strip---instead you have to wrap it with something else.

@mcabbott

OK, good example. It's a bit ugly if remove_offset_indexing(::ZeroIndexedSVector) adds an OffsetArray wrapper, just to feed the existing generic_matmul! something it understands.
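
Concretely, that "add a wrapper" branch might look like this (purely illustrative, reusing the hypothetical type above):

# an array with offset axes but no parent to unwrap has to be re-wrapped instead,
# shifting its 0:L-1 indices back to 1:L
remove_offset_indexing(A::ZeroIndexedSVector) = OffsetArray(A, 1)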

The other objection to this design is that, with my AxisArray example above, it ought to be able to tell what to do just from seeing Base.axes(ao). If mul! knows how to send OffsetArrays to gemm!, then it should know here. But I don't see a simple way to make that work without unwrapping. You'd need something like pointer and strides to be defined for the wrapper type, so the package still has to co-operate. And you'd seemingly have to re-write all the code which decides whether or not to call BLAS, which right now dispatches on StridedMatrix etc. I guess this leads to JuliaLang/julia#25558 (ArrayLayouts.jl) and JuliaLang/julia#30432 (strides) etc.

@timholy (Member) commented Sep 20, 2020

pointer and strides can be implemented. Often both strided and non-strided arrays can be wrapped by any given type, but if you defer to the child's implementation then it all works out naturally. A good example is SubArray: strides(V.parent) might throw an error, if the parent itself isn't strided. OffsetArrays can mimic this design.
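
A hedged sketch of that SubArray-style deferral for OffsetArray (not necessarily how the package would define it):

# defer to the parent; if the parent isn't strided, these throw, which is the desired behavior
Base.strides(A::OffsetArray) = strides(parent(A))
Base.pointer(A::OffsetArray) = pointer(parent(A))
Base.unsafe_convert(::Type{Ptr{T}}, A::OffsetArray{T}) where {T} = Base.unsafe_convert(Ptr{T}, parent(A))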

@mcabbott

Yes to strides-of-the-parent, and JuliaLang/julia#30432 seems like a very neat mechanism to let this pass information about non-stridedness too.

Not sure I'm volunteering to re-write mul! in terms of that just yet though! But xref FluxML/NNlib.jl#191 which is an attempt to do this for batched_mul!.

@simeonschaub

What's the current state of this? It would be really nice if matrix multiply just worked with offset SArrays. Does this actually break cases that worked before?

@timholy (Member) commented Nov 3, 2020

I'm pretty firmly of the opinion that most of the work needs to be done in LinearAlgebra and/or Base, not here. I know folks probably think the magic of OffsetArrays mostly happens here, but that's not really true; most of it lives in the generic-indexing support in Base (plus about a gazillion bugfix PRs submitted by lots of other people since then).

@simeonschaub

Oh yes, I definitely agree with you! This PR in its current form looks like a worthwhile improvement, at least to me, though I might have missed something in the discussion. Or does this cause method ambiguities, even without trying to support Transpose and Adjoint?

@timholy (Member) commented Nov 4, 2020

If it doesn't break anything now, and its later removal doesn't break anything later, then we could use this as a bandaid. It's a bandaid that may slow progress on the right solution, though. So it's worth asking whether this PR is in our best interests, long-term.

I won't stand in the way if folks want to merge this.

@johnnychen94 (Member)

Adding such a bandaid can easily cause unexpected performance regressions for some BLAS operations if there aren't enough specializations. The generic fallback implementations are usually quite slow, and identifying the resulting performance regression is usually not a trivial task; I'd rather avoid this whack-a-mole patch.
