Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix range bugs for matrix multiply/solve #13833

Merged
merged 1 commit into from
Nov 12, 2015

Conversation

artkuo
Copy link
Contributor

@artkuo artkuo commented Oct 30, 2015

This is a combined bug fix and workaround for ranges not behaving properly when pre-multiplied by a matrix. In the user forum an error was reported for A*r where A is a matrix and r is a range, which should just like a vector. The fix is to define matrix multiplication by a Range as including a collect first.

A second problem is that A\r can fail, with the workaround also being to do a collect on r. I call this a workaround as opposed to a bug fix, because the problem may come from intricacies of convert, and not necessarily a failure to collect, as A\r doesn't fail universally but only for some/most combinations of types for A and r. I didn't trace down far enough to figure that out, and someone more knowledgeable might know whether there is a deeper bug lurking somewhere. Nevertheless this PR seems to work and pass all tests including the added ones.

Finally, I left the definitions very broad here with AbstractArray. I would appreciate advice if these should be something more specific such as AbstractMatrix or such. My thinking was that the details on types get figured out by matmul.jl so there's no need to deal with it beforehand, but I'm not sure if this could potentially slow things down.

@@ -633,6 +633,10 @@ end
./(r::FloatRange, x::Real) = FloatRange(r.start/x, r.step/x, r.len, r.divisor)
./(r::LinSpace, x::Real) = LinSpace(r.start / x, r.stop / x, r.len, r.divisor)

# Matrix multiplication/solve on ranges, A*r should treat r as collected vector
*(A::AbstractArray, r::Range) = A*collect(r)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just AbstractMatrix

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, will do.

@artkuo
Copy link
Contributor Author

artkuo commented Oct 30, 2015

Switched to AbstractMatrix (actually more verbosely with AbstractArray because AbstractMatrix isn't defined until later). There was a warning this was ambiguous a similar definition in bidiagonal, which was using AbstractVecOrMat. Looking around the other LinAlg code, I suspect bidiagonal should be using StridedVecOrMat, so I am provisionally changing that; it doesn't break any tests and gets rid of the ambiguity. (I hate to meddle with unrelated code and am unsure whether the ambiguity actually causes problems, but this makes bidiagonal consistent with the other structured matrices, which all define backslash solve with StridedVecOrMat.

Previous commit caused Travis to break. Not sure why, as it said the Mac environment failed, and I can't reproduce it on my Mac, which is where everything had passed tests to begin with.

@kshyatt kshyatt added the domain:linear algebra Linear algebra label Oct 30, 2015
@artkuo
Copy link
Contributor Author

artkuo commented Nov 1, 2015

It turns out I had a poor choice of matrix for unit test; changed to something simpler and now everything works. Rebased for clean commit.

@andreasnoack
Copy link
Member

I think the deeper problems are different for * and \. For *, the problem is that the intermediate method LinAlg.gemv! is restricted to StridedVector even though generic_matvecmul! is not. For \, the issue might be that the return array is not initiated with similar so it might not be mutable. I'd be okay with merging the current pr and revisit later, but if you feel for it, you can also try to look into fixing the ssues mentioned above.

@artkuo
Copy link
Contributor Author

artkuo commented Nov 2, 2015

It seems reasonable for LinAlg.gemv! to restrict to StridedVector, because the method is hoping for strides, even if it can drop down to AbstractVector. So my interpretation is there may be a method missing from A_mul_B! that properly handles the abstract case. As for \, the issue does indeed seem to be with mutability, where it uses convert which doesn't necessarily do a similar.

Meanwhile, I don't think there is a rush to fix this, because the person who reported it was complaining about having to do collect and therefore has already worked around it. I hope to report back in a day or two.

@andreasnoack
Copy link
Member

@artkuo Any news on a general fix of this? If not, I think we can just merge this for now.

@artkuo
Copy link
Contributor Author

artkuo commented Nov 10, 2015

I am close to having a more correct fix, but it has become complicated. As you predicted, both multiply and divide have deeper issues. These are, however, somewhat delicate to deal with. There are cases where similar or copy are used with the intention of writing to the output (e.g. for BLAS), but both sometimes return an immutable, for example for range or structured matrix input. I have fixes for this, but I fear they may be controversial. For example, I want to change similar so that it acts consistently across structured matrices (it currently doesn't, nor is that fact documented or tested #13731). I also want to introduce a mutablecopy which acts like copy except always returns something that can be written to. Both of these can potentially be controversial, as I don't know some of the deeper intentions and history behind things.

Also, one of the problems is missing fall-back for A_mul_B! for abstract vector and matrices. That would be easy to fix, except for an issue where methods don't dispatch properly, traceable to a bug dispatching unions like AbstractVecOrMat #5384. The solution is to unroll the unions where they cause conflicts (usually in structured matrices), and replace with separate method calls. This unfortunately messes with someone's working code, not because it is wrong per se, but to work around that dispatch issue.

My ultimate fix is something of an omnibus. It deals with a bunch of these issues at once, and I need to write some more tests against side-effects like accidentally messing up sparse methods. I have been toying with breaking it into multiple, bite-size PRs, but it's a bit complicated because some things overlap, so I can't necessarily find an easy way to make the commits fully independent. Omnibus is easier for me to put together, but almost certainly harder for the community to digest, and at greater risk for rejection.

It may be sensible to merge the present PR, with the understanding that it's temporary. It's actually not a terrible fix, but it leaves some other bugs unaddressed. If/when the "real" fix is performed, then the collect here will be reverted. Please feel free to merge if you think that's okay. I'd also appreciate any advice on how best to deal with the omnibus.

@tkelman
Copy link
Contributor

tkelman commented Nov 10, 2015

I have been toying with breaking it into multiple, bite-size PRs

Please do this. If you need design guidance on how to split things up, perhaps a new issue with links to the prospective branch would be best so you can get feedback by a means other than opening an overly-large PR.

@artkuo
Copy link
Contributor Author

artkuo commented Nov 11, 2015

Okay I'll start with a small one on similar and see how it goes. I'll try to do this in serial order and not cause too much confusion. This may take a while, so you may wish to merge the present PR as a temporary band-aid.

@andreasnoack
Copy link
Member

Merging this for now as it fixes the problem. Looking forward to the more general fix.

andreasnoack added a commit that referenced this pull request Nov 12, 2015
fix range bugs for matrix multiply/solve
@andreasnoack andreasnoack merged commit 0d2ea5c into JuliaLang:master Nov 12, 2015
@artkuo artkuo deleted the rangemulsol branch November 12, 2015 19:54
@ScottPJones
Copy link
Contributor

This seems to cause a problem when building Julia v0.5:

WARNING: New definition 
    \(Base.LinAlg.Diagonal, AbstractArray{T<:Any, 1}) at linalg/diagonal.jl:170
is ambiguous with: 
    \(AbstractArray{#T<:Any, 2}, Base.Range) at range.jl:652.
To fix, define 
    \(Base.LinAlg.Diagonal{#T<:Any}, Base.Range)
before the new definition.

@artkuo artkuo restored the rangemulsol branch November 13, 2015 00:12
@artkuo
Copy link
Contributor Author

artkuo commented Nov 13, 2015

I wasn't sure to make of this ambiguity, because the evaluation D\r still works correctly. I could eliminate the warning by defining a new method in diagonal.jl for Diagonal \ Range. I didn't like that solution because it looked ugly to have Range appear in an otherwise clean file for Diagonal. In any case, this is a temporary fix, so I'm not sure which is worse. Live with the annoying warning, or add an annoying, seemingly mis-placed line to get rid of it? Or this PR could be rolled back, as the bug is relatively minor, and I hope to supply a more correct fix soon.

@andreasnoack
Copy link
Member

There should be time left in this release cycle to find a more general fix and the users who complained about the range issue should and probably will stay on 0.4 meanwhile. Therefore, I think we should just revert this commit and wait for your more general fix.

@artkuo
Copy link
Contributor Author

artkuo commented Nov 13, 2015

Sounds good, the fix for \ is actually easier than for *, so I'll get to \ first.

@artkuo artkuo deleted the rangemulsol branch November 13, 2015 02:00
@andreasnoack
Copy link
Member

Reverted in 7cda13f

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants