-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite for julia 0.7 #52
Conversation
Since this is going to be a breaking change, let me be more explicit about what's going on here. Prior to julia 0.7, it's approximately true to say that the only kind of Unfortunately these types came with some (a few-fold) performance hit; the reason for the benchmarks as part of the automatic tests in this package was to ensure that this hit didn't become too large, and in particular that some hard work I'd done on optimization when I first wrote this package didn't decay due to later changes. But I also had a second strategy: in some cases, one could safely use With JuliaLang/julia#23750, all of these considerations changed. Currently in julia 0.7, The consequences for this package are quite interesting. It no longer makes sense to have our own Since we no longer have specialized types, it makes much more sense for Aside from being busy with changes to Julia itself in preparation for 0.7/1.0, the main reason I haven't merged this yet is that unfortunately, the performance hit seems much worse: at least one of the benchmarks is now something like a 20x when formerly it was (IIRC) something like 3x. Moreover, it doesn't seem possible to avoid a performance hit because I am hopeful that before or around the time of the 1.0 release, that performance hit will be fixed. It's something I'm a bit concerned about, but I also firmly believe there are much bigger priorities right now. Still, I'll CC @Keno so he's aware of the considerations. |
I intend to have the performance fixed by the time 0.7 is released. The
performance was optimal at some point on my branch, but something clearly
regressed. Just busy with semantic changes at the moment.
On Dec 27, 2017 13:18, "Tim Holy" <[email protected]> wrote:
Since this is going to be a breaking change, let me be more explicit about
what's going on here.
Prior to julia 0.7, it's approximately true to say that the only kind of
AbstractArray that one could call reinterpret on was Array. The reason why
is that reinterpret was effectively a "pointer cast" operation like in
C---it just provided a different view on a contiguous block of memory. But
in Julia there's all kinds of cool stuff that can be done with a whole zoo
of AbstractArrays that are not Arrays and cannot be converted to one
without copying data. In order to support lazy operations on any
AbstractArray, this package defined new ColorView and ChannelView
AbstractArray types, which conceptually did the same work as reinterpret
but worked for any AbstractArray.
Unfortunately these types came with some (a few-fold) performance hit; the
reason for the benchmarks
<https://github.com/JuliaImages/ImageCore.jl/blob/master/test/benchmarks.jl>
as part of the automatic tests in this package was to ensure that this hit
didn't become too large, and in particular that some hard work I'd done on
optimization when I first wrote this package didn't decay due to later
changes. But I also had a second strategy: in some cases, one *could*
safely use reinterpret to perform this operation, and when applicable that
*didn't* come with a performance hit. The lowercase variants, colorview and
channelview, used reinterpret when they could (returning another Array) and
otherwise created ColorView and ChannelView wrappers.
With JuliaLang/julia#23750 <JuliaLang/julia#23750>,
all of these considerations changed. Currently in julia 0.7, reinterpret
*always* returns a ReinterpretArray, a lazy-wrapper that generalizes the
strategy I pursued with ColorView and ChannelView, making it possible to
call reinterpret on any AbstractArray. While greatly expanding the scope of
reinterpret, it simultaneously deprecated a previous capability of
reinterpret to control the shape (aka dimensions or size) of the resulting
object. (Now you call reshape(reinterpret(T, A), sz) to achieve the same
outcome.) IMO this is also a good thing, because it makes Julia's design
more orthogonal.
The consequences for this package are quite interesting. It no longer makes
sense to have our own ColorView and ChannelView types, instead we should
just use reinterpret, or rather reshape(reinterpret(...)) to also split
out/eliminate the color channel as a separate axis of the array. Moreover,
we can't (or perhaps, shouldn't) define our own specialization of
reinterpret that does the appropriate shaping automatically: any reinterpret
specialized on, say, AbstractArray{<:Colorant} would have to first call the
"plain" reinterpret on the same object, so that leads to a dispatch
problem. (We could solve it with invoke, and I'm open to changing this PR
to do so, but personally I don't favor this approach.) So instead this PR
defines a new function, reinterpretc, which combines reshape and reinterpret
appropriately for operations involving arrays of colors. The ColorView and
ChannelView types are deleted, but we still retain the colorview and
channelview lowercase variants as convenience operations. The way I've
implemented it, it's actually even more than "convenience": I've given them
some smarts, so that e.g., colorview(eltype(A), channelview(A)) will return
A itself rather than creating a wrapper-of-a-wrapper.
Since we no longer have specialized types, it makes much more sense for
similar to just create Arrays. This will fix #48
<#48>. Whew.
Aside from being busy with changes to Julia itself in preparation for
0.7/1.0, the main reason I haven't merged this yet is that unfortunately,
the performance hit seems much worse: at least one of the benchmarks is now
something like a 20x when formerly it was (IIRC) something like 3x.
Moreover, it doesn't seem possible to avoid a performance hit because
reinterpret always returns a ReinterpretArray. This was first reported in a
different context (JuliaLang/julia#25014
<JuliaLang/julia#25014>) but seems to affect this
package pretty severely.
I am hopeful that before or around the time of the 1.0 release, that
performance hit will be fixed. It's something I'm a bit concerned about,
but I also firmly believe there are much bigger priorities right now.
Still, I'll CC @Keno <https://github.com/keno> so he's aware of the
considerations.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#52 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABO1l9r6jc2TGPb0LiX47Vs7b9E9rbrUks5tEjVsgaJpZM4QiiJN>
.
|
I've looked at this with v0.7-beta. I suggest a few fixes: in in !Base.root_module_exists(:StatsBase) may be replaced by :StatsBase ∉ map(x->Symbol(string(x)),values(Base.loaded_modules)) or "StatsBase [2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91]" in Set((string(v) for v = keys(Base.loaded_modules))) (unless @StefanKarpinski has a better solution.)
With the above patches, (and after suppressing the tests of |
The strategy used here for ColorView and ChannelView has been implemented in a much more general way for ReinterpretArray. It makes sense to leverage the Base implementation. Plus, now that `reinterpret` is no longer restricted to `Array`, it turns out that we run into certain dispatch problems that stem from changing the dimensionality to handle/subsume color channels. The solution here is to introduce a `reinterpretc` that is specifically for handling colors. This definitely breaks old code.
OK, I've got the rest working locally. Will wait a bit for comments on the outstanding items above, but hopefully it will happen soon. |
This should be more performant, and now that we have a generic `reinterpret` we can cope with the consequences.
2916534
to
a167433
Compare
Now it's just a waiting game; most likely this will merge tomorrow morning. |
The final version of support for reinterpreting arrays with offset axes requires that the first axis starts with 1. Consequently we now reshape before we reinterpret. This led to a surprising number of cascading changes. Also fixes the Base._length deprecation.
a167433
to
20fb5b5
Compare
OSX and Windows failures are because the Julia version is too old (0.7.0-beta274 on OSX and 0.7.0-beta.275 or 266 on Windows, whereas I have this marked a minimum of 0.7.0-beta283; specifically JuliaLang/julia#28073 is missing). Given that my time is almost up (I'll be gone for two weeks), I'm going to merge and tag. |
The strategy used here for
ColorView
andChannelView
has been implemented in a much more general way forBase.ReinterpretArray
. It makes sense to leverage the Base implementation. Plus, now thatreinterpret
is no longer restricted toArray
, it turns out that we run into certain dispatch problems that stem from changing the dimensionality to handle/subsume color channels. The solution here is to introduce areinterpretc
that is specifically for handling colors. Consequently this is completely breaking.Not working fully yet, though lots of progress has been made.