Make LinSpace generic #18777

timholy · 2016-10-03T16:25:43Z

This is my proposed alternative to #18492 for fixing the issues with LinSpace. The main attractions of this PR are:

closes linspace(...)[1] can differ from linspace start value #14420
you can now use linspace with any type T for which (1-t)*a::T + t*b::T makes sense.

Given the large amount of work & discussion that has gone into linspace and friends with respect to numerical precision (at least #2333, #9637, #18492), it would be remiss not to comment (extensively 😉) on how this does and does not address previous concerns or implementations. Let me address the issues one-by-one.

Mind-reading

First and foremost, this implementation does not try to read your mind, whereas the implementation currently in master does. The easiest way to see what I mean is to compare master's behavior:

julia> r = linspace(0.0, 0.3, 4)
4-element LinSpace{Float64}:
 0.0,0.1,0.2,0.3

julia> r[2]
0.1

with the behavior of this PR:

julia> r = linspace(0, 0.3, 4)
4-element LinSpace{Float64}:
 0.0,0.1,0.2,0.3

julia> r[2]
0.09999999999999999

Note that, given the specified endpoints, that's the correct answer:

julia> 0.3/3
0.09999999999999999

julia> Float64(BigFloat(0.3)/3)
0.09999999999999999

The version in master returns 0.1 essentially because it interprets the right endpoint 0.3 as 3//10. This is what I refer to as mind-reading, and while it has some undeniable attractions, it's (1) only applicable to special types (e.g., Float32/64), (2) fairly fragile (it's the origin of #14420), and (3) arguably wrong even in principle (the user said 0.3, not 3//10).

Note that, now, people who want mind reading can be told to use Rationals directly:

julia> r = linspace(0, 3//10, 4)
4-element LinSpace{Rational{Int64}}:
 0//1,1//10,1//5,3//10

julia> r[2]
1//10

Precision

All that said, there's still a valid concern about precision: linear interpolation is famously subject to roundoff error. This implementation returns results with full numeric precision by using two calls to fma. The key function (which other types can specialize) is:

# High-precision interpolation. Accurate for t ∈ [0.5,1], so that 1-t is exact.
function lerpi{T}(j::Integer, d::Integer, a::T, b::T)
    t = j/d
    # computes (1-t)*a + t*b
    T(fma(t, b, fma(-t, a, a)))
end

As long as j >= d/2, it turns out you don't need to compute t to higher precision for this to return a numerically-precise result (as long as you use fma).

All the (new) tests compare the result against what happens when you take the two endpoints, convert them to BigFloats, perform all interpolation at BigFloat precision, and then cast back to the element type. That's what I mean by "full precision" results---there is no roundoff error to within the precision of the numerical type, as determined by the inputs upon construction.

Test changes and deletions

First, I have to say that whoever wrote the old tests (mostly Stefan?) did an amazing job. While my final implementation is dirt-simple, I confess it was not the first (or second, or third...) thing I tried; it was very demanding to find an implementation that could pass the "good" ones (spelled out below). The extensive tests saved me from many implementations that seemed equivalent but ran into trouble in corner cases. 💯

That said, anyone who looks at this PR will see many test changes and a few deletions, because the new implementation does not pass many of our old tests. For the vast majority of these I am unconcerned and changed them guilt-free, because most of these turned out to rely on mind-reading: in particular, they don't pass the "convert endpoints to BigFloat before performing operations" test. Even the so-called "additive identity" tests (e.g., (r - 1) + 1) fall in that category when combined with collect, due to the non-associativity of floating-point addition. (The old tests happened to pass for the provided inputs, but it doesn't take long at all to discover that tweaking the endpoints in those tests, in just about any way, breaks them. So we were testing for an "identity" that doesn't actually hold in general.)

However, a few of the test deletions make me sad or are worthy of public confession:

Behavior at u = eps(0.0): most tests pass, but ones that specifically test for generation of -0.0 don't. I'm basically unbothered by this---we have several concerns about -0.0 anyway (e.g., Should zero(::Float) = -0.0? #18341). If you think of it as a bug, it's probably best thought of as an fma bug, not a problem with this implementation of linspace.
linspace(-3u,3u,7) doesn't pass. I'd rather this one passed, because it's seemingly unrelated to nonsense like -0.0. I suspect this is an fma bug, so I kept the test and marked it @test_skip. Note this test passes if u = 2*eps(0.0), so the failure is specific to the smallest-magnitude floating point number.

Performance

I'm hoping that on machines with CPU-support for fma, this will be competitive with our current implementation. fma support was introduced around 2012-2013, so there many be a good number of machines that don't have such support. Without it, this ends up ccalling C code for the fma and is therefore glacially slow.

If we are concerned about performance on older machines, one potential way to fix it would be to Dekker-split the two endpoints upon construction; then you'd only have to split t at runtime, and perhaps along with some tweaks that take advantage of the symmetry of the problem you might be able to largely close the gap.

We might also consider adding a @fastmath specialization.

But before getting too speculative about performance, let's see what @nanosoldier runbenchmarks(ALL, vs=:master) thinks.

Future changes to Ranges

One source of concern is that LinSpace and FloatRange now have different behavior. I think the right way to handle this is to cause start:step:stop to return a LinSpace when any of those entries are AbstractFloats, and move FloatRange to a package. That way there won't be any inconsistencies. StepRange can still exist, only now (with this more generic LinSpace) I think it can safely be confined to Integer (thus fixing all kinds of problems like those I'm attempting to address in #18744).

timholy · 2016-10-03T16:30:57Z

Oops, put my quotes in the wrong place. @nanosoldier runbenchmarks(ALL, vs=:master).

yuyichao · 2016-10-03T16:33:31Z

I think nanosoldier runs on Haswell hardware. Does it compile julia with a generic arch that disables FMA or does it compile with native arch? @jrevels

timholy · 2016-10-03T16:38:00Z

Argh, a last-minute untested tweak introduced a build bug. I'll re-launch nanosoldier once CI passes.

jrevels · 2016-10-03T16:48:01Z

No, it doesn't disable FMA AFAIK. I'm assuming that would be an option/flag I'd have to enable before running make (which is all nanosoldier does).

Also, for when you relaunch: runbenchmarks(ALL, vs=:master) --> runbenchmarks(ALL, vs=":master") Nanosoldier picked up your other submission, but rejected it because the vs argument wasn't valid. I don't know why it didn't throw an invalid job error, though - it should have...

musm · 2016-10-03T17:19:31Z

base/range.jl

+    @_inline_meta
+    t = j/d
+    # computes (1-t)*a + t*b
+    T(fma(t, b, fma(-t, a, a)))


I'd put T(j/d) and remove the T here.

That will fail if a, b = rand(5), rand(5).

To address the "confused": the point being this is the fallback implementation, so should support any type that implements fma. T might not be a number. Or, T might be a number type like Fixed{Int8,7} (from FixedPointNumbers) which can't represent 1. Linear interpolation between two valid numbers is still perfectly fine in such circumstances, though, so there's no need to fail. That's why I'm not going to take your suggestion 😄.

musm · 2016-10-03T17:25:04Z

Unfortunately non hardware fma is about 40 times slower on my machine. Also, anyone who downloads the windows binaries will also not have access to hardware fma, unless they manually build a system image, even if their cpu supports it.

timholy · 2016-10-03T17:29:44Z

Yeah, it's really bad without the hardware acceleration. I suspect the approach I outlined would make things hugely better, but it's hard to know what the final result would be.

Let's see whether enough people like the overall direction, though, before deciding to tackle the performance question.

timholy · 2016-10-03T17:31:41Z

Though cross-linking to JuliaMath/DoubleDouble.jl#18 in case anyone gets excited about tackling the performance question now 😄.

timholy · 2016-10-03T19:25:35Z

Most of this is passing, so let's @nanosoldier runbenchmarks(ALL, vs=":master"). (Thanks @jrevels, I was going from memory but I must be out of practice!)

Looks like we have a problem on 686 in the nearly-overflowing-numbers tests. Why would overflow result in an InexactError, when on x86_64 it just gives you NaN? Perhaps this is an fma bug?

StefanKarpinski · 2016-10-03T21:22:57Z

That was indeed me who wrote the linspace tests (and float ranges tests). I'm glad they were useful. It's basically impossible to get this stuff working without a very thorough test suite – including some randomly generated test cases, since you can't (or at least I couldn't) think of all the nasty things that can actually happen. I agree that the cases that are "broken" by this PR are marginal enough to ignore – and maybe we can fix fma somehow to avoid them.

I'm sold on this change except the last part about a:s:b syntax returning a LinSpace for floating point. Occasionally, when someone writes a:s:b, we have the pleasant situation that (b-a)/s is exactly an integer, n, and a + n*s == b, also exactly. In that case, we could indeed return linspace(a,b,n). But that's an excruciatingly rare case. Often (b-a)/s is not an integer, and even when it is, sometimes a + n*s is not exactly equal to b. In such cases, there are two possibilities:

The user has some real numbers A, B and S in mind such that (B-A)/S is exactly an integer, n, and therefore, A + n*S = B, exactly, and convert(T,A) == a, convert(T,B) == b and convert(T,S) == s (using Julia syntax as if real numbers were somehow Julia values).
The user has no such real numbers in mind and simply intends for you to do the naive thing and produce the values a, a + s, a + 2s, ..., a + n*s for n maximal such that a + n*s ≤ b.

The "no guessing" version of a:s:b is to ignore the former case and do the naive thing all the time – i.e. return a + (i-1)*s for the ith element of a:s:b. We used to do that. TL;DR: that behavior is so broken that we would be better served to simply remove a:s:b for floating point numbers altogether. An example, in case it's not clear:

julia> showall(collect(0:6)*0.1)
[0.0,0.1,0.2,0.30000000000000004,0.4,0.5,0.6000000000000001]

With naive behavior collect(0:0.1:0.6) would be [0.0,0.1,0.2,0.30000000000000004,0.4,0.5] – the collection is too short by a value (0.6000000000000001 gets excluded because it's too big) and has an "off" value in the middle – 0.30000000000000004. There's an endless supply of these broken cases.

The only other option is to guess at values of A, B and N, and by implication S = (B-A)/N, such that

convert(T,A) == a
convert(T,B) == b
convert(T,(B-A)/N) == s

That's what my "mind reading" code does – it finds rational values that project down onto a, b and s when converted to floating point. If it doesn't find any such values, it gives up and does the naive a + (i-1)*b iteration – which is sometimes exactly what the user intended.

People are uneasy about this "guessing" – I suspect, largely because they don't understand what or why it's doing what it's doing, not because it's doing something they don't like. (There haven't been any complaints about a:s:b not doing what someone wanted in a long while.) So let's consider what the potential failure modes of this behavior are:

The user actually wanted naive behavior and accidentally chose a, b and s such that lifting just happened to work. No one has ever reported this. I would be shocked (shocked, I tell you!) if anyone ever encountered such a case.
The user has some A, B and N in mind that are too complicated for the algorithm to figure out. This does happen sometimes, but it's really rare. The current algorithm figures out any reasonable rational-valued endpoints, so the cases that it misses tend to be ones where a rational range is shifted by a random amount – i.e. they wrote o+a:s:o+b where o is not a nice even rational value but a, b and s are. In these cases, the user is best advised to use linspace or to rewrite the range as o+(a:s:b).
The user had a different A, B and N in mind than the ones the algorithm figured out. This is plausible, but again has never actually been reported. I haven't been able to construct a plausible case of this either. Keep in mind, that for many such cases there's no discernible difference since the set of values yielded by the resulting range object are the same.

The current algorithm, although it's technically guessing, is justified in doing so – because some amount of guesswork is inherent in the a:s:b syntax. Matlab also guesses, for what it's worth – it's just a lot less principled about how it guesses. So if we're going to change the behavior of a:s:b to something else, then the only viable option in my view is to remove it for floats entirely.

nanosoldier · 2016-10-03T22:02:02Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

timholy · 2016-10-04T01:34:28Z

OK, I had temporarily forgotten about cases like 1:0.5:3.75, and with that one example I agree it's a no-brainer that something like FloatRange/StepRange is here to stay---indeed, they're perfectly aligned with the maxim I'm promoting here, which is to respect the user's parametrization (which is in terms of start, step, and stop rather than start, stop, length).

So I'd amend my statement about x:y:z returning a LinSpace (you're absolutely right, it definitely should not), but perhaps range(start, step, len) should? I tend to think that when in conflict, length is more "primary" than step? And in this case the two are commensurate, up to roundoff error this API does not permit disagreement among the parameters.

Along these lines: I do suspect something in the FloatRange/StepRange part of the world will also have to change. Right now, StepRange is "mostly" generic (the exception being that you can't use it for AbstractFloat) whereas FloatRange is specific to AbstractFloat. For doing stuff like making ranges of Unitful quantities, that means we have to use StepRange. But because the "barrier" against floating point was placed for a good reason, we have the following situation (aka #18744):

julia> using Unitful

julia> const mm = u"mm"
mm

julia> r = (1:800)*(0.8mm);

julia> typeof(r)
Array{Quantity{Float64, Dimensions:{𝐋}, Units:{mm}},1}

julia> last(r)
640.0 mm

julia> r = 0.8mm:0.8mm:(800*0.8mm);

julia> typeof(r)
StepRange{Quantity{Float64, Dimensions:{𝐋}, Units:{mm}},Quantity{Float64, Dimensions:{𝐋}, Units:{mm}}}

julia> last(r)
640.0 mm

julia> length(r)
799                          # <--- from the above you know it's lying here, really there are 800 elements

julia> rc = convert(Array, r);

julia> typeof(rc)
Array{Quantity{Float64, Dimensions:{𝐋}, Units:{mm}},1}

julia> length(rc)
799

julia> last(rc)
639.1999999999979 mm    # but now there are not

So we have this situation where the last element isn't a reliable guide to the length (computed from div(r.stop+r.step - r.start, r.step)). Moreover, stop gets "recomputed" from the user's input so that it becomes an integral multiple of the length. Consequently one might specify a range with what appears to be a safe API and then witness it truncated downward:

julia> r1 = 0.0mm:0.1mm:0.9mm
0.0 mm:0.1 mm:0.8 mm

julia> r2 = range(0.0mm, 0.1mm, 10) # hmm, OK, try specifying 10 elements
0.0 mm:0.1 mm:0.8 mm     # <---- it gives me 9

julia> r1 == r2
true

# And this is just lovely:
julia> length(r2)    # ... and claims I have only 8
8

julia> rc = convert(Array, r2)   # ...which is all it keeps here
8-element Array{Quantity{Float64, Dimensions:{𝐋}, Units:{mm}},1}:
 0.0 mm
 0.1 mm
 0.2 mm
 0.3 mm
 0.4 mm
 0.5 mm
 0.6 mm
 0.7 mm

so now we're down by two elements from where we intended to start.

As an argument that FloatRange is more robust than StepRange, this makes a pretty compelling case. But its restriction to AbstractFloat is problematic: really, you'd like for its logic to be used for any type subject to problems from rounding, but thanks to Julia's wonderful polymorphism that's very hard to encapsulate with a requirement like T <: AbstractFloat. (To solve this well, we probably need ArithmeticExact and ArithmeticRounds traits, aka IntegerLike and FloatLike.)

Overall, I think we're very much on the same page, it's just a question of figuring out what to do about the "step" ranges. It occurs to me that one possible solution would be to unify them into a single type which preserve the user's original inputs in addition to whatever computations we perform twiddling the parameters. That would seem to be a way to help ensure that length always returns the same thing it did in the beginning, because it will be based on what was used to construct the range rather than some quantity which might have been modified by roundoff error.

timholy · 2016-10-04T01:54:30Z

Alright, so it seems this has a future. In addition to the general approval, I notice that the nanosoldier run worked out fairly well, with most changes being improvements and the worst regressions being ~1.5x.

So I think this moves into "what do we need to do to finish this off?" mode. Here are items I see:

decide whether we want range(start, len) and/or range(start, step, len) to return a LinSpace.
figure out what's up with the 686 architecture failure. fma bugs are presumably openlibm bugs?
figure out how to solve the terrible performance when you don't have hardware emulation. Possibilities:
- say "sorry, them's the breaks---consider buying a newer computer"
- when hardware support is unavailable, use lower accuracy
- be accurate/slow by default, but give people the option to accept lower accuracy via new extensions to @fastmath
- (EDITed): come up with a faster way of implementing fma in software. I think this almost requires that linspace(0.0, 0.3, 4) will (on such platforms) return a SplitFloat{T} which has been pre-processed so that most of this logic is done at construction time rather than evaluation time (at construction time the endpoints are known, and it's also known that the scalar factor will be between 0.5 and 1, so we have several advantages). Again, this would ideally be done for any element type subject to roundoff error, but since linspace has historically only supported AbstractFloat I suspect we could (if necessary) temporarily get away with just doing this for such types.

Of those latter options, I kind of think we want both of the last items, and that at least one must be in place before we can merge this.

tkelman · 2016-10-04T09:14:55Z

It's not just windows binaries that are built without fma support. Mac binaries are currently built for core2 (the earliest Intel macs), i686 Linux and Windows binaries are built for pentium4, and x86_64 Linux and Windows binaries are built for x86-64 (though requiring cmpxchg16b recently, not without causing problems - #18701). Building binaries that don't start because you don't have recent hardware isn't very nice. Retuning for the native hardware after install time might work, but is a bit time consuming and requires some toolchain components that aren't necessarily always present.

We can start trying to do what openblas does and pre-build multiple copies of the system image for different instruction set generations and switching at runtime, but that requires some infrastructure that we currently don't have (see #14995 for a not totally general prototype). It also doesn't help with all of the C libraries other than openblas (e.g. openlibm) that don't implement their own optimized kernel runtime dispatch.

tkelman · 2016-10-04T09:23:47Z

There was a somewhat regrettable discussion on hacker news a few days back, the conclusion of which (from most onlookers) was that we probably don't need to export a range function since it's mostly redundant and can be confusing since different languages choose different conventions for the inputs.

tkelman · 2016-10-04T09:35:31Z

base/range.jl

 end

 """
-    linspace(start::Real, stop::Real, n::Real=50)
+    linspace(start, stop, n=50)


~~signature needs adjusting in rst too~~

~~(possibly unless the manual conversion gets merged before this - not sure)~~ it did

simonbyrne · 2016-10-04T10:51:34Z

I guess it would be good to write down exactly what we want. From what I understand, the intention is that x = linspace(a,b,n) should give a vector-ish object (should it be a Range or AbstractVector?) such that:

length(x) == n
x[1] == a
x[n] == b
for integer 1 <= i <= n, x[i] should be equal to the correctly rounded (a*(n-i) + b*(i-1))/(n-1) under the relevant type.

Points 1-3 seem reasonable. 4 is trickier (at least for floating point), and this PR doesn't quite provide that, e.g.

function lerpi{T}(j::Integer, d::Integer, a::T, b::T)
    t = j/d
    # computes (1-t)*a + t*b
    T(fma(t, b, fma(-t, a, a)))
end

function lerpi_big(j,d,a,b)
    ba = big(a); bb = big(b); bj = big(j); bd = big(d)
    oftype(a,(ba*(bd-bj) + bb*bj)/bd)
end

a = 2.0693042755264957
b = 0.18522296289341592
j = 6
d = 11

julia> lerpi(j,d,a,b)
1.041623559544816

julia> lerpi_big(j,d,a,b)
1.0416235595448158

I'm not sure easy it would be to actually implement. Alternatively, we could weaken condition 4, in that case it might be possible to dispense with the fmas.

simonbyrne · 2016-10-04T13:56:40Z

base/range.jl

+# This is separate to make it useful even when running with --check-bounds=yes
+function unsafe_getindex(r::LinSpace, i::Integer)
+    d = r.lendiv
+    j, a, b = ifelse(2i >= length(r), (i-1, r.start, r.stop), (length(r)-i, r.stop, r.start))


I'm pretty sure you want the branch the other way, as you want t = j/d < 0.5 (otherwise the inner fma buys you nothing, since 1-t is exact).

simonbyrne · 2016-10-04T14:34:08Z

I've tried out a couple of different strategies here: https://gist.github.com/simonbyrne/45c5924940746d628e94de38a78ae37c

The tldr is that with one small modification, this is the best approach; the next best is a fairly simple one.

Now all these methods fail catastrophically if a ≈ -b and j ≈ d/2, in which case we would need a better approach.

StefanKarpinski · 2016-10-04T20:44:55Z

Out of curiosity, where does the name lerpi come from?

timholy · 2016-10-04T22:06:04Z

@simonbyrne, really glad to see you chiming in here. Thanks for the help. I haven't had time to tackle this yet today, but I will check this out carefully ASAP.

@StefanKarpinski, lerp is somewhat standard in certain fields, and because I was specifying the point using a pair of integers I added the i. No better reason than that. (I thought about lerpr for Rational, but we're not actually passing a Rational which is why I went with i.)

I'm not remotely wedded to the name, alternatives are welcome.

timholy · 2016-10-05T08:26:12Z

Again, thanks @simonbyrne for getting involved. I was sort of hoping you might find this interesting, and was looking forward to your input.

So, first, you're absolutely right that I misspoke above, in two ways:

the answer is not precise to all digits. I should have said the answer is precise to within an absolute tolerance Δ that "matters" (I'll define this below)
even that is not quite right, because my test used <=Δ, and even for the revised statement to be true it should have been <Δ. Oops.

OK, so what tolerance matters? You measured the precision of the answer against itself. My tests measured the precision of the answer against the endpoints, using Δ = max(eps(a), eps(b)). Both criteria matter, and I didn't even think to measure yours. However, I would argue that using the "endpoints" is perhaps the more relevant of the two: the endpoints can't be specified any more precisely than Δ, even if the user wished it were so. So Δ determines the length scale that matters for the answer, too.

Using a slight modification of your test script, here are the answers I get:

function  ans_worst  ans_median      end_worst   end_median
lerp0       2.131      0.351           2.131       0.224
lerp1       1.809      0.333           1.702       0.212
lerp2       2.193      0.352           2.083       0.224
lerp3       1.394      0.273           1.172       0.172
lerp4     132.738      0.292           1.098       0.193
lerp5       2.174      0.291           1.840       0.189

The ans results are using your criterion (the precision of the answer measured against itself), the end results are using my criterion (the precision of the answer relative to Δ).

The most interesting case is naturally lerp4, the one I proposed in this PR. As you noticed (and I didn't), it has enormous worst-case "self" (ans) error. However, it's the best when it comes to "endpoint" error, and very close to 1 which (mostly) explains why my <= Δ criterion passed (your tests presumably hit some values that our linspace tests do not).

The reason I suspect lerp4 exhibits the best "endpoint" error is that with t >= 0.5, t + (1-t) is always exactly equal to 1.0, whereas for t < 0.5 that true only up to some roundoff error. Consequently, in the t >= 0.5 case you're doing "true" linear interpolation (perhaps at a slightly different point than you intended), whereas in the t < 0.5 case you're not always performing true "interpolation."

Now, interestingly, this reveals that lerp4 is not the best in terms of median behavior; that honor indeed goes to lerp3. So in fact I think lerp3 could be the better choice; I'd say the choice between the two is not entirely straightforward.

It is interesting, though, how much better worst-case behavior the two fma algorithms have (almost one Δ-unit) compared to "regular" arithmetic.

timholy · 2016-10-05T08:47:28Z

I added two more to the mix, lerp6 and lerp7 posted in that gist. lerp7 uses DoubleDouble for its computations, and lerp6 computes a correction Δt which also uses the same technique (but performs all other computations using normal precision). Results:

lerp6       1.474      0.276           1.119       0.179
lerp7       0.500      0.250           0.500       0.154

lerp7 is (unsurprisingly) the best of the bunch, but I'm not sure I understand why it shows any error.

simonbyrne · 2016-10-05T11:16:07Z

lerp7 is (unsurprisingly) the best of the bunch, but I'm not sure I understand why it shows any error.

I'm comparing the error to the result computed in BigFloat precision, so you would expect the minimum error to be 0.5 ulps (when the BigFloat answer is halfway between 2 floating point numbers).

simonbyrne · 2016-10-05T11:49:39Z

Here's how one could do it without DoubleDouble.jl:
https://gist.github.com/simonbyrne/45c5924940746d628e94de38a78ae37c#file-lerptest-jl-L41-L66

Which seems to be correctly rounded (i.e. accurate to within 0.5 ulp): the other nice feature is that the division and fma can be done at the creation of the LinSpace object. I'll do a bit more tuning and compare the performance.

timholy · 2016-10-05T12:37:41Z

Ah, yes, I just figured that out too. If you round the result to Float64 before doing the comparison, here's my current status:

function  ans_worst  ans_mean   ans_median      end_worst   end_mean   end_median
lerp0       2.000      0.337      0.000           2.000       0.233      0.000
lerp1       2.000      0.309      0.000           2.000       0.215      0.000
lerp2       2.000      0.338      0.000           2.000       0.234      0.000
lerp3       1.000      0.199      0.000           1.000       0.140      0.000
lerp4      83.000      0.250      0.000           1.000       0.131      0.000
lerp5       2.000      0.222      0.000           2.000       0.141      0.000
lerp6       1.000      0.194      0.000           1.000       0.119      0.000
lerp7       1.000      0.000      0.000           1.000       0.000      0.000
lerp8       1.000      0.001      0.000           1.000       0.000      0.000

lerp8 is very impressive, and blows my feeble attempt at something similar with lerp6 out of the water 😄.

My new laptop had a motherboard problem and is out for repair, so I don't have easy access to a machine with hardware fma support. Assuming you do, have you benchmarked these to see how they perform? EDIT: oh, I see you're working on that. The fact that you only need fma on the length (i.e., can be done at construction time) is awesome.

timholy · 2017-01-26T19:52:44Z

Some kind of rounding is needed, but one could probably get sensible behavior "cheaply" like this:

function colon(a, s, b)
    len_est = (b-a)/s + 1
    StepRangeLen(a, s, floor(Int, len_est+10*eps(len_est)))
end

This doesn't make the returned values "pretty," of course. I just think that @andreasnoack is saying that, to him, -1.11022e-16 is just as beautiful as 0.

timholy · 2017-01-26T19:55:44Z

AppVeyor failure is unrelated. Glad to see this emerge from the other side!

StefanKarpinski · 2017-01-26T20:15:19Z

I don't see how getting within 10 ulps (or any other simplistic criterion) is actually better that what we do now – it's just a dumber way to not take the user's input literally, and will often result in a step and/or endpoint which do not round to the given floating-point values. That's the opposite of taking the given inputs at face value.

I do, however, think that something more comprehensive and principled might be possible. In particular, the main objection to "guessing" can only be that there might be multiple, observably different interpretations of the user's input. In other words for a given a:s:b for a, s, b floating-point values, there's a set of true real values α, σ, β that round to those floating-point values, a subset of which have n = (β - α)/σ an integer and a + n*σ == β exactly. Currently, we're searching for smallish rational α, σ, β such that this is true and giving up if we can't find such. We could instead consider all real values α, σ, β and choose the best in some general sense, keeping in mind that if all admissible α, σ, β result in the same observable values, any choice is fine.

timholy · 2017-01-26T20:55:12Z

Sure. Let me clarify that I'm fine with having it stay like it is now (pending real-world feedback). But also keep in mind that even now there's no guarantee that a:s:b hits b exactly, for example 1:1:3.5. (Hitting both endpoints exactly seems to be the job of LinSpace.) One could take that a license to mess up more broadly 😄.

I think an easier simplification might be to consider ditching StepRange in favor of StepRangeLen. But not for 0.6.

StefanKarpinski · 2017-01-26T21:58:14Z

Right, there's the complication that the user's intention might not be to hit b exactly – the syntax is overloaded like that, which complicates things further. This corresponds to the case where no possible choices of α, σ, β is admissible. If anything, we currently don't search hard enough for possible real triples and conclude that some ranges are inexact when they aren't actually. E.g.:

julia> d = rand()
0.3359637381398932

julia> length(d+0.1:0.1:d+0.9), length(0.1:0.1:0.9)
(8,9)

julia> mean((d = rand(); length(d+0.1:0.1:d+0.9) != 9) for _=1:10^7)
0.4471468

This fails for about 45% of the time but it should succeed every time if we were searching thoroughly. Of course it's possible that some of these are actually impossible to satisfy since d+0.1 and d+0.9 as floating-point are not the same as d+1/10 and d+9/10 as reals.

tkelman · 2017-01-27T01:12:43Z

base/range.jl

-/(r::FloatRange, x::Real)   = FloatRange(r.start/x, r.step/x, r.len, r.divisor)
-/(r::LinSpace, x::Real)     = LinSpace(r.start / x, r.stop / x, r.len, r.divisor)
+# For #18336 we need to prevent promotion of the step type:
+(x::Number, r::AbstractUnitRange) = range(x + first(r), step(r), length(r))


I must've not noticed these before, but assuming it's intentional that you've widened these from Real to Number? Should probably be tested if we're sure we want to keep that.

I'm almost certain there's a test something like (1:3)+im, although I don't know whether the output type is tested.

But yes, there's no particular reason to restrict to Real anymore.

andreasnoack · 2017-01-27T03:50:39Z

Are you advocating for giving up on a:s:b syntax for ranges, @andreasnoack?

Yes.

I just think that @andreasnoack is saying that, to him, -1.11022e-16 is just as beautiful as 0.

beautiful and floating point numbers usually don't go together, but yes, I don't think you should care if it is -1.11022e-16 or 0.0.

StefanKarpinski · 2017-01-27T21:04:14Z

Empirically, people do and we don't get bug reports about this sort of thing anymore 😬

timholy · 2017-01-27T21:30:16Z

I've noticed that, 👍. This suggests that even though many of us are happy to tolerate ulp-level differences, expunging them is a good thing. Fewer bug reports (less email) is priceless.

Will be interesting to see if someone comes up with a simpler approach, but for now we may have to live with surprisingly-complex code for ranges. Even if it's complex, getting both generality AND better accuracy out of this most recent change suggests we're doing quite a lot right.

ajkeller34 · 2017-01-29T02:40:57Z

Amazing work here. I am very excited about ranges supporting generic types!

I just noticed, but is this method for _colon actually reachable? This method for colon (which is the only one that calls _colon, afaict) is restricted to T<:Real, for which TypeOrder(T) is always HasOrder(). Perhaps it is just a stub for if/when these traits are meant to be adopted by types outside of Base? I could implement traits for Unitful quantities but since physical quantities are not <:Real I may still need to write my own methods for colon and possibly range, correct?

The most important thing is that the range types themselves now work with generic types, so I'm very excited about that. I just want to make sure I'm implementing things sensibly and in line with any ideas @timholy has about how this should all fit together.

timholy · 2017-01-29T10:18:37Z

Right, I suspect you should write your own colon and range methods, rather than extending the internal ones. I agree that method looks unnecessary for the reasons you state.

Make trunc, round, floor, ceil testing more precise

KernelDensity.jl rolled its own range generator which suffers from similar flaws as Base's linspace prior to being fixed in JuliaLang/julia#18777. This closes JuliaStats#39.

KernelDensity.jl rolled its own range generator which suffers from similar flaws as Base's linspace prior to being fixed in JuliaLang/julia#18777. This closes #39.

timholy force-pushed the teh/linspace branch from 326223d to 5143ab4 Compare October 3, 2016 16:36

timholy force-pushed the teh/linspace branch from 5143ab4 to 6e3c132 Compare October 3, 2016 16:57

timholy mentioned this pull request Oct 3, 2016

Add a few more constructors/conversions JuliaMath/DoubleDouble.jl#18

Merged

musm reviewed Oct 3, 2016

View reviewed changes

tkelman reviewed Oct 4, 2016

View reviewed changes

simonbyrne reviewed Oct 4, 2016

View reviewed changes

timholy mentioned this pull request Oct 4, 2016

Support fma (fused multiply-add) PainterQubits/Unitful.jl#31

Merged

Clarify NEWS on StepRangeLen [ci skip]

a12e587

timholy merged commit 32d0645 into master Jan 26, 2017

timholy deleted the teh/linspace branch January 26, 2017 19:55

tkelman reviewed Jan 27, 2017

View reviewed changes

andreasnoack mentioned this pull request Feb 1, 2017

linspace tests fail on master JuliaLang/Compat.jl#315

Closed

timholy mentioned this pull request Feb 1, 2017

float ranges: if a+n*s == b, length(a:s:b) should be n+1 #20373

Closed

timholy mentioned this pull request Feb 17, 2017

ERROR: ArgumentError: the length of each axis must match the corresponding size of data JuliaImages/Images.jl#590

Closed

timholy mentioned this pull request Jun 11, 2017

Fix interval indexing with offset axes JuliaArrays/AxisArrays.jl#90

Closed

stevengj mentioned this pull request Jun 15, 2017

display of linspace with BigFloat argument throws #22380

Closed

timholy mentioned this pull request Jul 4, 2017

WIP: Added StepRangeLen JuliaLang/Compat.jl#375

Closed

pablosanjose mentioned this pull request Oct 27, 2017

Sequences aren’t computed properly #24357

Closed

StefanKarpinski pushed a commit that referenced this pull request Feb 8, 2018

Update to linspace changes in #18777

4e8b7b3

Make trunc, round, floor, ceil testing more precise

tlnagy mentioned this pull request Jun 3, 2018

use improved high numerical precision linspace from Base JuliaStats/KernelDensity.jl#52

Merged

timholy mentioned this pull request Sep 15, 2019

Define interval arithmetics using broadcasting JuliaMath/IntervalSets.jl#55

Open

stevengj mentioned this pull request Dec 21, 2019

Possible documentation error: range function docstring at julia/base/range.jl #33798

Closed

ffevotte mentioned this pull request Aug 30, 2020

More accurate Base.lerpi for higher-precision numbers #37281

Open

timholy mentioned this pull request May 1, 2023

base/twiceprecision.jl needs work #49589

Open

Make LinSpace generic #18777

Make LinSpace generic #18777

Conversation

timholy commented Oct 3, 2016 • edited Loading

Mind-reading

Precision

Test changes and deletions

Performance

Future changes to Ranges

timholy commented Oct 3, 2016

yuyichao commented Oct 3, 2016

timholy commented Oct 3, 2016

jrevels commented Oct 3, 2016

musm Oct 3, 2016

Choose a reason for hiding this comment

timholy Oct 3, 2016

Choose a reason for hiding this comment

timholy Oct 3, 2016

Choose a reason for hiding this comment

musm commented Oct 3, 2016 • edited Loading

timholy commented Oct 3, 2016

timholy commented Oct 3, 2016

timholy commented Oct 3, 2016

StefanKarpinski commented Oct 3, 2016 • edited Loading

nanosoldier commented Oct 3, 2016

timholy commented Oct 4, 2016

timholy commented Oct 4, 2016 • edited Loading

tkelman commented Oct 4, 2016 • edited Loading

tkelman commented Oct 4, 2016

tkelman Oct 4, 2016 • edited Loading

Choose a reason for hiding this comment

simonbyrne commented Oct 4, 2016

simonbyrne Oct 4, 2016

Choose a reason for hiding this comment

simonbyrne commented Oct 4, 2016

StefanKarpinski commented Oct 4, 2016

timholy commented Oct 4, 2016

timholy commented Oct 5, 2016 • edited Loading

timholy commented Oct 5, 2016

simonbyrne commented Oct 5, 2016

simonbyrne commented Oct 5, 2016

timholy commented Oct 5, 2016 • edited Loading

timholy commented Jan 26, 2017

timholy commented Jan 26, 2017

StefanKarpinski commented Jan 26, 2017

timholy commented Jan 26, 2017

StefanKarpinski commented Jan 26, 2017 • edited Loading

tkelman Jan 27, 2017

Choose a reason for hiding this comment

timholy Jan 27, 2017

Choose a reason for hiding this comment

andreasnoack commented Jan 27, 2017

StefanKarpinski commented Jan 27, 2017

timholy commented Jan 27, 2017

ajkeller34 commented Jan 29, 2017

timholy commented Jan 29, 2017

timholy commented Oct 3, 2016 •

edited

Loading

musm commented Oct 3, 2016 •

edited

Loading

StefanKarpinski commented Oct 3, 2016 •

edited

Loading

timholy commented Oct 4, 2016 •

edited

Loading

tkelman commented Oct 4, 2016 •

edited

Loading

tkelman Oct 4, 2016 •

edited

Loading

timholy commented Oct 5, 2016 •

edited

Loading

timholy commented Oct 5, 2016 •

edited

Loading

StefanKarpinski commented Jan 26, 2017 •

edited

Loading