Refactor package into one part dealing with LLVM and one part that builds a Vec on top of that #63

Merged · 20 commits · Mar 31, 2020
Commits
83a7471
This PR pretty much rewrites the package from scratch (with the
KristofferC Feb 11, 2020
68c0b2d
add a warning and two explicit inlines
KristofferC Feb 13, 2020
e3321e5
add functions for doing saturated adds and subs
KristofferC Feb 13, 2020
bf71b69
fix supported element types
KristofferC Feb 22, 2020
f237a54
improve typeinfo propagation
KristofferC Feb 22, 2020
9efd543
overload bitreverse when it exists in Base (1.5)
KristofferC Feb 22, 2020
86805c2
add some more docs to reduce
KristofferC Feb 22, 2020
bca864d
move asserts in generated functions to return an error expression ins…
KristofferC Feb 22, 2020
bbd89f8
restrict eltypes in tuples and varags for Vec constructor
KristofferC Feb 22, 2020
0706d0f
add docs for saturation arithmetic
KristofferC Feb 22, 2020
5167673
add overflow arithmetic
KristofferC Feb 22, 2020
5fb86c2
throw when trying to call mul with overflow on Int64 on i686 because
KristofferC Feb 22, 2020
67372bf
add conversion from Bool
KristofferC Feb 22, 2020
5de3b15
fix some required uses of propagate_inbounds (https://github.com/Juli…
KristofferC Feb 23, 2020
8204863
add a note that the readme example is not meant to beat scalar version
KristofferC Feb 23, 2020
bdfd585
add fast math options to intrinsics and hook into fastmath macro (#1)
Mar 4, 2020
05949bc
add an extra fastmath test
KristofferC Mar 5, 2020
37b2340
fix some boundschecks
Mar 22, 2020
e8f5815
add docs for fastmath
KristofferC Mar 23, 2020
90d54fd
this release should be non-breaking
Mar 31, 2020
9 changes: 2 additions & 7 deletions .travis.yml
@@ -3,14 +3,9 @@ os:
- osx
- linux
julia:
- 0.7
- 1.0
- 1.2
- 1.4
- nightly
notifications:
email: false
script:
- if [ -a .git/shallow ]; then git fetch --unshallow; fi
- julia -e 'using Pkg; Pkg.build(); Pkg.test(coverage=true)';
after_success:
- julia -e 'cd(Pkg.dir("SIMD")); Pkg.add("Coverage"); using Coverage; Coveralls.submit(Coveralls.process_folder())';
- julia -e 'Pkg.add("Coverage"); using Coverage; Coveralls.submit(Coveralls.process_folder())';
2 changes: 1 addition & 1 deletion LICENSE.md
@@ -1,6 +1,6 @@
The SIMD.jl package is licensed under the Simplified "2-clause" BSD License:

> Copyright (c) 2016: Erik Schnetter.
> Copyright (c) 2016-2020: Erik Schnetter, Kristoffer Carlsson, Julia Computing
> All rights reserved.
>
> Redistribution and use in source and binary forms, with or without
6 changes: 3 additions & 3 deletions Project.toml
@@ -1,10 +1,10 @@
name = "SIMD"
uuid = "fdea26ae-647d-5447-a871-4b548cad5224"
authors = ["Erik Schnetter <[email protected]>"]
version = "2.8.0"
authors = ["Erik Schnetter <[email protected]>", "Kristoffer Carlsson <[email protected]>"]
version = "2.9.0"

[compat]
julia = "1"
julia = "1.4"

[extras]
InteractiveUtils = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
91 changes: 85 additions & 6 deletions README.md
@@ -32,7 +32,12 @@ function vadd!(xs::Vector{T}, ys::Vector{T}, ::Type{Vec{N,T}}) where {N, T}
end
end
```

To simplify this example code, the vector type to use (`Vec{N,T}`) is passed in explicitly as an additional type argument. The routine is called as, for example, `vadd!(xs, ys, Vec{8,Float64})`.
Note that this code is not expected to outperform the standard scalar way of
doing this operation, since the Julia optimizer will easily rewrite the scalar
version to use SIMD under the hood. It is shown merely as an illustration of how
to load and store data in `Vector`s using SIMD.jl.
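
For instance, a minimal call to the routine above might look as follows (array
length chosen arbitrarily here; it must be a multiple of the vector width used):

```julia
xs = ones(Float64, 64)
ys = ones(Float64, 64)
vadd!(xs, ys, Vec{8,Float64})  # xs now holds the element-wise sums
```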

## SIMD vector operations

@@ -46,14 +51,13 @@ The SIMD package provides the usual arithmetic and logical operations for SIMD v

`abs cbrt ceil copysign cos div exp exp10 exp2 flipsign floor fma inv isfinite isinf isnan issubnormal log log10 log2 muladd rem round sign signbit sin sqrt trunc vifelse`

(Currently missing: `count_ones count_zeros exponent ldexp leading_ones leading_zeros significand trailing_ones trailing_zeros`, many trigonometric functions)

(Also currently missing: Type conversions, reinterpretation that changes the vector size)
(Currently missing: `exponent ldexp significand`, many trigonometric functions)

These operators and functions are always applied element-wise, i.e. they are applied to each element in parallel, again yielding a SIMD vector as the result. This means that e.g. multiplying two vectors yields a vector, and comparing two vectors yields a vector of booleans. This behaviour might seem slightly unusual, but it corresponds to the machine instructions provided by the hardware. It is also what is usually needed to vectorize loops.
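
As a small illustration of this element-wise behaviour (values chosen
arbitrarily):

```julia
a = Vec{4,Float64}((1.0, 2.0, 3.0, 4.0))
b = Vec{4,Float64}((4.0, 3.0, 2.0, 1.0))
a * b  # element-wise product, again a Vec{4,Float64}
a < b  # element-wise comparison, yielding a Vec{4,Bool}
```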

The SIMD package also provides conversion operators from scalars and tuples to SIMD vectors and from SIMD vectors to tuples. Additionally, there are `getindex` and `setindex` functions to access individual vector elements. SIMD vectors are immutable (like tuples), and `setindex` (note there is no exclamation mark at the end of the name) thus returns the modified vector.
```Julia

```julia
# Create a vector where all elements are Float64(1):
xs = Vec{4,Float64}(1)

Expand All @@ -63,7 +67,7 @@ ys1 = NTuple{4,Float32}(ys)
y2 = ys[2] # getindex

# Update one element of a vector:
ys = setindex(ys, 5, 3) # cannot use ys[3] = 5
ys = Base.setindex(ys, 5, 3) # cannot use ys[3] = 5
```

## Reduction operations
@@ -73,12 +77,87 @@ Reduction operations reduce a SIMD vector to a scalar. The following reduction o
`all any maximum minimum sum prod`

Example:
```Julia

```julia
v = Vec{4,Float64}((1,2,3,4))
sum(v)
10.0
```

It is also possible to use `reduce` with bitwise operations:

```julia
julia> v = Vec{4,UInt16}((1,2,3,4))
<4 x UInt16>[0x0001, 0x0002, 0x0003, 0x0004]

julia> reduce(|, v)
0x0007

julia> reduce(&, v)
0x0000
```

## Overflow operations

Overflow operations perform the operation but also return a flag that indicates
whether the result of the operation overflowed.
Note that these only work on Julia with LLVM 9 or higher (Julia 1.5 or higher).
The functions `Base.Checked.add_with_overflow`, `Base.Checked.sub_with_overflow`,
and `Base.Checked.mul_with_overflow` are extended to work on `Vec`:

```julia
julia> v = Vec{4, Int8}((40, -80, 70, -10))
<4 x Int8>[40, -80, 70, -10]

julia> Base.Checked.add_with_overflow(v, v)
(<4 x Int8>[80, 96, -116, -20], <4 x Bool>[0, 1, 1, 0])

julia> Base.Checked.add_with_overflow(Int8(-80), Int8(-80))
(96, true)

julia> Base.Checked.sub_with_overflow(v, 120)
(<4 x Int8>[-80, 56, -50, 126], <4 x Bool>[0, 1, 0, 1])

julia> Base.Checked.mul_with_overflow(v, 2)
(<4 x Int8>[80, 96, -116, -20], <4 x Bool>[0, 1, 1, 0])
```

## Saturation arithmetic

Saturation arithmetic is a version of arithmetic in which operations are limited
to a fixed range between a minimum and maximum value. If the result of an
operation is greater than the maximum value, the result is set (or “clamped”) to
this maximum. If it is below the minimum, it is clamped to this minimum.


```julia
julia> v = Vec{4, Int8}((40, -80, 70, -10))
<4 x Int8>[40, -80, 70, -10]

julia> SIMD.add_saturate(v, v)
<4 x Int8>[80, -128, 127, -20]

julia> SIMD.sub_saturate(v, 120)
<4 x Int8>[-80, -128, -50, -128]
```

## Fastmath

SIMD.jl hooks into the `@fastmath` macro so that operations in a
`@fastmath` block set the `fast` flag on the floating-point intrinsics
that support it. Compare, for example, the generated code for the
following two functions:

```julia
f1(a, b, c) = a * b - c * 2.0
f2(a, b, c) = @fastmath a * b - c * 2.0
V = Vec{4, Float64}
code_native(f1, Tuple{V, V, V}, debuginfo=:none)
code_native(f2, Tuple{V, V, V}, debuginfo=:none)
```

The normal caveats for using `@fastmath` naturally apply.

## Accessing arrays

When using explicit SIMD vectorization, it is convenient to still allocate arrays as arrays of scalars, not as arrays of vectors. The `vload` and `vstore` functions allow reading vectors from and writing vectors into such arrays, accessing several contiguous array elements.
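
A minimal sketch of how these functions are used (assuming the signatures
`vload(Vec{N,T}, arr, i)` and `vstore(v, arr, i)`, which load and store the `N`
contiguous elements starting at index `i`):

```julia
arr = collect(Float64, 1:16)
v = vload(Vec{4,Float64}, arr, 5)  # load arr[5:8] into a Vec{4,Float64}
vstore(v + 1, arr, 1)              # store the incremented vector into arr[1:4]
```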
1 change: 0 additions & 1 deletion REQUIRE

This file was deleted.

5 changes: 2 additions & 3 deletions appveyor.yml
@@ -1,8 +1,6 @@
environment:
matrix:
- julia_version: 0.7
- julia_version: 1.0
- julia_version: 1.2
- julia_version: 1.4
- julia_version: nightly

platform:
@@ -42,3 +40,4 @@ test_script:
on_success:
- echo "%JL_CODECOV_SCRIPT%"
- C:\julia\bin\julia -e "%JL_CODECOV_SCRIPT%"
