Commit af1db19

Automate release process (ridiculousfish#124)
* Add patch version to conform to full semantic versioning spec and build LIBDIVIDE_VERSION from the version components
* GitHub Workflow to create a release PR
* Add GH action to create a draft release
* Generate a run summary that will be displayed in the GH UI.
* Update documentation: separate out development instructions from usage and use relative links everywhere.
* Typos, prompts, alignment
* Embed version string directly in libdivide.h
* Remove space from filename
* Example commands: remove prompts for easier copy/paste, use 'pwsh' tag for better syntax highlighting
* Add back original doc from ridiculousfish
* Move doc to RELEASE.md
* Improve release type descriptions
* Additional release process details.
* Fix typo

---------

Co-authored-by: Kim Walisch <[email protected]>
1 parent 1c610b9 commit af1db19

7 files changed: +156 -71 lines changed

.github/release.yml

+8
@@ -0,0 +1,8 @@
+# .github/release.yml
+
+# Configure automatic release note generation
+# See https://docs.github.com/en/repositories/releasing-projects-on-github/automatically-generated-release-notes
+changelog:
+  exclude:
+    labels:
+      - ignore-for-release

.github/workflows/prepare_release.yml

+98
@@ -0,0 +1,98 @@
+name: Create draft release
+
+on:
+  workflow_dispatch:
+    inputs:
+      release_type:
+        description: 'Release Type'
+        required: true
+        type: choice
+        default: 'Minor (E.g. 5.2.1 to 5.3.0)'
+        options:
+          - Major (E.g. 5.2.1 to 6.0.0)
+          - Minor (E.g. 5.2.1 to 5.3.0)
+          - Patch (E.g. 5.2.1 to 5.2.2)
+
+jobs:
+  Create-Release:
+    runs-on: ubuntu-latest
+
+    permissions:
+      # Give the default GITHUB_TOKEN write permission to commit and push the
+      # added or changed files to the repository.
+      contents: write
+
+    outputs:
+      release_url: ${{ steps.create-draft-release.outputs.url }}
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+        with:
+          ref: ${{ github.head_ref }}
+
+      - name: Update the version
+        id: bump_version
+        shell: pwsh
+        run: |
+          # Extract current version
+          $header = Get-Content ./libdivide.h
+          $major_ver = [int](($header -match "LIBDIVIDE_VERSION_MAJOR")[0] -Split " ")[-1]
+          $minor_ver = [int](($header -match "LIBDIVIDE_VERSION_MINOR")[0] -Split " ")[-1]
+          $patch_ver = [int](($header -match "LIBDIVIDE_VERSION_PATCH")[0] -Split " ")[-1]
+          $current_version=@($major_ver, $minor_ver, $patch_ver) -Join "."
+
+          # Increment version
+          if ("${{ github.event.inputs.release_type }}" -like "Patch*") {
+              $patch_ver = $patch_ver + 1
+          } elseif ("${{ github.event.inputs.release_type }}" -like "Minor*") {
+              $minor_ver = $minor_ver + 1
+              $patch_ver = 0
+          } else { # Must be major version
+              $major_ver = $major_ver + 1
+              $minor_ver = 0
+              $patch_ver = 0
+          }
+          $new_version=@($major_ver, $minor_ver, $patch_ver) -Join "."
+
+          # Update header file
+          $header = $header -replace "#define LIBDIVIDE_VERSION ""\d+\.\d+\.\d+""", "#define LIBDIVIDE_VERSION ""$new_version"""
+          $header = $header -replace "#define LIBDIVIDE_VERSION_MAJOR \d+", "#define LIBDIVIDE_VERSION_MAJOR $major_ver"
+          $header = $header -replace "#define LIBDIVIDE_VERSION_MINOR \d+", "#define LIBDIVIDE_VERSION_MINOR $minor_ver"
+          $header = $header -replace "#define LIBDIVIDE_VERSION_PATCH \d+", "#define LIBDIVIDE_VERSION_PATCH $patch_ver"
+          $header | Set-Content ./libdivide.h
+
+          # Update other files
+          $file="./library.properties"
+          $regex = 'version=(\d+\.\d+(\.\d+)?)'
+          (Get-Content $file) -replace $regex, "version=$new_version" | Set-Content $file
+
+          $file="./CMakeLists.txt"
+          $regex = "set\(LIBDIVIDE_VERSION ""\d+\.\d+(\.\d+)?""\)"
+          (Get-Content $file) -replace $regex, "set(LIBDIVIDE_VERSION ""$new_version"")" | Set-Content $file
+
+          Write-Output "previous_version=$current_version" >> $Env:GITHUB_OUTPUT
+          Write-Output "version=$new_version" >> $Env:GITHUB_OUTPUT
+          Write-Output "major=$major_ver" >> $Env:GITHUB_OUTPUT
+          Write-Output "minor=$minor_ver" >> $Env:GITHUB_OUTPUT
+          Write-Output "patch=$patch_ver" >> $Env:GITHUB_OUTPUT
+
+      # Commit all changed files back to the repository
+      - name: Commit updated versions
+        uses: stefanzweifel/git-auto-commit-action@v5
+        with:
+          commit_message: Auto increment version to ${{ steps.bump_version.outputs.version }}
+
+      # Create draft release
+      - name: Create draft release
+        id: create-draft-release
+        uses: softprops/action-gh-release@v2
+        with:
+          name: v${{ steps.bump_version.outputs.version }}
+          draft: true
+          generate_release_notes: true
+          tag_name: v${{ steps.bump_version.outputs.version }}
+
+      - name: Generate Summary
+        run: |
+          echo "Created [v${{ steps.bump_version.outputs.version }} draft release](${{ steps.create-draft-release.outputs.url }})" >> $GITHUB_STEP_SUMMARY
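The bump rules in the `Update the version` step can be checked outside of CI. The following is a minimal C++ sketch of the same arithmetic, not part of the commit: `Version`, `ReleaseType`, `bump` and `format_version` are hypothetical names chosen for illustration, and the field names simply echo the workflow's `$major_ver`, `$minor_ver` and `$patch_ver` variables.

```cpp
#include <cassert>
#include <string>

// Hypothetical value type; field names echo the workflow's PowerShell variables.
struct Version {
    int major_ver;
    int minor_ver;
    int patch_ver;
};

// Hypothetical enum standing in for the workflow's release_type input.
enum class ReleaseType { Major, Minor, Patch };

// Same rules as the workflow: bumping a component resets every lower component.
inline Version bump(Version v, ReleaseType type) {
    assert(v.major_ver >= 0 && v.minor_ver >= 0 && v.patch_ver >= 0);
    switch (type) {
        case ReleaseType::Patch:
            v.patch_ver += 1;
            break;
        case ReleaseType::Minor:
            v.minor_ver += 1;
            v.patch_ver = 0;
            break;
        case ReleaseType::Major:
            v.major_ver += 1;
            v.minor_ver = 0;
            v.patch_ver = 0;
            break;
    }
    return v;
}

// Join the components with dots, like the workflow's -Join "." expression.
inline std::string format_version(const Version& v) {
    return std::to_string(v.major_ver) + "." + std::to_string(v.minor_ver) + "." +
           std::to_string(v.patch_ver);
}
```

Calling `format_version(bump(Version{5, 2, 1}, ReleaseType::Minor))` should reproduce the "5.2.1 to 5.3.0" example from the workflow's option labels.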

README.md

+26-63
@@ -1,29 +1,17 @@
 # libdivide
+
 [![Build Status](https://github.com/ridiculousfish/libdivide/actions/workflows/canary_build.yml/badge.svg)](https://github.com/ridiculousfish/libdivide/actions/workflows/canary_build.yml)
 [![Github Releases](https://img.shields.io/github/release/ridiculousfish/libdivide.svg)](https://github.com/ridiculousfish/libdivide/releases)

-```libdivide.h``` is a header-only C/C++ library for optimizing integer division.
-Integer division is one of the slowest instructions on most CPUs e.g. on
-current x64 CPUs a 64-bit integer division has a latency of up to 90 clock
-cycles whereas a multiplication has a latency of only 3 clock cycles.
-libdivide allows you to replace expensive integer division instructions by
-a sequence of shift, add and multiply instructions that will calculate
-the integer division much faster.
-
-On current CPUs you can get a **speedup of up to 10x** for 64-bit integer division
-and a speedup of up to to 5x for 32-bit integer division when using libdivide.
-libdivide also supports [SSE2](https://en.wikipedia.org/wiki/SSE2),
-[AVX2](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) and
-[AVX512](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions)
-vector division which provides an even larger speedup. You can test how much
-speedup you can achieve on your CPU using the [benchmark](#benchmark-program)
-program.
+```libdivide.h``` is a header-only C/C++ library for optimizing integer division. Integer division is one of the slowest instructions on most CPUs e.g. on current x64 CPUs a 64-bit integer division has a latency of up to 90 clock cycles whereas a multiplication has a latency of only 3 clock cycles. libdivide allows you to replace expensive integer division instructions by a sequence of shift, add and multiply instructions that will calculate the integer division much faster.

-libdivide is compatible with 8-bit microcontrollers, such as the AVR series: [the CI build includes a AtMega2560 target](test/avr/readme.md). Since low end hardware such as this often do not include a hardware divider, libdivide is particularly useful. In addition to the runtime [C](https://github.com/ridiculousfish/libdivide/blob/master/doc/C-API.md) & [C++](https://github.com/ridiculousfish/libdivide/blob/master/doc/CPP-API.md) APIs, a set of [predefined macros](constant_fast_div.h) and [templates](constant_fast_div.hpp) is included to speed up division by 16-bit constants: division by a 16-bit constant is [not optimized by avr-gcc on 8-bit systems](https://stackoverflow.com/questions/47994933/why-doesnt-gcc-or-clang-on-arm-use-division-by-invariant-integers-using-multip).
+On current CPUs you can get a **speedup of up to 10x** for 64-bit integer division and a speedup of up to 5x for 32-bit integer division when using libdivide. libdivide also supports [SSE2](https://en.wikipedia.org/wiki/SSE2), [AVX2](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) and [AVX512](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) vector division which provides an even larger speedup. You can test how much speedup you can achieve on your CPU using the [benchmark](#benchmark-program) program.
+
+libdivide is compatible with 8-bit microcontrollers, such as the AVR series: [the CI build includes an AtMega2560 target](test/avr/readme.md). Since low end hardware such as this often does not include a hardware divider, libdivide is particularly useful. In addition to the runtime [C](doc/C-API.md) & [C++](doc/CPP-API.md) APIs, a set of [predefined macros](constant_fast_div.h) and [templates](constant_fast_div.hpp) is included to speed up division by 16-bit constants: division by a 16-bit constant is [not optimized by avr-gcc on 8-bit systems](https://stackoverflow.com/questions/47994933/why-doesnt-gcc-or-clang-on-arm-use-division-by-invariant-integers-using-multip).

 See https://libdivide.com for more information on libdivide.

-# C++ example
+## C++ example

 The first code snippet divides all integers in a vector using integer division.
 This is slow as integer division is at least one order of magnitude slower than
@@ -60,7 +48,7 @@ Generally libdivide will give a significant speedup if:
 * The divisor is only known at runtime
 * The divisor is reused multiple times e.g. in a loop

-# C example
+## C example

 You first need to generate a libdivide divider using one of the ```libdivide_*_gen``` functions (```*```:&nbsp;```s32```,&nbsp;```u32```,&nbsp;```s64```,&nbsp;```u64```)
 which can then be used to compute the actual integer division using the
@@ -79,28 +67,19 @@ void divide(int64_t *array, size_t size, int64_t divisor)
 }
 ```

-# API reference
+## API reference

-* [C API](https://github.com/ridiculousfish/libdivide/blob/master/doc/C-API.md)
-* [C++ API](https://github.com/ridiculousfish/libdivide/blob/master/doc/CPP-API.md)
+* [C API](doc/C-API.md)
+* [C++ API](doc/CPP-API.md)
 * [Macro Invariant Division](constant_fast_div.h)
 * [Template Based Invariant Division](constant_fast_div.hpp)

-# Branchfull vs branchfree
+## Branchfull vs branchfree

 The default libdivide divider makes use of
-[branches](https://en.wikipedia.org/wiki/Branch_(computer_science)) to compute the integer
-division. When the same divider is used inside a hot loop as in the C++ example section the
-CPU will accurately predict the branches and there will be no performance slowdown. Often
-the compiler is even able to move the branches outside the body of the loop hence
-completely eliminating the branches, this is called loop-invariant code motion.
-
-libdivide also has a branchfree divider type which computes the integer division without
-using any branch instructions. The branchfree divider generally uses a few more instructions
-than the default branchfull divider. The main use case for the branchfree divider is when
-you have an array of different divisors and you need to iterate over the divisors. In this
-case the default branchfull divider would exhibit poor performance as the CPU won't be
-able to correctly predict the branches.
+[branches](https://en.wikipedia.org/wiki/Branch_(computer_science)) to compute the integer division. When the same divider is used inside a hot loop as in the C++ example section the CPU will accurately predict the branches and there will be no performance slowdown. Often the compiler is even able to move the branches outside the body of the loop hence completely eliminating the branches, this is called loop-invariant code motion.
+
+libdivide also has a branchfree divider type which computes the integer division without using any branch instructions. The branchfree divider generally uses a few more instructions than the default branchfull divider. The main use case for the branchfree divider is when you have an array of different divisors and you need to iterate over the divisors. In this case the default branchfull divider would exhibit poor performance as the CPU won't be able to correctly predict the branches.

 ```C++
 #include "libdivide.h"
@@ -124,14 +103,12 @@ Caveats of branchfree divider:
 * Unsigned branchfree divider cannot be ```1```
 * Faster for unsigned types than for signed types

-# Vector division
+## Vector division

 libdivide supports [SSE2](https://en.wikipedia.org/wiki/SSE2),
 [AVX2](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) and
 [AVX512](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions)
-vector division on x86 and x64 CPUs. In the example below we divide the packed 32-bit
-integers inside an AVX512 vector using libdivide. libdivide supports 32-bit and 64-bit
-vector division for both signed and unsigned integers.
+vector division on x86 and x64 CPUs. In the example below we divide the packed 32-bit integers inside an AVX512 vector using libdivide. libdivide supports 32-bit and 64-bit vector division for both signed and unsigned integers.

 ```C++
 #include "libdivide.h"
@@ -153,7 +130,7 @@ Note that you need to define one of macros below to enable vector division:
 * ```LIBDIVIDE_AVX512```
 * ```LIBDIVIDE_NEON```

-# Performance tips
+## Performance Tips

 * If possible use unsigned integer types because libdivide's unsigned division is measurably
 faster than its signed division. This is especially true for the branchfree divider.
@@ -165,34 +142,23 @@ Note that you need to define one of macros below to enable vector division:
 currently no vector multiplication instructions on x86 to efficiently calculate
 64-bit * 64-bit to 128-bit.

-# Build instructions
+## Build instructions

-libdivide has one test program and two benchmark programs which can be built using cmake and
-a recent C++ compiler that supports C++11 or later. Optionally ```libdivide.h``` can also be
-installed to ```/usr/local/include```.
+libdivide has one test program and two benchmark programs which can be built using cmake and a recent C++ compiler that supports C++11 or later. Optionally ```libdivide.h``` can also be installed to ```/usr/local/include```.

 ```bash
 cmake .
 make -j
 sudo make install
 ```

-# Tester program
+## Tester program

-You can pass the **tester** program one or more of the following arguments: ```u32```,
-```s32```, ```u64```, ```s64``` to test the four cases (signed, unsigned, 32-bit, or 64-bit), or
-run it with no arguments to test all four. The tester will verify the correctness of libdivide
-via a set of randomly chosen numerators and denominators, by comparing the result of libdivide's
-division to hardware division. It will stop with an error message as soon as it finds a
-discrepancy.
+You can pass the **tester** program one or more of the following arguments: ```u32```, ```s32```, ```u64```, ```s64``` to test the four cases (signed, unsigned, 32-bit, or 64-bit), or run it with no arguments to test all four. The tester will verify the correctness of libdivide via a set of randomly chosen numerators and denominators, by comparing the result of libdivide's division to hardware division. It will stop with an error message as soon as it finds a discrepancy.

-# Benchmark program
+## Benchmark program

-You can pass the **benchmark** program one or more of the following arguments: ```u16```, ```s16```, ```u32```,
-```s32```, ```u64```, ```s64``` to compare libdivide's speed against hardware division.
-**benchmark** tests a simple function that inputs an array of random numerators and a single
-divisor, and returns the sum of their quotients. It tests this using both hardware division, and
-the various division approaches supported by libdivide, including vector division.
+You can pass the **benchmark** program one or more of the following arguments: ```u16```, ```s16```, ```u32```, ```s32```, ```u64```, ```s64``` to compare libdivide's speed against hardware division. **benchmark** tests a simple function that inputs an array of random numerators and a single divisor, and returns the sum of their quotients. It tests this using both hardware division, and the various division approaches supported by libdivide, including vector division.

 It will output data like this:

@@ -207,9 +173,7 @@ It will output data like this:
 ...
 ```

-It will keep going as long as you let it, so it's best to stop it when you are happy with the
-denominators tested. These columns have the following significance. All times are in
-nanoseconds, lower is better.
+It will keep going as long as you let it, so it's best to stop it when you are happy with the denominators tested. These columns have the following significance. All times are in nanoseconds, lower is better.

 ```
 #: The divisor that is tested
@@ -222,10 +186,9 @@ vec_bf: libdivide time, using vector branchfree division
 algo: The algorithm used.
 ```

-The **benchmark** program will also verify that each function returns the same value,
-so benchmark is valuable for its verification as well.
+The **benchmark** program will also verify that each function returns the same value, so benchmark is valuable for its verification as well.

-# Contributing
+## Contributing

 Although there are no individual unit tests, the supplied ```cmake``` builds do include several safety nets:

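The README text in this diff says libdivide replaces a division with a sequence of shift, add and multiply instructions. As a rough illustration of that idea only, the sketch below divides a 32-bit unsigned integer by the fixed constant 10 using one multiplication and one shift. It is not libdivide's implementation: libdivide derives its constants at runtime from the divisor, whereas `div10_mulshift` and `div10_matches_hardware` are hypothetical names with a hard-coded magic number.

```cpp
#include <cassert>
#include <cstdint>

// Divide by the constant 10 without a division instruction.
// 0xCCCCCCCD equals ceil(2^35 / 10); the 64-bit product cannot overflow
// because both factors are below 2^32.
inline uint32_t div10_mulshift(uint32_t n) {
    const uint64_t magic = 0xCCCCCCCDULL;
    return static_cast<uint32_t>((static_cast<uint64_t>(n) * magic) >> 35);
}

// Compare against hardware division on boundary values and a strided sweep
// of the 32-bit range (the stride is an arbitrary prime).
inline bool div10_matches_hardware() {
    const uint32_t edge_cases[] = {0u, 1u, 9u, 10u, 11u, 99u, 100u,
                                   4294967289u, 4294967290u, 4294967295u};
    for (uint32_t n : edge_cases) {
        if (div10_mulshift(n) != n / 10u) return false;
    }
    for (uint64_t n = 0; n <= 0xFFFFFFFFULL; n += 65521) {
        const uint32_t v = static_cast<uint32_t>(n);
        if (div10_mulshift(v) != v / 10u) return false;
    }
    return true;
}
```

The comparison against hardware division follows the same approach the README describes for the **tester** program, on a much smaller scale.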
doc/RELEASE.md

+13
@@ -0,0 +1,13 @@
+## How to do a new libdivide release
+
+Releases are semi-automated using GitHub Actions:
+
+1. Manually run the [Create draft release](https://github.com/ridiculousfish/libdivide/actions/workflows/prepare_release.yml) workflow/action.
+    * Choose the branch to release from (usually ```master```) and the release type (based on [Semantic Versioning](https://semver.org/))
+    * The action will do some codebase housekeeping and create a draft release:
+        * Creates a new commit with updated version numbers in ```libdivide.h```, ```CMakeLists.txt```, ```library.properties```.
+        * Creates a draft Git tag of format vX.Y.Z.
+2. Once the action is complete, follow the output link in the action summary to the generated draft release. E.g. ![image](https://github.com/user-attachments/assets/7e8393f7-f204-4b3a-af37-de5e187479dc)
+3. Edit the generated release notes as needed & publish
+
+Note that PRs with the ```ignore-for-release``` label are excluded from the generated release notes.

libdivide.h

+3-1
@@ -11,9 +11,11 @@
 #ifndef LIBDIVIDE_H
 #define LIBDIVIDE_H

-#define LIBDIVIDE_VERSION "5.1"
+// *** Version numbers are auto generated - do not edit ***
+#define LIBDIVIDE_VERSION "5.1.0"
 #define LIBDIVIDE_VERSION_MAJOR 5
 #define LIBDIVIDE_VERSION_MINOR 1
+#define LIBDIVIDE_VERSION_PATCH 0

 #include <stdint.h>

test/avr/readme.md

+7-7
@@ -10,12 +10,12 @@

 ## Running the Test program

-The test program is in the 'megaatmega2560_Test' environment.
+The test program is the 'megaatmega2560_sim_unittest' environment.

-To run the test program in a simulator:
-1. On the activity bar, select PlatformIO
-2. Run Project Tasks -> megaatmega2560_Test -> Custom -> Simulate
-    a. This will build the test program & launch it in the simulator (this might download )supporting packages)
-    b. **NOTE** Once running it can take a **long** time for ouput to appear in the terminal. **Be patient**
-    * Or copy the simavr command line from the terminal to a command prompt (or another vscode terminal)
+To run the test program in a simulator (no hardware required!):
+
+1. On the activity bar, select PlatformIO
+2. Run Project Tasks -> megaatmega2560_sim_unittest -> Advanced -> Test
+    1. This will build the test program & launch it in the simulator (this might download supporting packages)
+    2. **NOTE** Once running it can take a **long** time for output to appear in the terminal. **Be patient**

test/tester.cpp

+1
@@ -100,6 +100,7 @@ extern "C" int main(int argc, char *argv[]) {
         }
     }

+    std::cout << "Testing libdivide v" << LIBDIVIDE_VERSION << std::endl;
     std::string vecTypes = "";
 #if defined(LIBDIVIDE_SSE2)
     vecTypes += "sse2 ";
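With `LIBDIVIDE_VERSION_PATCH` now defined next to the major and minor components, downstream code could gate on a minimum version at compile time rather than only printing the string as the tester does. The sketch below is an assumption about how that might look, not something the commit adds. The `#ifndef` block holds stand-in values copied from the `libdivide.h` diff so the example compiles on its own, and `EXAMPLE_VERSION_NUMBER` and `version_from_components` are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

// Stand-ins so the sketch builds without libdivide.h; in real code,
// include the header instead. Values are copied from the libdivide.h diff.
#ifndef LIBDIVIDE_VERSION
#define LIBDIVIDE_VERSION "5.1.0"
#define LIBDIVIDE_VERSION_MAJOR 5
#define LIBDIVIDE_VERSION_MINOR 1
#define LIBDIVIDE_VERSION_PATCH 0
#endif

// Hypothetical helper (not provided by libdivide): pack the components into
// one comparable integer, assuming minor and patch stay below 100.
#define EXAMPLE_VERSION_NUMBER(maj, min, pat) ((maj) * 10000 + (min) * 100 + (pat))

// Fail the build early if the header is older than the version we rely on.
static_assert(EXAMPLE_VERSION_NUMBER(LIBDIVIDE_VERSION_MAJOR,
                                     LIBDIVIDE_VERSION_MINOR,
                                     LIBDIVIDE_VERSION_PATCH) >=
                  EXAMPLE_VERSION_NUMBER(5, 1, 0),
              "libdivide 5.1.0 or newer is required");

// Rebuild the dotted string from the numeric components, which gives a
// consistency check against the embedded LIBDIVIDE_VERSION string.
inline std::string version_from_components() {
    return std::to_string(LIBDIVIDE_VERSION_MAJOR) + "." +
           std::to_string(LIBDIVIDE_VERSION_MINOR) + "." +
           std::to_string(LIBDIVIDE_VERSION_PATCH);
}
```

Comparing `version_from_components()` with `LIBDIVIDE_VERSION` in a test would catch a release where the string and the numeric macros were updated inconsistently.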
