set up documentation CI against Julia 1.x #52

Merged on Sep 3, 2020 (12 commits)
33 changes: 33 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,33 @@
name: Documentation

on:
  push:
    branches:
      - 'master'
      - 'release-'
    tags: '*'
  pull_request:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Cache artifacts
        uses: actions/cache@v1
        env:
          cache-name: cache-artifacts
        with:
          path: ~/.julia/artifacts
          key: ${{ runner.os }}-test-${{ env.cache-name }}-${{ hashFiles('**/Project.toml') }}
          restore-keys: |
            ${{ runner.os }}-test-${{ env.cache-name }}-
            ${{ runner.os }}-test-
            ${{ runner.os }}-
      - name: Install dependencies
        run: julia --project=docs/ -e 'using Pkg; Pkg.develop(PackageSpec(path=pwd())); Pkg.instantiate()'
      - name: Build and deploy
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }} # https://github.com/JuliaDocs/Documenter.jl/issues/1177
        run: julia --project=docs/ docs/make.jl
3 changes: 2 additions & 1 deletion docs/.gitignore
@@ -1,3 +1,4 @@
build/
site/
src/generated/*
src/democards
/Manifest.toml
18 changes: 18 additions & 0 deletions docs/Project.toml
@@ -0,0 +1,18 @@
[deps]
Augmentor = "02898b10-1f73-11ea-317c-6393d7073e15"
DemoCards = "311a05b2-6137-4a5a-b473-18580a3d38b5"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
FileIO = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549"
ImageCore = "a09fc81d-aa75-5fe9-8630-4744c3626534"
ImageDraw = "4381153b-2b60-58ae-a1ba-fd683676385f"
ImageMagick = "6218d12a-5da1-5696-b52f-db25d2ecc6d1"
ImageShow = "4e3cecfd-b093-5904-9786-8bbb286a6a31"
Images = "916415d5-f1e6-5110-898d-aaa5f9f070e0"
MLDatasets = "eb30cadb-4394-5ae3-aed4-317e484a6458"
MosaicViews = "e94cdb99-869f-56ef-bcf0-1ae2bcbe0389"
OffsetArrays = "6fe1bfb0-de20-5000-8ca7-80f57d26f881"
PaddedViews = "5432bcbf-9aad-5242-b902-cca2824c8663"

[compat]
DemoCards = "0.2"
Documenter = "0.24"
Binary file added docs/examples/examples/assets/mnist_elastic.gif
125 changes: 125 additions & 0 deletions docs/examples/examples/mnist_elastic.md
@@ -0,0 +1,125 @@
---
title: Elastic distortion to MNIST images
id: mnist_elastic
cover: assets/mnist_elastic.gif
---


In this example we are going to use Augmentor on the famous **MNIST database of handwritten
digits** [^MNIST1998] to reproduce the elastic distortions discussed in [^SIMARD2003].

It is worth pointing out that the way Augmentor implements distortions differs slightly from how
the authors of the paper describe them. There are a couple of reasons for this, most notably that
we want the parameters of our deformations to be independent of the size of the image they are
applied to. As a consequence, the parameter values specified in the paper are not 1-to-1
transferable to Augmentor.
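
As a rough illustration of this size independence, the sketch below applies one and the same
operation to two images of different sizes. The parameter values are arbitrary choices for
illustration, and `testpattern()` is the test image bundled with Augmentor rather than MNIST data.

```julia
using Augmentor

# One operation, defined without any reference to a concrete image size.
# The values for scale and sigma are arbitrary and only serve as an example.
op = ElasticDistortion(5, 5, scale = 0.2, sigma = 2)

large = testpattern()          # test image bundled with Augmentor
small = large[1:100, 1:100]    # a smaller crop of the same image

# The same operation object can be applied to both images.
out_large = augment(large, op)
out_small = augment(small, op)
```

The grid dimensions describe how coarse the displacement field is, not how many pixels it
covers, which is why the same `op` can be reused across image sizes.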

If the effects are sensible for the dataset, then applying elastic distortions can be a really
effective way to improve the generalization ability of the network. That said, our implementation
of [`ElasticDistortion`](@ref) has a lot of possible parameters to choose from. To that end, we
will introduce a simple strategy for interactively exploring the parameter space on our dataset of
interest.

## Loading the MNIST Training Set

In order to access and visualize the MNIST images we employ the help of two additional Julia
packages. In the interest of time and space we will not go into great detail about their
functionality. Feel free to click on their respective names to find out more information about the
utility they can provide.

- [Images.jl](https://github.com/JuliaImages/Images.jl) will provide us with the necessary tools
for working with image data in Julia.

- [MLDatasets.jl](https://github.com/JuliaML/MLDatasets.jl) has an MNIST submodule that offers a
convenience interface to read the MNIST database.

The function `MNIST.traintensor` returns the MNIST training images corresponding to the given
indices as a multi-dimensional array. These images are stored in the native horizontal-major
memory layout as a single floating point array, where all values are scaled to be between 0.0 and
1.0.

```@example mnist_elastic
using Images, MLDatasets
train_tensor = MNIST.traintensor()

summary(train_tensor)
```

This horizontal-major format is the standard way of utilizing this dataset for training machine
learning models. In this tutorial, however, we are more interested in working with the MNIST
images as actual Julia images in vertical-major layout, and as black digits on white background.

We can convert the "tensor" to a `Colorant` array using the provided function
`MNIST.convert2image`. This way, Julia knows we are dealing with image data and can tell
programming environments such as Jupyter how to visualize it. If you are working in the terminal,
you may want to use the package
[ImageInTerminal.jl](https://github.com/JuliaImages/ImageInTerminal.jl).

```@example mnist_elastic
train_images = MNIST.convert2image(train_tensor)

train_images[:,:,1] # show first image
```
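
To make the layout change more concrete, here is a minimal sketch of what such a conversion
amounts to conceptually, written in plain Julia on a random stand-in array. It is not the actual
implementation of `MNIST.convert2image`, and the variable names are made up for illustration.

```julia
# Stand-in for a small horizontal-major tensor: width x height x number of images.
tensor = rand(Float32, 28, 28, 3)

# Swap the first two dimensions to obtain vertical-major (height x width) images.
vertical = permutedims(tensor, (2, 1, 3))

# Invert the intensities so that digits appear dark on a light background.
inverted = 1 .- vertical
```

The real function additionally wraps the values in a `Colorant` element type, which is what lets
display environments render the array as an image.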

## Visualizing the Effects

Before applying an operation (or pipeline of operations) on some dataset to train a network, we
strongly recommend investing some time in selecting a decent set of hyperparameters for the
operation(s). A useful tool for tasks like this is the package
[Interact.jl](https://github.com/JuliaGizmos/Interact.jl). We will use this package to define a
number of widgets for controlling the parameters of our operation.

Note that while the code below only focuses on configuring the parameters of a single operation,
specifically [`ElasticDistortion`](@ref), it could also be adapted to tweak a whole pipeline. Take
a look at the corresponding section in [High-level Interface](@ref pipeline) for more information
on how to define and use a pipeline.
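
For orientation, a pipeline is built by chaining operations with `|>`. The following is a minimal
sketch with arbitrarily chosen operations and parameter values, applied to Augmentor's bundled
`testpattern()` image instead of MNIST data.

```julia
using Augmentor

# Chain several operations into one pipeline; all values are illustrative only.
pl = ElasticDistortion(6, 6, scale = 0.3, sigma = 2, iter = 3, border = true) |>
     Rotate(-10:10) |>      # rotate by a randomly chosen angle (in degrees)
     CropSize(28, 28)       # crop the center to 28 x 28 pixels

img = testpattern()         # test image bundled with Augmentor
out = augment(img, pl)      # apply the whole pipeline in one call
```

Any of the pipeline's parameters could be exposed as a widget in the same way as the parameters
of a single operation.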

The packages Interact and Reactive provide the capabilities to perform interactive visualisations
in a Jupyter notebook. The `@manipulate` macro turns the parameters of the loop into interactive
widgets.

```julia
using Augmentor, Interact, Reactive

@manipulate for
        unpaused = true,
        ticks = fpswhen(signal(unpaused), 5.),
        image_index = 1:100,
        grid_size = 3:20,
        scale = .1:.1:.5,
        sigma = 1:5,
        iterations = 1:6,
        free_border = true
    op = ElasticDistortion(grid_size, grid_size, # equal width & height
                           sigma = sigma,
                           scale = scale,
                           iter = iterations,
                           border = free_border)
    augment(train_images[:, :, image_index], op)
end
```

Executing the code above in a Jupyter notebook will result
in the following interactive visualisation. You can now
use the sliders to investigate the effects that different
parameters have on the MNIST training images.

!!! tip
    You should always use your **training** set to do this
    kind of visualisation (not the test set!). Otherwise
    you are likely to achieve overly optimistic (i.e. biased)
    results during training.

![interact](https://user-images.githubusercontent.com/10854026/30867456-4afe0800-a2dc-11e7-90eb-800b6ea025d0.gif)

Congratulations! With just a few lines of code, you
created a simple interactive tool to visualize your image
augmentation pipeline. Once you have found a set of parameters that
you think is appropriate for your dataset, you can go ahead
and train your model.

## References

[^MNIST1998]: LeCun, Yann, Corinna Cortes, Christopher J.C. Burges. ["The MNIST database of handwritten digits"](http://yann.lecun.com/exdb/mnist/) Website. 1998.

[^SIMARD2003]: Simard, Patrice Y., David Steinkraus, and John C. Platt. ["Best practices for convolutional neural networks applied to visual document analysis."](https://www.microsoft.com/en-us/research/publication/best-practices-for-convolutional-neural-networks-applied-to-visual-document-analysis/) ICDAR. Vol. 3. 2003.
12 changes: 12 additions & 0 deletions docs/examples/index.md
@@ -0,0 +1,12 @@
# [Tutorials](@id tutorials)

Here we provide several tutorials that show how Augmentor is used in practice.

{{{democards}}}

## References

[^MNIST1998]: LeCun, Yann, Corinna Cortes, Christopher J.C. Burges. ["The MNIST database of handwritten digits"](http://yann.lecun.com/exdb/mnist/) Website. 1998.

[^SIMARD2003]: Simard, Patrice Y., David Steinkraus, and John C. Platt. ["Best practices for convolutional neural networks applied to visual document analysis."](https://www.microsoft.com/en-us/research/publication/best-practices-for-convolutional-neural-networks-applied-to-visual-document-analysis/) ICDAR. Vol. 3. 2003.
147 changes: 0 additions & 147 deletions docs/exampleweaver.jl

This file was deleted.
