integration with Flux #54
Briefly speaking, although the DataLoader can provide behavior similar to that in PyTorch and is easy to use, I think MLDataUtils is more flexible and sophisticated for this purpose (cc @oxinabox)
Yep, it would be nice to use MLDataUtils. Last time I checked, though, the packages there were not actively maintained and there were a lot of interdependencies that were difficult to disentangle. Also, I was not sure it was tailored to deep learning needs; it seemed more geared towards dataframes. How do I achieve the same functionality as DataLoader with MLDataUtils?
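For concreteness, the DataLoader behavior in question is roughly the following. This is a minimal sketch, assuming Flux's DataLoader API from around the time of this thread; the data here is made up:

```julia
using Flux

# Stand-in data: 100 MNIST-like observations in WHCN layout
X = rand(Float32, 28, 28, 1, 100)
y = Flux.onehotbatch(rand(0:9, 100), 0:9)

# DataLoader yields (x, y) mini-batches, reshuffled each epoch
loader = Flux.Data.DataLoader(X, y, batchsize=32, shuffle=true)
for (xb, yb) in loader
    @assert size(xb, 4) <= 32   # the last batch may be smaller
end
```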
Here's a sample from my project code that uses MLDataUtils; it's not ready/well-organized for review...

```julia
# ...
train_X = BatchView(train_X, batch_size, ObsDim.Last())  # batch over the last dim (WHCN layout)
train_Y = BatchView(train_Y, batch_size, ObsDim.Last())
# ...
if use_gpu
model = gpu(model)
train_X = mappedarray(CuArray, train_X)
train_Y = mappedarray(CuArray, train_Y)
end
# ...
Flux.train!(loss, params(model), zip(train_X, train_Y), opt; cb = evalcb)
```
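A note on the `mappedarray(CuArray, ...)` lines above: MappedArrays applies the function lazily on indexing, so each batch is moved to the GPU only when `train!` reads it, rather than the whole dataset being copied up front. A minimal CPU-only sketch of the same mechanism (the data and function here are made up):

```julia
using MappedArrays

batches = [rand(Float32, 4, 4) for _ in 1:3]   # stand-in for a BatchView
lazy = mappedarray(x -> 2 .* x, batches)       # no work done yet
first(lazy)                                    # the function runs only on access
```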
I have commented over at ..., which is the more appropriate place to have this conversation.
I am trying to produce an example of Augmentor and Flux cooperation so that we can put it in our docs. This is what I got:

```julia
using Augmentor, Flux, MappedArrays, MLDatasets, MLDataUtils, Statistics
n_instances = 50
n_epochs = 32
batch_size = 32
X = MNIST.traintensor(Float32, 1:n_instances)
y = Flux.onehotbatch(MNIST.trainlabels(1:n_instances), 0:9)
pl = ElasticDistortion(6, 6,
                       sigma=4,
                       scale=0.3,
                       iter=3,
                       border=true) |>
     ConvertEltype(Float32) |>
     Reshape(28, 28, 1)
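# The pipeline `pl` above: random elastic distortion for augmentation, then
# conversion to Float32 and a reshape to 28×28×1 so each image gets an
# explicit channel dimension for the Conv layers.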
outbatch(X) = Array{Float32}(undef, (28, 28, 1, nobs(X)))    # preallocate a WHCN output batch
augmentbatch((X, y)) = augmentbatch!(outbatch(X), X, pl), y  # augment the images, pass labels through
batches = mappedarray(augmentbatch,
                      batchview((X, y), maxsize=batch_size))
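# `batches` is lazy: `mappedarray` runs `augmentbatch` only when a batch is
# indexed, so every epoch re-applies the (random) distortions afresh.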
predict = Chain(Conv((3, 3), 1=>16, pad=(1,1), relu),
                MaxPool((2,2)),
                Conv((3, 3), 16=>32, pad=(1,1), relu),
                MaxPool((2,2)),
                Conv((3, 3), 32=>32, pad=(1,1), relu),
                MaxPool((2,2)),
                flatten,
                Dense(288, 10))
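# Shape check: 28×28 → 14×14 → 7×7 → 3×3 after the three 2×2 max-pools,
# so `flatten` yields 3*3*32 = 288 features, matching Dense(288, 10).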
loss(X, y) = Flux.Losses.logitcrossentropy(predict(X), y)
loss(batches) = mean(b -> loss(b...), batches)
opt = Flux.Optimise.ADAM(0.001)
@info loss(batches)
Flux.@epochs n_epochs Flux.train!(loss, params(predict), batches, opt)
@info loss(batches)
```

Do you see any flaws? Should I change something?
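One refinement a reader might consider (it is not discussed in the thread): the batches above always come in the same order. The observations could be shuffled once before batching with MLDataUtils' `shuffleobs`, e.g.:

```julia
# Shuffle the observations once before batching (shuffleobs is from MLDataUtils)
batches = mappedarray(augmentbatch,
                      batchview(shuffleobs((X, y)), maxsize=batch_size))
```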
LGTM, but I haven't used Flux for months, so there might be some updates that @CarloLucibello knows about. For MNIST, perhaps you might want to use:

```julia
MNIST.traintensor(Float32, 1:10)  # (28, 28, 10) Float32 Array
MNIST.trainlabels(1:10)           # length-10 vector
```
Thanks! I updated the example.
Can we close this, considering this example has been added to the docs? |
close in favor of #102 |
Hi @Evizero @johnnychen94,
I was wondering how to integrate Augmentor in a Flux pipeline. I can think of two options:

1. a `transforms` option taking an Augmentor pipeline, and having Augmentor as a Flux dependency;
2. …

I think the first option would make for a simpler and more streamlined user experience. What do you think?
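Purely as an illustration of option 1, the API might have looked like this; the `transforms` keyword is hypothetical and does not exist in Flux's DataLoader:

```julia
# Hypothetical design sketch: `transforms` is NOT a real DataLoader keyword.
pl = ElasticDistortion(6, 6) |> ConvertEltype(Float32) |> Reshape(28, 28, 1)
loader = Flux.Data.DataLoader(X, y, batchsize=32, shuffle=true, transforms=pl)
for (xb, yb) in loader
    # each xb would arrive already augmented by `pl`
end
```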
Is the example here https://github.com/Evizero/Augmentor.jl/blob/master/examples/mnist_knet.jl still a valid template?