An attempt to combine dense, depthwise and groupwise conv through DenseConvDims #146

Open
wants to merge 2 commits into
base: master

@arhik arhik commented Nov 30, 2019

With JuliaGPU/CuArrays.jl#521 (comment) and JuliaGPU/CuArrays.jl#523 (comment), we can define depthwise and groupwise convolutions using DenseConvDims (though the name will become confusing then).

  • I still need to make the corresponding changes in the direct and im2col backend code.

@staticfloat Please feel free to comment on this. I am open to any changes or suggestions, and I will take your changes in #142 into account.

@staticfloat
Contributor

Cool @arhik! To make it easier for you, I just merged #142 so that you can more easily rebase.

Here are my thoughts:

  • We don't need to call this DenseConvDims anymore; it's just ConvDims at this point. :)

  • It would be helpful to have a succinct description of what to do when groupcount is not equal to 1 or input_channels(). E.g. what does groupcount == 7 look like? If you can succinctly describe what should happen, I can help you adjust the implementations of the direct and im2col implementations.

  • Can we make use of NNPACK at all?

  • We're going to need some more tests. :)

  @eval begin
      function $(Symbol("∇conv_filter$(backend)"))(
                      x::AbstractArray{xT,N}, dy::AbstractArray{yT,N},
                      cdims::ConvDims; kwargs...) where {xT, yT, N}
-         dw = similar(dy, kernel_size(cdims)..., channels_in(cdims),
+         dw = similar(dy, kernel_size(cdims)..., div(channels_in(cdims), group_count(cdims)),
Author

@arhik arhik Dec 8, 2019

It would be helpful to have a succinct description of what to do when groupcount is not equal to 1 or input_channels(). E.g. what does groupcount == 7 look like? If you can succinctly describe what should happen, I can help you adjust the implementations of the direct and im2col implementations.

@staticfloat This is the only change: the weight array shrinks along its third axis by a factor of groupcount, as shown here. If groupcount == 1, nothing changes. If groupcount == 2 (say), one group of weights operates on only half of the input channels and produces only one output channel. These weight groups have to be tiled across all the input channels, and we keep adding new groups of weights (each covering the input channels in blocks) until the number of outputs matches the number of output channels.
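
To make the shape change concrete, here is a small sketch of the weight dimensions described above. `grouped_weight_size` is a hypothetical helper, not an NNlib function, and it assumes the usual (kernel..., channels_in, channels_out) weight layout:

```julia
# Hypothetical helper, not part of NNlib: size of the weight array for a
# grouped convolution, assuming the layout (kernel..., channels_in, channels_out).
function grouped_weight_size(kernel::Tuple, channels_in::Integer,
                             channels_out::Integer, groupcount::Integer)
    # Only the input-channel axis shrinks, by a factor of groupcount.
    return (kernel..., div(channels_in, groupcount), channels_out)
end

grouped_weight_size((3, 3), 4, 8, 1)  # dense case: (3, 3, 4, 8)
grouped_weight_size((3, 3), 4, 8, 2)  # two groups: (3, 3, 2, 8)
grouped_weight_size((3, 3), 4, 4, 4)  # one input channel per group: (3, 3, 1, 4)
```

With groupcount == 1 this reduces to the dense case, and with groupcount equal to the number of input channels each group sees a single channel, which is the depthwise case.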

Author

When groupcount == 7, we should first make sure the input channels can be divided exactly into 7 groups. Then we should check that the number of output channels is a multiple of groupcount (since one group can only produce one output).
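
A rough sketch of those two checks, assuming they would run when the ConvDims object is constructed. `check_groups` is a hypothetical name and is not existing NNlib API:

```julia
# Hypothetical validation helper, not part of NNlib.
function check_groups(channels_in::Integer, channels_out::Integer, groupcount::Integer)
    groupcount >= 1 ||
        throw(ArgumentError("groupcount must be positive, got $(groupcount)"))
    # Input channels must split evenly into groupcount blocks.
    channels_in % groupcount == 0 ||
        throw(DimensionMismatch("input channels ($(channels_in)) are not divisible by groupcount ($(groupcount))"))
    # Output channels must be a multiple of groupcount.
    channels_out % groupcount == 0 ||
        throw(DimensionMismatch("output channels ($(channels_out)) are not a multiple of groupcount ($(groupcount))"))
    return true
end

check_groups(14, 21, 7)    # passes: 14 = 7 * 2 and 21 = 7 * 3
# check_groups(15, 21, 7)  # would throw: 15 input channels cannot be split into 7 groups
```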

Author

We should then distribute these 7 groups so that each operates on a different block of input channels, in blocks of size div(input_channels(), 7).
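
As an illustration of that block assignment, the sketch below computes which input channels each group would see. `group_channel_range` is a hypothetical helper with 1-based group indices, and it assumes the channel count has already been checked for divisibility:

```julia
# Hypothetical helper, not part of NNlib: the contiguous block of input
# channels that group g (1-based) operates on.
function group_channel_range(channels_in::Integer, groupcount::Integer, g::Integer)
    block = div(channels_in, groupcount)      # channels per group
    return ((g - 1) * block + 1):(g * block)
end

# With 14 input channels and groupcount == 7, each group gets 2 channels:
# group 1 sees 1:2, group 2 sees 3:4, and group 7 sees 13:14.
ranges = [group_channel_range(14, 7, g) for g in 1:7]
```

A backend could then slice the input with such a range along the channel axis and run the existing dense kernel on each slice, although whether the direct and im2col backends should do it this way is an open design question.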

…and removing separate implementations of Depthwise and Groupwise.
@ToucheSir
Member

Can this PR be moved to https://github.com/FluxML/NNlibCUDA.jl, or has it diverged too much to be workable there?


3 participants