debug filterShapes and match network to citation #158
Conversation
If widenFactor is greater than 1, the input shape of the first residual block had the wrong number of input channels. Similarly, the same filterShape cannot be used for both conv1 and conv2 when the number of channels is being increased.

Added identity connections, dropout between convolutions, and preactivation in the 1x1 conv2D shortcuts.

The paper also uses a weight decay of 5e-4. I can add a function for this, but I haven't seen an implementation of L2 decay in any of the models, so I wondered whether there is a plan to add it somewhere else in the API.
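To make the shape fix concrete, here is a hedged sketch of a corrected block (type and parameter names like `WideBasicBlock` are illustrative, not necessarily the ones in this PR): conv1 maps inChannels to outChannels, conv2 then uses outChannels on both sides of its filterShape, and the 1x1 shortcut sees the preactivated input whenever the channel count or stride changes.

```swift
import TensorFlow

struct WideBasicBlock: Layer {
    var bn1: BatchNorm<Float>
    var conv1: Conv2D<Float>
    var dropout: Dropout<Float>
    var bn2: BatchNorm<Float>
    var conv2: Conv2D<Float>
    var shortcut: Conv2D<Float>
    @noDerivative let useProjection: Bool

    init(inChannels: Int, outChannels: Int, stride: Int, dropProbability: Double = 0.3) {
        bn1 = BatchNorm(featureCount: inChannels)
        // conv1 changes the channel count, so its filterShape differs from conv2's.
        conv1 = Conv2D(filterShape: (3, 3, inChannels, outChannels),
                       strides: (stride, stride), padding: .same)
        dropout = Dropout(probability: dropProbability)
        bn2 = BatchNorm(featureCount: outChannels)
        // conv2 stays at outChannels on both sides of its filter.
        conv2 = Conv2D(filterShape: (3, 3, outChannels, outChannels),
                       strides: (1, 1), padding: .same)
        // 1x1 projection shortcut, used only when the shape actually changes.
        useProjection = stride != 1 || inChannels != outChannels
        shortcut = Conv2D(filterShape: (1, 1, inChannels, outChannels),
                          strides: (stride, stride), padding: .same)
    }

    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        // Preactivation: BN + ReLU before each conv; the same preactivated
        // tensor feeds the 1x1 shortcut when projecting.
        let preact = relu(bn1(input))
        var out = conv1(preact)
        out = conv2(relu(bn2(dropout(out))))
        var residual = input
        if useProjection { residual = shortcut(preact) }
        return out + residual
    }
}
```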
Thanks for debugging this and adding dropout! I was working from this implementation for reference; is that where I missed the identity layer? Re weight decay: I haven't seen a pattern yet for defining model-specific optimizer values, so if you have any suggestions or ideas, they're certainly welcome!
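One pattern that avoids touching the optimizers entirely is to fold the penalty into the loss. A minimal sketch, assuming a hypothetical `TinyModel` whose two conv filters are penalized explicitly (a real version would walk every parameter tensor):

```swift
import TensorFlow

// Stand-in model; only its two conv filters are penalized below.
struct TinyModel: Layer {
    var conv1 = Conv2D<Float>(filterShape: (3, 3, 3, 16), padding: .same, activation: relu)
    var conv2 = Conv2D<Float>(filterShape: (3, 3, 16, 16), padding: .same, activation: relu)
    var pool = GlobalAvgPool2D<Float>()
    var dense = Dense<Float>(inputSize: 16, outputSize: 10)

    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        dense(pool(conv2(conv1(input))))
    }
}

var model = TinyModel()
let images = Tensor<Float>(randomNormal: [8, 32, 32, 3])
let labels = Tensor<Int32>(zeros: [8])

let (loss, grads) = valueWithGradient(at: model) { model -> Tensor<Float> in
    let logits = model(images)
    let crossEntropy = softmaxCrossEntropy(logits: logits, labels: labels)
    // L2 penalty with the paper's 5e-4 coefficient, added to the loss.
    let l2 = (model.conv1.filter * model.conv1.filter).sum()
           + (model.conv2.filter * model.conv2.filter).sum()
    return crossEntropy + 5e-4 * l2
}
```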
I just spent a couple of days trying to train the network. Ultimately I discovered a bug in _vjpRsqrt, which is called in the gradient calculation of the BatchNorm layer. Now it looks like the changes to the autodifferentiation code (possibly related to the removal of CotangentVector) are breaking differentiableReduce. Anyway, I was able to work around it by manually iterating through the blocks of the network instead.
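The workaround looks roughly like this, a sketch with hypothetical Dense blocks standing in for the residual blocks: store each block as its own property and chain the calls by hand so the differentiableReduce path is never exercised.

```swift
import TensorFlow

// Each block is a stored property rather than an element of a [Layer] array,
// so no differentiableReduce call is needed in the forward pass.
struct ManuallyUnrolled: Layer {
    var block1 = Dense<Float>(inputSize: 8, outputSize: 8, activation: relu)
    var block2 = Dense<Float>(inputSize: 8, outputSize: 8, activation: relu)
    var block3 = Dense<Float>(inputSize: 8, outputSize: 8, activation: relu)

    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        // Equivalent in effect to: blocks.differentiableReduce(input) { $1($0) }
        block3(block2(block1(input)))
    }
}
```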
I’ll take a look at the differentiableReduce breakage tonight.
This pull request runs with differentiableReduce as updated by Brett. The crash I was seeing was because I had added an L2 loss locally. It also appears to be related to differentiableReduce and the updated autodiff code, but I will try to create a minimal reproduction and file the bug separately.
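A hedged guess at the shape of such a reproduction, combining differentiableReduce over an array of layers with an L2 term over the same layers (not the actual test case):

```swift
import TensorFlow

// Differentiate through differentiableReduce while the loss also reads
// the layers' parameters directly, as an L2 penalty does.
let layers = [Dense<Float>(inputSize: 4, outputSize: 4, activation: relu),
              Dense<Float>(inputSize: 4, outputSize: 4)]
let input = Tensor<Float>(randomNormal: [1, 4])

let grads = gradient(at: layers) { layers -> Tensor<Float> in
    let output = layers.differentiableReduce(input) { $1($0) }
    let l2 = layers.differentiableReduce(Tensor<Float>(0)) {
        $0 + ($1.weight * $1.weight).sum()
    }
    return output.sum() + 5e-4 * l2
}
print(grads)
```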
I have created TF-533 for the differentiableReduce bug mentioned above.
…orflow#158)
* Moving the tests for TensorGroup from the swift repo to swift-apis.
* Fix indentation and whitespace issues.
Replaced by #193.