@qubvel qubvel commented Oct 14, 2019

Add different swish implementations:

  1. Memory efficient swish
  • GPU memory friendly
  • Less computationally efficient while training
  • Not supported by torch.jit / torch.onnx
  2. Original swish (x * torch.sigmoid(x))
  • Less memory efficient
  • More computationally efficient while training
  • Model can be saved with torch.jit / torch.onnx

Default: memory efficient
The model's swish implementation can be changed with the .set_swish(memory_efficient=True/False) method
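
For illustration, here is a minimal sketch of how the two variants and the toggle could look. It is not necessarily the exact code in this PR: `SwishFunction` and `TinyNet` are hypothetical names used only for this example, and the memory saving is assumed to come from storing only the input and recomputing the sigmoid in the backward pass.

```python
import torch
from torch import nn


class SwishFunction(torch.autograd.Function):
    """Swish with a hand-written backward pass (illustrative sketch)."""

    @staticmethod
    def forward(ctx, x):
        # Save only the input; the sigmoid is recomputed in backward.
        ctx.save_for_backward(x)
        return x * torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(x)
        # d/dx [x * sigmoid(x)] = s * (1 + x * (1 - s))
        return grad_output * (s * (1 + x * (1 - s)))


class MemoryEfficientSwish(nn.Module):
    def forward(self, x):
        return SwishFunction.apply(x)


class Swish(nn.Module):
    """Original swish, built from ordinary differentiable ops."""

    def forward(self, x):
        return x * torch.sigmoid(x)


class TinyNet(nn.Module):
    """Hypothetical stand-in model, not part of the library."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 8)
        self._swish = MemoryEfficientSwish()  # default: memory efficient

    def set_swish(self, memory_efficient=True):
        # Swap the activation module in place.
        self._swish = MemoryEfficientSwish() if memory_efficient else Swish()

    def forward(self, x):
        return self._swish(self.fc(x))


model = TinyNet().eval()
example = torch.randn(2, 8)

# Switch to the original swish before tracing or exporting.
model.set_swish(memory_efficient=False)
traced = torch.jit.trace(model, example)
```

Both variants compute the same function in the forward pass, so switching between them does not require changing any weights; only the way gradients are obtained differs.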

@lukemelas lukemelas merged commit 8a5da1d into lukemelas:master Oct 15, 2019
@glenn-jocher

@qubvel thanks for the function! I've tried to implement this in our repo: https://github.com/ultralytics/yolov3, but I get worse results (lower mAP and higher loss) compared to a default Swish() class. Do you know why this might be? See ultralytics/yolov3#441 (comment)

@cswwp

cswwp commented Aug 5, 2020

@qubvel If I train with the memory efficient swish and then export model.pt with model.set_swish(memory_efficient=False) + torch.jit.trace(model, example), will it hurt the score?
