This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[Vote] Softmax and Loss Convention #434

Closed
tqchen opened this issue Oct 30, 2015 · 9 comments

@tqchen
Member

tqchen commented Oct 30, 2015

Following the discussion in #426, we agreed that in all cases the multi-output and single-output softmax can be combined into one class, possibly overloaded by the shape of the label.

  • _Option 1_
    • SoftmaxOutput is for output, with no gradient attached.
    • After a loss is attached, XXXOutput can backprop the gradient of that loss, while its forward behavior remains unchanged.
    • Softmax remains the same, with the loss already attached.
    • May need to introduce attach_loss, and maybe a special loss class.
  • _Option 2_
    • SoftmaxOutput is for output, with the specific backward behavior of cross-entropy loss.
      • As an alternative, a task-based name instead of a math-based one: MulticlassProbOutput, to be clear to the user.
    • Softmax behaves normally (takes only input) and can prop gradient back from any output source, so it can be composed as an internal node.
    • CrossEntropyLoss can be composed with anything, including Softmax, to get the loss in forward and the gradient in backward.
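
To make the difference in Option 2 concrete, here is a minimal pure-Python sketch, not the actual operator code. All function names below are hypothetical and chosen for illustration only. It assumes a single example with an integer class label, and shows that a plain softmax composed with a separate cross-entropy loss yields the same input gradient as an output op whose backward is hard-wired to cross-entropy (probabilities minus the one-hot label).

```python
import math


def softmax(logits):
    """Forward pass of a plain softmax over one example (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_backward(probs, grad_out):
    """Generic softmax backward: accepts any upstream gradient w.r.t. the probabilities.

    dL/dz_i = p_i * (g_i - sum_j g_j * p_j)
    """
    dot = sum(g * p for g, p in zip(grad_out, probs))
    return [p * (g - dot) for p, g in zip(probs, grad_out)]


def cross_entropy(probs, label):
    """Forward pass of a standalone cross-entropy loss for an integer label."""
    return -math.log(probs[label])


def cross_entropy_backward(probs, label):
    """Gradient of the loss w.r.t. the probabilities: -1/p_label at the label, 0 elsewhere."""
    return [-1.0 / probs[label] if i == label else 0.0 for i in range(len(probs))]


def softmax_output_backward(probs, label):
    """Fused backward of a hypothetical output op: probabilities minus one-hot label."""
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]


logits = [2.0, 1.0, 0.1]
label = 0

probs = softmax(logits)
loss = cross_entropy(probs, label)

# Composed path: Softmax as an internal node, followed by a separate loss.
composed = softmax_backward(probs, cross_entropy_backward(probs, label))

# Fused path: an output op that ignores upstream gradient and assumes cross-entropy.
fused = softmax_output_backward(probs, label)
```

The two gradients agree up to floating-point error. The fused form avoids the division by `p_label`, which is one reason a dedicated output op can be attractive, while the composed form lets the softmax sit anywhere inside a larger graph.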

Please edit this post to add more options.

@tqchen
Member Author

tqchen commented Oct 30, 2015

I vote for 2

@piiswrong
Contributor

I vote for 2 with the following name change:
SoftmaxOutput -> SoftmaxLoss
Softmax -> activation op with option "softmax", since it's really just an activation

@antinucleon
Contributor

+1 for 2

@tqchen
Member Author

tqchen commented Oct 31, 2015

@pluskid @mli

@mli
Member

mli commented Oct 31, 2015

+1 for 2

@pluskid
Contributor

pluskid commented Oct 31, 2015

Sorry, I'm currently traveling. I think 1 is a more unified behavior, but it requires more changes to the internals (e.g. attach-loss etc.). I'm generally OK with either design.

@tqchen
Member Author

tqchen commented Nov 1, 2015

#444 and #450 apply the changes for Python and R.

@tqchen
Member Author

tqchen commented Nov 1, 2015

@hjk41 can you make a Windows build for the most recent version?

@tqchen
Member Author

tqchen commented Nov 2, 2015

all sides updated

@tqchen tqchen closed this as completed Nov 2, 2015
5 participants