nn.Softmax is an nn.Module, which can be initialized e.g. in the __init__ method of your model and used in the forward.
torch.softmax() (I assume nn.softmax is a typo, as this function is undefined) and nn.functional.softmax are equal, and I would recommend sticking to nn.functional.softmax, since it's documented. @tom gives a better answer here.
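A minimal sketch of both interfaces (the layer sizes are made up for illustration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 3)        # made-up sizes
        self.softmax = nn.Softmax(dim=1)  # Module, created once in __init__

    def forward(self, x):
        return self.softmax(self.fc(x))   # used in forward

x = torch.randn(2, 10)
model = Net()

# The functional call gives the same result without creating a Module:
print(torch.allclose(model(x), F.softmax(model.fc(x), dim=1)))  # True
```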
1. If you write for re-use, the functional / Module split of PyTorch has turned out to be a good idea.
2. Use functional for stuff without state (unless you have a quick and dirty Sequential).
3. Never re-use modules (e.g. define one torch.nn.ReLU and use it 5 times). It's a trap! When doing analysis or quantization (where ReLU becomes stateful due to quantization params), this will break; see the sketch at the end of this post.
The latter two seem to cover your case here, but item 2 is more a matter of personal preference, while item 3 is really about writing better (as in less risky, clearer) code.
Item 1 is more for when you write a library or something you expect to be re-used a lot.
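A sketch of the trap in item 3 (layer sizes made up for illustration):

```python
import torch.nn as nn

# Risky: one ReLU instance shared by two call sites. Tools that attach
# per-module state (e.g. quantization observers) see a single child module
# where the network really has two activation sites.
class Shared(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 10)
        self.fc2 = nn.Linear(10, 10)
        self.relu = nn.ReLU()  # reused twice below

    def forward(self, x):
        return self.relu(self.fc2(self.relu(self.fc1(x))))

# Safer: one module instance per call site.
class Separate(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 10)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(10, 10)
        self.relu2 = nn.ReLU()

    def forward(self, x):
        return self.relu2(self.fc2(self.relu1(self.fc1(x))))
```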
Thank you for your answer, but I'm still not sure what the differences between them are.
So can I say that functional is for re-use? What do "stuff without state" and "quantization" mean? Which should I use, functional or Module?
Thank you for your kind answer. I understand the "state" now; I guess it refers to a weight in a neural network, right? Quantization is something I've never seen, so I need to read your reference. However, I'm a little bit confused, because I think there is no clear threshold for choosing between functional and Module. Maybe that's because I'm not a native English speaker.
I apologise for the inconvenience, but if I don't ask you this question, I will never know. Please excuse me for bothering you. I also wanted to quote what you mentioned, just like you did with mine, but I don't know how to do that, so I just wrote it down.
That is more "if you implement something new, provide both, because you don't know which is more convenient for your users".
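For example, a hypothetical new activation (the name and formula below are made up) following that convention would provide a stateless function plus a thin Module wrapper, the same way nn.Softmax wraps nn.functional.softmax:

```python
import torch
import torch.nn as nn

# Hypothetical new activation, following the PyTorch convention:
# a stateless functional form...
def my_activation(x: torch.Tensor) -> torch.Tensor:
    return x * torch.sigmoid(x)  # made-up formula, purely for illustration

# ...plus a thin Module wrapper that just calls it.
class MyActivation(nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return my_activation(x)
```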
I don't want to make an activation function. I just want to know the proper way to use them (softmax, ReLU, whatever it is). However, I guess there is no clear standard for it. Functional calls are not saved in the state of a neural network, so I should avoid them if I want to save them in the state_dict, right?
For softmax, and if you aren't building a Sequential, I'd use the functional interface.
Why would you do that? What's the benefit of using the functional interface instead of a Module?
I think about modules as holding state (e.g. weights, as you mention) and using that state plus the inputs I pass to produce the output. So for softmax, using the functional interface expresses how I think about it (i.e. as a function of the inputs alone, with no weights etc.).
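A quick check (made-up layer sizes) showing that softmax contributes no entries to the state_dict either way; only the weight-holding Linear does:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)  # holds state: weight and bias

    def forward(self, x):
        # softmax has no weights, so the functional call loses nothing
        return F.softmax(self.fc(x), dim=1)

print(Net().state_dict().keys())
# odict_keys(['fc.weight', 'fc.bias']) -- an nn.Softmax module would not
# have added any entries either, since it has no parameters or buffers
```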
But don't make it a science; just go with what you prefer.