Replicate paper CNN using PyTorch

The final nn.Softmax layer is likely wrong if you are using nn.CrossEntropyLoss or nn.NLLLoss.
The former expects raw logits, so remove that activation; the latter expects log probabilities, so use nn.LogSoftmax instead.
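Here is a minimal sketch of both options; the layer sizes, batch size, and class count are placeholders, not taken from your model:

```python
import torch
import torch.nn as nn

# Hypothetical classifier head: 128 input features, 10 classes.
head = nn.Linear(128, 10)

x = torch.randn(4, 128)
targets = torch.randint(0, 10, (4,))
logits = head(x)  # raw logits, no activation applied

# Option 1: nn.CrossEntropyLoss applies log-softmax internally, so feed raw logits.
ce_loss = nn.CrossEntropyLoss()(logits, targets)

# Option 2: nn.NLLLoss expects log probabilities, so apply nn.LogSoftmax first.
log_probs = nn.LogSoftmax(dim=1)(logits)
nll_loss = nn.NLLLoss()(log_probs, targets)

# Both formulations yield the same loss value.
print(torch.allclose(ce_loss, nll_loss))  # True
```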

The difference in spatial size could come from a different padding setup in your conv layers.
The quoted section also doesn't mention the stride of the pooling layers, which likewise influences the activation shape.
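For illustration, a small sketch of how padding and pooling stride change the output shape; the 32x32 input and channel counts are assumptions, not values from the paper:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)  # assumed input size

# Output spatial size: floor((H + 2*padding - kernel_size) / stride) + 1
conv_no_pad = nn.Conv2d(3, 16, kernel_size=3, padding=0)  # 32 -> 30
conv_same   = nn.Conv2d(3, 16, kernel_size=3, padding=1)  # 32 -> 32 ("same" padding)
print(conv_no_pad(x).shape)  # torch.Size([1, 16, 30, 30])
print(conv_same(x).shape)    # torch.Size([1, 16, 32, 32])

# nn.MaxPool2d defaults stride to kernel_size; an explicit stride changes the downsampling.
pool_default = nn.MaxPool2d(kernel_size=2)             # stride=2: 32 -> 16
pool_stride1 = nn.MaxPool2d(kernel_size=2, stride=1)   # stride=1: 32 -> 31
print(pool_default(x).shape)  # torch.Size([1, 3, 16, 16])
print(pool_stride1(x).shape)  # torch.Size([1, 3, 31, 31])
```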
