I am reproducing a network that was implemented in Caffe. The last layers of the network are
(Caffe) block (n) → BatchNorm → ReLU → SoftmaxWithLoss
I want to reproduce it in PyTorch using CrossEntropyLoss. Is it right to remove the ReLU layer before the loss, since CrossEntropyLoss already applies the (log-)softmax internally, i.e.
(Pytorch) block (n) → BatchNorm → CrossEntropyLoss
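For reference, here is a minimal sketch of what I have in mind. The feature size, `num_classes`, and the `nn.Linear` standing in for block (n) are just placeholders, not the actual layers of my network:

```python
import torch
import torch.nn as nn

num_classes = 10
batch_size = 4

head = nn.Sequential(
    nn.Linear(128, num_classes),   # placeholder for "block (n)"
    nn.BatchNorm1d(num_classes),   # BatchNorm, as in the Caffe model
    # no ReLU here: nn.CrossEntropyLoss applies LogSoftmax internally,
    # so the loss receives the raw (possibly negative) logits
)

criterion = nn.CrossEntropyLoss()

features = torch.randn(batch_size, 128)
targets = torch.randint(0, num_classes, (batch_size,))

logits = head(features)
loss = criterion(logits, targets)
loss.backward()
print(loss.item())
```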