Hi,
Wouldn’t it be better if the name of nn.CrossEntropyLoss made it clear that it also applies a softmax to its input (as they do in TF)? I hadn’t read the description of this loss function and I was using it wrong: I was applying a softmax to the output of my network right before passing it to CrossEntropyLoss.
Also, my network looks like a typical FFNN (see below). What is the recommended way in PyTorch to handle a network structure that differs between training and inference? (Inference should include the softmax, training shouldn’t.)
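To make the mistake concrete, here is a minimal sketch of what I believe was happening. The tensor values are made up for illustration. As far as I understand the docs, CrossEntropyLoss on raw outputs matches log_softmax followed by nll_loss, so feeding it already-softmaxed values gives a different loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)            # made-up raw network outputs: batch of 4, 3 classes
targets = torch.tensor([0, 2, 1, 2])  # made-up class indices

criterion = nn.CrossEntropyLoss()

# CrossEntropyLoss on raw outputs should equal log_softmax followed by nll_loss.
loss_from_logits = criterion(logits, targets)
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

# What I was doing: softmax first, so the softmax is effectively applied twice.
loss_double_softmax = criterion(F.softmax(logits, dim=1), targets)

print(loss_from_logits.item(), loss_manual.item(), loss_double_softmax.item())
```

The first two printed values should agree, while the third one differs.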
Thanks!
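import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, num_inputs, num_u_hl1, num_u_hl2, num_outputs, dropout_rate):
        super(Net, self).__init__()
        # Three fully connected layers with dropout in between
        self.cl0 = nn.Linear(num_inputs, num_u_hl1)
        self.cl1 = nn.Linear(num_u_hl1, num_u_hl2)
        self.cl2 = nn.Linear(num_u_hl2, num_outputs)
        self.d1 = nn.Dropout(dropout_rate)
        self.d2 = nn.Dropout(dropout_rate)

    def forward(self, x):
        x = self.cl0(x)
        x = self.d1(x)
        x = torch.sigmoid(self.cl1(x))
        x = self.d2(x)
        # Softmax over the class dimension (assumes input of shape [batch, features]);
        # this is the softmax I was wrongly feeding into CrossEntropyLoss
        x = F.softmax(self.cl2(x), dim=1)
        return x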
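
For reference, this is one pattern I am considering, not necessarily the recommended one: forward returns the raw outputs for CrossEntropyLoss, and a separate method applies the softmax for inference. The names LogitNet and predict_proba, as well as the layer sizes and data, are my own placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LogitNet(nn.Module):  # hypothetical name; same layers as Net above
    def __init__(self, num_inputs, num_u_hl1, num_u_hl2, num_outputs, dropout_rate):
        super().__init__()
        self.cl0 = nn.Linear(num_inputs, num_u_hl1)
        self.cl1 = nn.Linear(num_u_hl1, num_u_hl2)
        self.cl2 = nn.Linear(num_u_hl2, num_outputs)
        self.d1 = nn.Dropout(dropout_rate)
        self.d2 = nn.Dropout(dropout_rate)

    def forward(self, x):
        x = self.d1(self.cl0(x))
        x = self.d2(torch.sigmoid(self.cl1(x)))
        return self.cl2(x)  # raw outputs, no softmax

    @torch.no_grad()
    def predict_proba(self, x):  # hypothetical helper for inference
        was_training = self.training
        self.eval()                        # disable dropout while predicting
        probs = F.softmax(self(x), dim=1)  # softmax over the class dimension
        self.train(was_training)           # restore the previous mode
        return probs


model = LogitNet(10, 16, 8, 3, dropout_rate=0.5)  # made-up sizes
criterion = nn.CrossEntropyLoss()
x = torch.randn(4, 10)                            # made-up batch
y = torch.tensor([0, 2, 1, 2])                    # made-up targets

# Training step: the loss receives the raw outputs.
loss = criterion(model(x), y)
loss.backward()

# Inference: probabilities that sum to 1 per row.
probs = model.predict_proba(x)
```

With this layout the training loop never sees the softmax, and predict_proba switches dropout off only for the duration of the call.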