It should generally work.
Here is a small example:
```python
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc = nn.Linear(10, 2)

    def forward(self, x):
        x = self.fc(x)
        return x

model = MyModel()
x = torch.randn(1, 10)
print(model(x))
# > tensor([[-0.2403, 0.8158]], grad_fn=<ThAddmmBackward>)
```
```python
model = nn.Sequential(
    model,
    nn.Softmax(dim=1)
)
print(model(x))
# > tensor([[0.2581, 0.7419]], grad_fn=<SoftmaxBackward>)
```
As you can see, the output of the second call is normalized by the softmax (the values sum to 1), and the grad_fn now points to SoftmaxBackward.
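If you want to double check this, you could verify that each row of the wrapped model's output sums to 1; here is a minimal sketch reusing model and x from the example above:

```python
out = model(x)
print(out.sum(dim=1))  # each row of a softmax output sums to 1
print(torch.allclose(out.sum(dim=1), torch.ones(out.size(0))))  # expects True
```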
Could you print your model after adding the softmax layer to it?
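For reference, with the example model defined above, printing the wrapped model should show roughly this structure (the exact repr can differ slightly between PyTorch versions):

```python
print(model)
# Sequential(
#   (0): MyModel(
#     (fc): Linear(in_features=10, out_features=2, bias=True)
#   )
#   (1): Softmax(dim=1)
# )
```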