I am trying to reproduce a model that outputs an action and uses a filter to exclude inappropriate candidates.
In this model, the output is filtered after the last softmax layer. Let's assume action_size == 3. The output after the dense and softmax layers then looks like this:
output: [0.1, 0.7, 0.2]
filter: [0, 1, 1]
output*filter: [0, 0.7, 0.2]
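But in PyTorch, LogSoftmax is preferred for use with NLLLoss, so my output looks like the following. Multiplying by the filter no longer makes sense here: the filtered entry becomes 0, which as a log-probability corresponds to probability 1 rather than 0.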
output: [-5.4, -0.2, -4.9]
filter: [0, 1, 1]
output*filter: [0, -0.2, -4.9]
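To make the problem concrete, here is a minimal sketch that reuses the numbers from the example above. The variable names (`log_probs`, `action_filter`) are only illustrative. It shows that a filtered log-probability of 0 maps back to a probability of 1 once exponentiated.

```python
import torch

# Log-space output and filter taken from the example above.
log_probs = torch.tensor([-5.4, -0.2, -4.9])
action_filter = torch.tensor([0.0, 1.0, 1.0])

# Elementwise multiplication sets the filtered entry to 0 in log space.
masked = log_probs * action_filter

# Back in probability space, exp(0) == 1, so the "filtered" action
# ends up looking like the most likely one instead of being removed.
probs = masked.exp()
print(masked)
print(probs)
```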
So it seems PyTorch doesn't recommend vanilla Softmax with NLLLoss. How should I apply a mask to eliminate specific actions?
Or is there a categorical cross-entropy loss function that works with vanilla Softmax?
For reference, the nn.Softmax documentation says: "This module doesn't work directly with NLLLoss, which expects the Log to be computed between the Softmax and itself. Use LogSoftmax instead (it's faster and has better numerical properties)."
torch.nn — PyTorch master documentation
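One common way to handle the masking question is to apply the filter to the raw logits before LogSoftmax, setting filtered entries to negative infinity so that they receive probability 0 after normalization. The sketch below is only an illustration under assumptions: the logit values, the target index, and the names `logits`, `action_filter`, and `target` are made up for the example and are not from the model being reproduced.

```python
import torch
import torch.nn.functional as F

# Hypothetical dense-layer output for a batch of one, with action_size == 3.
logits = torch.tensor([[1.0, 3.0, 2.0]])
# 1 = allowed action, 0 = filtered action (same filter as in the question).
action_filter = torch.tensor([[0, 1, 1]])

# Replace filtered logits with -inf BEFORE normalizing.
masked_logits = logits.masked_fill(action_filter == 0, float("-inf"))

# log_softmax renormalizes over the remaining actions only.
log_probs = F.log_softmax(masked_logits, dim=-1)
probs = log_probs.exp()  # filtered action has probability exactly 0

# NLLLoss consumes the log-probabilities as usual.
# Assumption: the target is always an allowed action; a filtered
# target would give an infinite loss.
target = torch.tensor([1])
loss = F.nll_loss(log_probs, target)
print(probs, loss)
```

If vanilla Softmax has to stay in the model, another option would be to multiply the probabilities by the filter, renormalize them so they sum to 1, and take the log (with a small epsilon for safety) before passing them to NLLLoss. Masking the logits avoids that extra step and keeps the numerical benefits of LogSoftmax.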