Mask specific elements in a final layer

I am reproducing the following model, which outputs an action and uses a filter to remove inappropriate candidates.

In this model, the output is filtered after the last softmax layer. Let's assume action_size == 3, so the output after the dense and softmax layers looks like this:

output: [0.1, 0.7, 0.2]
filter: [0, 1, 1]
output*filter: [0, 0.7, 0.2]
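For reference, here is a minimal sketch of that filtering step using the numbers above. The renormalization at the end is my own assumption (the original model may not do it); it only shows how to make the remaining actions sum to 1 again if that is wanted.

```python
import torch

output = torch.tensor([0.1, 0.7, 0.2])  # softmax probabilities from the example above
mask = torch.tensor([0.0, 1.0, 1.0])    # 1 = allowed action, 0 = filtered out

masked = output * mask                  # filtered action now has probability 0
# Assumption: rescale so the allowed actions form a distribution again
renormalized = masked / masked.sum()

print(masked)
print(renormalized)
```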

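But in PyTorch, LogSoftmax is preferred with NLLLoss, so my output looks like the values below. Multiplying by the filter no longer makes sense here: the filtered entry becomes 0, which is log(1), so the action I wanted to remove ends up with the highest score.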

output: [-5.4, -0.2, -4.9]
filter: [0, 1, 1]
output*filter: [0, -0.2, -4.9]
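A quick way to see the problem is to convert the multiplied values back to probabilities. This is only a plain-Python illustration with the numbers from the post, not part of the model:

```python
import math

log_probs = [-5.4, -0.2, -4.9]  # illustrative log-softmax values from the post
mask = [0, 1, 1]                # 0 = action that should be filtered out

# Multiplying a log-probability by 0 gives 0, and exp(0) == 1
masked = [lp * m for lp, m in zip(log_probs, mask)]
implied = [math.exp(v) for v in masked]

# Index of the highest-scoring action after the multiplication
best = max(range(len(masked)), key=lambda i: masked[i])
print(implied, best)
```

The filtered action (index 0) gets an implied probability of 1 and becomes the top-scoring action, which is the opposite of what the filter is for.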

So PyTorch does not recommend plain Softmax together with NLLLoss. How should I apply a mask to eliminate specific actions?
Or is there a categorical cross-entropy loss function that works with plain Softmax?
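One common approach, sketched here under my own assumptions, is to apply the mask to the logits before log_softmax instead of multiplying afterwards: filtered actions get a logit of -inf, so their probability is exactly 0 and the remaining ones still sum to 1. The logits, mask, and target below are hypothetical values chosen for illustration.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[1.0, 3.0, 2.0]])            # hypothetical dense-layer output, batch of 1
mask = torch.tensor([[0, 1, 1]], dtype=torch.bool)  # True = allowed action

# Filtered actions get -inf so that softmax assigns them probability 0
masked_logits = logits.masked_fill(~mask, float("-inf"))
log_probs = F.log_softmax(masked_logits, dim=-1)

target = torch.tensor([1])                          # hypothetical label; must be an allowed action
loss = F.nll_loss(log_probs, target)

probs = log_probs.exp()
print(probs, loss)
```

Two caveats: if the target itself is filtered out, the loss is infinite, and if every action in a row is filtered out, the result is NaN. Both cases need to be handled before this is used in training.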

The nn.Softmax documentation says: "This module doesn't work directly with NLLLoss, which expects the Log to be computed between the Softmax and itself. Use LogSoftmax instead (it's faster and has better numerical properties)."
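To make that note concrete, the sketch below compares three ways of getting the same cross-entropy value on random, hypothetical logits: log_softmax followed by nll_loss, cross_entropy directly on the logits, and taking the log of a plain softmax by hand. They agree for moderate logits; the last form is the one the documentation warns about because it is less stable for extreme values.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)            # hypothetical batch: 4 samples, 3 actions
target = torch.tensor([0, 2, 1, 2])   # hypothetical labels

# Recommended pairing: log_softmax followed by nll_loss
loss_a = F.nll_loss(F.log_softmax(logits, dim=-1), target)
# cross_entropy combines both steps and takes raw logits
loss_b = F.cross_entropy(logits, target)
# Plain softmax works too if the log is taken explicitly, but it is less stable
loss_c = F.nll_loss(torch.log(F.softmax(logits, dim=-1)), target)

print(loss_a, loss_b, loss_c)
```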

I implemented a categorical cross-entropy loss function myself, and it works with nn.Softmax rather than log_softmax.
Could anyone check my loss function?

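import torch
import torch.nn.functional as F

# last layer (end of the model's forward method)
y = F.softmax(y, -1)  # probabilities over actions
y = y * filter        # zero out filtered actions (filter holds 0s and 1s)
return y

def categorical_cross_entropy(preds, labels):
    # preds: (batch, action_size) probabilities; labels: (batch,) class indices
    picked = preds[torch.arange(preds.size(0), device=preds.device), labels]
    # the small epsilon avoids log(0) if the labeled action was filtered out
    return -torch.log(picked + 1.e-7).mean()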
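One way to check a hand-written loss like this is to compare it with F.nll_loss applied to the log of the same filtered probabilities. The sketch below restates the loss in compact form so that it runs on its own; the logits, mask, and labels are hypothetical.

```python
import torch
import torch.nn.functional as F

def categorical_cross_entropy(preds, labels, eps=1.e-7):
    # Compact restatement of the loss from the post: mean of -log(p[label] + eps)
    picked = preds[torch.arange(preds.size(0)), labels]
    return -torch.log(picked + eps).mean()

logits = torch.tensor([[1.0, 3.0, 2.0], [0.5, 0.1, 2.0]])  # hypothetical dense output
mask = torch.tensor([[0.0, 1.0, 1.0], [1.0, 1.0, 0.0]])    # hypothetical per-sample filters
labels = torch.tensor([1, 0])                              # hypothetical allowed labels

preds = F.softmax(logits, dim=-1) * mask
custom = categorical_cross_entropy(preds, labels)
reference = F.nll_loss(torch.log(preds + 1.e-7), labels)

print(custom, reference)
```

If the two values match, the hand-written loss computes the intended quantity. Note that the filtered probabilities no longer sum to 1, so this is the cross-entropy of an unnormalized vector.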