- How can I find out whether my layer has a softmax or not?
nn.CrossEntropyLoss applies F.log_softmax internally on the input. The usual “layers” such as nn.ConvXd, nn.Linear, etc. do not apply any non-linearity for you.
Of course, this does not necessarily hold for custom user-defined layers.
If you are unsure about a specific layer, please refer to the docs, which would mention if an activation function is applied internally.
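For layers that are registered as modules, one way to check is to walk the model and look for softmax-type modules. This is a minimal sketch: the helper `find_softmax_modules` and the two toy models are hypothetical and only serve as an illustration.

```python
import torch.nn as nn

# Module types that apply a softmax-style normalization.
SOFTMAX_TYPES = (nn.Softmax, nn.LogSoftmax, nn.Softmax2d)


def find_softmax_modules(model: nn.Module):
    """Return (name, module) pairs for every registered softmax-like submodule."""
    return [(name, m) for name, m in model.named_modules()
            if isinstance(m, SOFTMAX_TYPES)]


# Hypothetical toy models, used only for illustration.
plain = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 3))
with_softmax = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 3),
                             nn.Softmax(dim=1))

print(find_softmax_modules(plain))         # empty list
print(find_softmax_modules(with_softmax))  # one entry: the final nn.Softmax
```

Note that this only finds softmax layers registered as submodules. A functional call such as F.softmax inside a custom forward method would not show up here, so reading the forward code is still needed in that case.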
Thanks @ptrblck. I read the docs of my GitHub repo, but they didn't mention whether a non-linearity is applied internally.
Could you please glance through my GitHub repo code to check whether a softmax is applied to the last layer?
I am using the ImageNet dataset.
In the ImageNet example I cannot find the usage of a softmax and nn.CrossEntropyLoss is used, which looks right.
Thanks @ptrblck for your reply. From this conversation I learned that softmax is used only for calculating the CrossEntropyLoss rather than for classification.
Can I print or get the softmax output that is used in CrossEntropyLoss, just out of curiosity?
Yes, you can add a softmax or log_softmax operation and e.g. print the output values.
As long as you don’t feed these values into nn.CrossEntropyLoss there won’t be a problem.
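As a rough illustration of printing the probabilities while still passing the raw logits to the criterion, here is a small sketch. The random `logits` and the `target` values are made-up stand-ins for a real model output and real labels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)            # stand-in for model output: 4 samples, 3 classes
target = torch.tensor([0, 2, 1, 2])   # hypothetical class indices

# nn.CrossEntropyLoss consumes the raw logits directly.
loss_ce = nn.CrossEntropyLoss()(logits, target)

# Equivalent two-step form: log_softmax followed by nn.NLLLoss.
log_probs = F.log_softmax(logits, dim=1)
loss_nll = nn.NLLLoss()(log_probs, target)

# Probabilities for inspection only; they are not passed to the criterion.
probs = F.softmax(logits, dim=1)
print(probs)
print(loss_ce.item(), loss_nll.item())
```

Both loss values should match up to floating point error, and each row of `probs` sums to 1.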
- I passed my output to the softmax function and I am getting positive values in the range (0, 1).
- I have inference code that processes multiple images in a folder. There I have neither a loss function (cross entropy) nor a softmax. Do I need to add cross entropy to my inference code, or is it OK as it is?
- Will adding softmax to my inference code change my predictions, or is it (softmax) only used to convert logits into probabilities?
Here is the code for inference:
import time
import torch
import torchvision
import torchvision.transforms as transforms
import torchvision.datasets as datasets

# model, args, input_size, util and AverageMeter are assumed to be defined elsewhere in the script
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[1./255., 1./255., 1./255.])
torchvision.set_image_backend('PIL')
data_transforms = {
    'predict': transforms.Compose([
        transforms.Resize((256, 256)),
        transforms.CenterCrop(input_size),
        transforms.ToTensor(),
        normalize,
    ])
}
dataset = {'predict': datasets.ImageFolder("/content/XNOR-Net-PyTorch/ImageNet/networks/", data_transforms['predict'])}
dataloader = {'predict': torch.utils.data.DataLoader(dataset['predict'], batch_size=args.batch_size, shuffle=False, num_workers=args.workers, pin_memory=True)}
batch_time = AverageMeter()
losses = AverageMeter()
top1 = AverageMeter()
top5 = AverageMeter()
model.eval()
end = time.time()
global bin_op
bin_op = util.BinOp(model)
bin_op.binarization()
#input=img
for input, labels in dataloader['predict']:
    with torch.no_grad():
        input_var = torch.autograd.Variable(input)
        # compute output
        output = model(input_var)
- That sounds right.
- If you want to calculate the “test loss” (and have targets during inference), you can use the criterion to calculate it. Otherwise, if you just want the predictions, it’s not necessary.
- The predictions won’t change, and you can get the predicted class index for each sample via torch.argmax(output, dim=1), where output can be the logits or the probabilities.
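To see that the predicted class does not depend on whether softmax is applied, a quick check along these lines can help. The random `output` tensor is only a stand-in for a real model output.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
output = torch.randn(5, 10)   # stand-in logits: 5 samples, 10 classes

probs = F.softmax(output, dim=1)
pred_from_logits = torch.argmax(output, dim=1)
pred_from_probs = torch.argmax(probs, dim=1)

print(pred_from_logits)
print(torch.equal(pred_from_logits, pred_from_probs))
```

Softmax is monotonic within each row, so the largest logit always maps to the largest probability.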
Referring to the example by @Ganga, I understand that during training the cross entropy loss is computed, and that the softmax is calculated inside that function before the loss is obtained.
However, during model inference there is no explicit usage of softmax in the code, yet the output from model(input) gives probabilities?
How does the model give probabilities? Where is the softmax declared so that it is executed on the logits when the model is called for inference?
You are correct that nn.CrossEntropyLoss will internally apply F.log_softmax and thus no softmax activation is used in the model.
The model thus outputs logits, which have values in the range [-Inf, +Inf].
If you want to calculate the probability for each class during inference, you can apply F.softmax on the output and process them further (just don’t calculate the loss with them).
However, to get the predicted classes, you could use torch.argmax(output, dim=1), which will return the same predicted class index using the logits or probabilities, since the softmax will not change the order of the logits and probabilities, and the highest logit will get the highest probability.
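If per-class probabilities are wanted at inference time, something like the following could be used. The helper name `topk_probabilities` and the random `logits` are hypothetical; in real code `logits` would be the result of model(input).

```python
import torch
import torch.nn.functional as F


def topk_probabilities(logits: torch.Tensor, k: int = 5):
    """Turn logits of shape (batch, classes) into the k most probable classes per sample."""
    probs = F.softmax(logits, dim=1)
    top_probs, top_classes = torch.topk(probs, k, dim=1)
    return top_probs, top_classes


torch.manual_seed(0)
logits = torch.randn(2, 1000)   # stand-in for model(input): 2 images, 1000 classes
with torch.no_grad():
    top_probs, top_classes = topk_probabilities(logits, k=5)

print(top_classes)   # class indices, most probable first
print(top_probs)     # matching probabilities, each in [0, 1]
```

The first column of `top_classes` is the same index that torch.argmax(logits, dim=1) would return.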
Hi @ptrblck, is there any advantage to activating the last layer with softmax if I already use CrossEntropyLoss as the loss function? Internally, it uses log_softmax. What do you think?
No, using softmax on your outputs and passing it to nn.CrossEntropyLoss is wrong and might stall your training. As described before: you could still use it in case you want to print the probabilities or use them in another way, but don’t calculate the loss with the softmax output using nn.CrossEntropyLoss or nn.NLLLoss.
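One way to see the problem is to compare the loss for a confident, correct prediction with and without the extra softmax. This is only a sketch with made-up logits for a 3-class problem. The `floor` value follows from the fact that softmax outputs lie in [0, 1], so the softmax applied again inside the criterion cannot separate the classes by more than 1.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

criterion = nn.CrossEntropyLoss()
num_classes = 3

# Hypothetical logits for a confident, correct prediction of class 0.
logits = torch.tensor([[20.0, 0.0, 0.0]])
target = torch.tensor([0])

loss_right = criterion(logits, target)                     # logits passed directly
loss_wrong = criterion(F.softmax(logits, dim=1), target)   # softmax applied first (the mistake)

# With inputs restricted to [0, 1], the criterion can assign at most
# e / (e + C - 1) to the target class, so the loss cannot drop below this value.
floor = math.log(1 + (num_classes - 1) / math.e)

print(loss_right.item())   # close to 0
print(loss_wrong.item())   # stays near the floor
print(floor)
```

With the extra softmax, the loss cannot approach zero even for a perfect prediction, which is one way training can end up stalling.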