I did a small experiment with the ShuffleNetV2 model, but I’m not sure about its results.
From the torchvision models, I loaded a ShuffleNetV2 instance pretrained on ImageNet. So I have a classification network across 1000 classes, right?
Then, I replaced the last FC layer:
model.fc = Linear(in_features=model.fc.in_features, out_features=1, bias=(model.fc.bias is not None))
I also called reset_parameters() on all layers.
So now, supposedly, I have a randomly initialized network for binary classification, right? Its output should vary between 0 and 1.
But no. I figured that since the classification network probably has a softmax operator after the FC layer, a single output would constantly be 1.
Then I tested my hypothesis with random inputs, and instead I got negative values!
Out: tensor([[-0.0318]], grad_fn=<AddmmBackward>)
How can it be?
If you set the number of outputs to 1, you should use nn.BCEWithLogitsLoss as your criterion. Your target should then have the same shape as the output ([batch_size, 1]), contain values in [0, 1], and be a FloatTensor.
Alternatively, if you would like to stick to nn.CrossEntropyLoss, you should specify out_features=2, and your target should be a LongTensor containing the class indices in [0, 1] and having the shape [batch_size].
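The two options can be sketched with standalone tensors, random logits standing in for the model outputs:

```python
import torch
import torch.nn as nn

batch_size = 4

# Option 1: single output + BCEWithLogitsLoss.
logits = torch.randn(batch_size, 1)                     # raw model outputs
target = torch.randint(0, 2, (batch_size, 1)).float()   # FloatTensor, shape [batch_size, 1]
loss_bce = nn.BCEWithLogitsLoss()(logits, target)

# Option 2: two outputs + CrossEntropyLoss.
logits2 = torch.randn(batch_size, 2)
target2 = torch.randint(0, 2, (batch_size,))            # LongTensor, shape [batch_size]
loss_ce = nn.CrossEntropyLoss()(logits2, target2)
```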
Thanks. Regardless of the training criterion, I am asking about an untrained network (since I reset the parameters).
Is the FC layer followed by a softmax? (I didn’t find the model’s source code.)
If so, how come it results in a negative number?
No, the model returns the logits (line of code).
I see. Thanks!
So I’ll try adding a Sigmoid layer at the end (to get the probability of belonging to class 1, which I can then compare to my ground-truth class labels with values in [0, 1]), and I’ll train using BCELoss.
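A minimal sketch of that setup, with a toy Linear layer standing in for the network's new head (note that nn.BCEWithLogitsLoss on the raw logits is the numerically more stable equivalent of Sigmoid + nn.BCELoss):

```python
import torch
import torch.nn as nn

# Toy stand-in: a linear head followed by a Sigmoid, so outputs lie in (0, 1).
model = nn.Sequential(
    nn.Linear(8, 1),
    nn.Sigmoid(),
)
criterion = nn.BCELoss()

x = torch.randn(4, 8)                              # a batch of 4 feature vectors
target = torch.randint(0, 2, (4, 1)).float()       # binary ground-truth labels
prob = model(x)                                    # probabilities of class 1
loss = criterion(prob, target)
```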