Multiclass Classification in PyTorch

Hi Everyone,

I’m trying to fine-tune pre-trained convnets (e.g., resnet50) for a dataset that has 3 categories. In fact, I want to extend the code from the ‘Transfer Learning tutorial’ to a new dataset with 3 categories. In addition, each image in my dataset has exactly one label (i.e., each train/val/test image has just one label). Could you please help me do that?
I have changed the above-mentioned code as follows (a consolidated sketch of these changes appears after the list):

  1. I have changed the parameters of nn.Linear as follows:

num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, 3) # 3 means we have 3 class labels

  2. I have changed the loss function:
    criterion = nn.NLLLoss()

  3. I have changed the ‘train_model’ method as follows:


m = nn.LogSoftmax(dim=1)  # log-probabilities over the class dimension
outputs = model(inputs)
_, preds = torch.max(outputs.data, 1)  # predicted class indices
loss = criterion(m(outputs), labels)
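
For reference, here is a minimal consolidated sketch of those three changes, assuming torchvision's pretrained ResNet-50 and the fixed-feature-extractor setup from the tutorial (applying LogSoftmax then NLLLoss is equivalent to using nn.CrossEntropyLoss on the raw outputs):

import torch
import torch.nn as nn
from torchvision import models

model_conv = models.resnet50(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False              # freeze the backbone, as in the tutorial

num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, 3)       # 3 output units for the 3 class labels

log_softmax = nn.LogSoftmax(dim=1)
criterion = nn.NLLLoss()

# inside the training loop:
# outputs = model_conv(inputs)                     # shape: (batch_size, 3)
# _, preds = torch.max(outputs, 1)                 # predicted class indices
# loss = criterion(log_softmax(outputs), labels)   # labels: (batch_size,) of class indices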

However, the results I obtain aren’t good at all. So my precise questions are as follows:

  1. Which loss function should be used in this case?
  2. Are those changes for training the model and computing the loss correct?
  1. Sigmoid followed by BCECriterion (see the sketch below)
  2. I think so, yes.
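
A minimal sketch of that suggestion (BCECriterion is the Lua Torch name; the PyTorch equivalent is nn.BCELoss, and nn.BCEWithLogitsLoss folds the Sigmoid in for better numerical stability). The tensors below are made up only to show the expected shapes for 3 classes:

import torch
import torch.nn as nn

criterion = nn.BCELoss()
sigmoid = nn.Sigmoid()

outputs = torch.randn(2, 3)              # raw scores from model.fc for a batch of 2
targets = torch.tensor([[1., 0., 0.],    # sample 0 belongs to class 0
                        [0., 0., 1.]])   # sample 1 belongs to class 2
loss = criterion(sigmoid(outputs), targets)   # input and target must have the same shape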

Dear @smth,

This is the first time I’m writing PyTorch code for a multi-class classification problem. Although I have written numerous notebooks for binary classification using PyTorch, I seem to have a problem with the loss function.
Do you mind taking a look?
https://www.kaggle.com/solomonk/pytorch-speech-recognition-challenge-wip

A few questions:
What is the recommended way to encode the labels for a multi-class problem?
I used

from sklearn.preprocessing import LabelEncoder
df_pred['label'] = LabelEncoder().fit_transform(df_pred['label-str'])

(I then save this as a CSV which can be read later)

At present I am doing this outside my custom dataset class; is there any reference implementation for doing this inside my GenericImageDataset?
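
One possible way to move the encoding inside the dataset class is sketched below. This is only illustrative: the real GenericImageDataset in the notebook may look different, and the 'file' / 'label-str' column names are assumptions:

import pandas as pd
import torch
from PIL import Image
from torch.utils.data import Dataset
from sklearn.preprocessing import LabelEncoder

class GenericImageDataset(Dataset):
    def __init__(self, csv_path, transform=None):
        df = pd.read_csv(csv_path)
        self.paths = df['file'].values                              # image file paths (assumed column)
        self.transform = transform
        self.encoder = LabelEncoder()
        self.labels = self.encoder.fit_transform(df['label-str'])  # strings -> 0..C-1 integers

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        image = Image.open(self.paths[idx]).convert('RGB')
        if self.transform is not None:
            image = self.transform(image)
        label = torch.tensor(self.labels[idx], dtype=torch.long)   # class index, as expected by NLLLoss/CrossEntropyLoss
        return image, label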

Thanks,

In the multi-class case, if you use Sigmoid + BCELoss, then the target needs to be one-hot encoded, i.e. something like this per sample: [0 1 0 0 0 1 0 0 1 0], where the 1s mark the classes that are present.
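
As an illustration, for the single-label-per-image case in this thread each row would contain exactly one 1; such targets can be built programmatically, e.g. with torch.nn.functional.one_hot in recent PyTorch versions:

import torch
import torch.nn.functional as F

labels = torch.tensor([1, 5, 8])                     # integer class indices for 3 samples
targets = F.one_hot(labels, num_classes=10).float()  # shape (3, 10), float targets for BCELoss
# targets[0] -> tensor([0., 1., 0., 0., 0., 0., 0., 0., 0., 0.])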


Thanks @smth
I tried what you suggested. Inside class GenericImageDataset(Dataset):, I read the column tmp_df[1] from the CSV file, which represents the multi-class label, and then I tried both one-hot encoding and self.mlb = MultiLabelBinarizer(); however, in both cases training does not seem to work.

When using the MultiLabelBinarizer(), torch complains that:
ValueError: Target and input must have the same number of elements. target nelement (160) != input nelement (16)

Unless I do this:
self.y_train=self.y_train.reshape((self.y_train.shape[0]*10,1)) # Must be reshaped for PyTorch!

Why is this happening? In any case, training does not seem to converge even after fixing this issue.
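
For what it’s worth, nn.BCELoss requires the input and target to have exactly the same shape, so the 160-vs-16 mismatch above suggests the model output and the binarized targets had different widths. The numbers below are only assumptions that mirror the error message (batch of 16, 10 classes):

import torch
import torch.nn as nn

criterion = nn.BCELoss()

good_input  = torch.rand(16, 10)           # (batch, num_classes) probabilities after Sigmoid
good_target = torch.rand(16, 10).round()   # (batch, num_classes) 0/1 float targets
loss = criterion(good_input, good_target)  # fine: 160 elements on both sides

bad_input = torch.rand(16, 1)              # e.g. a final layer with a single output unit
# criterion(bad_input, good_target)        # fails: 16 input elements vs. 160 target elements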

When I use one-hot encoding, I don’t even get to the training phase, as torch complains that it cannot read the key “down”, which is one of the labels.

I uploaded the full rendered notebook here:
https://github.com/QuantScientist/Deep-Learning-Boot-Camp/blob/master/Kaggle-PyTorch/tf/PyTorch%20Speech%20Recognition%20Challenge%20Starter.ipynb

To make this clear: what should be the return value of self.y_train?
With the MultiLabelBinarizer I get:

INFO:__main__:y_train [[ 1.  0.  0. ...,  0.  0.  0.]
 [ 1.  0.  0. ...,  0.  0.  0.]
 [ 1.  0.  0. ...,  0.  0.  0.]
 ..., 
 [ 0.  0.  0. ...,  0.  0.  1.]
 [ 0.  0.  0. ...,  0.  0.  1.]
 [ 0.  0.  0. ...,  0.  0.  1.]]

Any help would be appreciated,
