Hi,
I wonder why the training loss does not decrease?
The input image batch has the shape of: [batch_size1200*200]. The true labels are in “one hot” form: [1,0,0], [0,1,0] and [0,0,1].
I have only 5000 images, so I decided to start from transfer learning.
I loaded pretrained ResNet18 to classify my images into 3 categories.
I change the input channel of ResNet18 to 1, and final output dimension to 3, in order to suit my classification problem.
So the shape of the model output tensor is [batch_size * 3], and the shape of the true label tensor is [batch_size * 3].
Since PyTorch does not provide the CrossEntropy loss function between those two tensors, I wrote my own cross entropy loss function based on the equation:
loss = t.mean(-t.sum(target.float() * t.log(y_prediction),dim=1))
Also I am confused about the output of ResNet18: I read somewhere that ResNet18 has softmax layer before output, but the elements of final output did not add up to 1? So I added a softmax layer after ResNet18.
My training code
import torch as t
import torch.nn.functional as F
device = t.device('cuda:1' if t.cuda.is_available() else 'cpu')
optimizer = optim.SGD(model.parameters(), lr = 0.01, momentum = 0.9)
for images, target in train_dataloader:
images = images.float().to(device)
target = target.float().to(device)
optimizer.zero_grad()
y_prediction = my_resnet18_model(images)
y_prediction = F.softmax(y_prediction, dim = 1)
loss = t.mean(-t.sum(target * t.log(y_prediction), dim = 1)
loss.backward()
optimizer.step()
running_loss += loss.item() * images.shape[0]
epoch_loss = running_loss / len(train_dataset)
print('train loss: {}'.format(epoch_loss))
The epoch loss stayed around 1.09.
Am I using the loss function right?
Could I add the softmax after ResNet18?
How can I improve?
Any comments or suggestions are appreciated. Thanks in advance!