I am currently working on a transfer learning problem with a ResNet-50. Below is my training code. It seems to run, but the accuracy goes from 0 to 1 in the second batch and then stays at 1 for the remaining batches and epochs.
import time

import torch
import torch.nn as nn

# res50 (the model) and train_data (the DataLoader) are defined earlier
epochs = 3
device = torch.device("cuda:0")

# Define Optimizer and Loss Function
loss_func = nn.NLLLoss()
optimizer = torch.optim.Adam(res50.parameters())

for epoch in range(epochs):
    epoch_start = time.time()
    print("Epoch: {}/{}".format(epoch+1, epochs))

    # Set to training mode
    res50.train()

    # Loss and Accuracy within the epoch
    train_loss = 0.0
    train_acc = 0.0
    valid_loss = 0.0
    valid_acc = 0.0

    for i, (inputs, labels) in enumerate(train_data):
        inputs = inputs.to(device)
        labels = labels.to(device)

        # Clean existing gradients
        optimizer.zero_grad()

        # Forward pass - compute outputs on input data using the model
        outputs = res50(inputs)

        # Compute loss
        loss = loss_func(outputs, labels)

        # Backpropagate the gradients
        loss.backward()

        # Update the parameters
        optimizer.step()

        # Compute the total loss for the batch and add it to train_loss
        train_loss += loss.item() * inputs.size(0)

        # Compute the accuracy
        ret, predictions = torch.max(outputs.data, 1)
        correct_counts = predictions.eq(labels.data.view_as(predictions))

        # Convert correct_counts to float and then compute the mean
        acc = torch.mean(correct_counts.type(torch.FloatTensor))

        # Compute total accuracy in the whole batch and add to train_acc
        train_acc += acc.item() * inputs.size(0)

        print("Batch number: {:03d}, Training: Loss: {:.4f}, Accuracy: {:.4f}".format(i, loss.item(), acc.item()))
Below is an example output for the first five batches. Am I calculating the accuracy incorrectly?
Epoch: 1/3
Batch number: 000, Training: Loss: 2.2015, Accuracy: 0.0000
Batch number: 001, Training: Loss: 0.1964, Accuracy: 1.0000
Batch number: 002, Training: Loss: 0.0162, Accuracy: 1.0000
Batch number: 003, Training: Loss: 0.0013, Accuracy: 1.0000
Batch number: 004, Training: Loss: 0.0001, Accuracy: 1.0000
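A loss that collapses to almost zero while accuracy is pinned at 1.0 is what I would expect to see if every batch contained only a single label value, since the model can then just always predict that value. One way to check is to count the labels the loader actually yields. This is only a minimal sketch: label_histogram is a hypothetical helper, not part of PyTorch, and fake_batches stands in for train_data so the snippet runs on its own.

```python
from collections import Counter


def label_histogram(batches, max_batches=None):
    """Count how often each label value appears across (inputs, labels) batches."""
    counts = Counter()
    for i, (_, labels) in enumerate(batches):
        if max_batches is not None and i >= max_batches:
            break
        # Tensors expose .tolist(); plain Python sequences are used as they are.
        values = labels.tolist() if hasattr(labels, "tolist") else list(labels)
        counts.update(values)
    return counts


# Stand-in for a DataLoader: two batches whose labels are all 1.
fake_batches = [(None, [1, 1, 1, 1]), (None, [1, 1, 1, 1])]
print(label_histogram(fake_batches))  # Counter({1: 8})
```

With the real loader the call would be label_histogram(train_data, max_batches=5). If the result has a single key, the accuracy calculation is probably fine and the labels are the issue.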
I do have a theory that it has to do with classes vs. targets. I had to set the dataset targets myself since I am working with a custom dataset. When I do this, train_data.dataset.targets outputs [2,1,2,3,…,5] according to the class, but train_data.dataset.classes outputs ['.ipynb_checkpoints', '1']. I think this is the problem, because when I print labels inside the nested for loop they are all 1.
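To illustrate the theory, here is a small sketch of folder-based class discovery. It assumes the dataset is built by something like torchvision's ImageFolder, which, as far as I understand, takes the sorted subdirectory names of the root folder as classes and numbers them from 0. discover_classes is a hypothetical stand-in written for this post, not the torchvision API, and the temporary directory only mimics my folder layout.

```python
import os
import tempfile


def discover_classes(root):
    """Return (classes, class_to_idx) built from the subdirectories of root."""
    # Every subdirectory counts as a class, including hidden ones.
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    class_to_idx = {name: idx for idx, name in enumerate(classes)}
    return classes, class_to_idx


with tempfile.TemporaryDirectory() as root:
    # Mimic a root folder that holds one real class folder plus a
    # Jupyter checkpoint folder.
    os.mkdir(os.path.join(root, ".ipynb_checkpoints"))
    os.mkdir(os.path.join(root, "1"))
    classes, class_to_idx = discover_classes(root)

print(classes)       # ['.ipynb_checkpoints', '1']
print(class_to_idx)  # {'.ipynb_checkpoints': 0, '1': 1}
```

Under that assumption, every image stored under the folder named 1 gets label 1, which would match what I see in the loop. It would also mean that overwriting dataset.targets afterwards has no effect if the dataset reads each label from its own per-sample list when indexing, though I have not confirmed that this is what happens here.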
Thanks in advance for any help!