aklagoo
(Archan Lagoo)
September 14, 2019, 3:42am
1
I’m trying to train a model for image classification. However, the training just doesn’t take place.
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2)
        self.conv2 = nn.Conv2d(64, 192, kernel_size=7, padding=2)
        self.conv3 = nn.Conv2d(192, 256, kernel_size=5, padding=1)
        self.fc1 = nn.Linear(256*9*9, 4096)
        self.fc2 = nn.Linear(4096, 1024)
        self.fc3 = nn.Linear(1024, 102)

    def forward(self, x):
        x = self.conv1(x)
        x = F.max_pool2d(x, kernel_size=3, stride=2)
        x = self.conv2(x)
        x = F.max_pool2d(x, kernel_size=3, stride=2)
        x = self.conv3(x)
        x = F.max_pool2d(x, kernel_size=3, stride=2)
        x = x.view(x.shape[0], -1)
        x = self.fc1(x)
        x = F.relu(x, inplace=True)
        x = self.fc2(x)
        x = F.relu(x, inplace=True)
        x = self.fc3(x)
        x = F.log_softmax(x, dim=1)
        return x

model = CNN()
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters())
model.to(device)

NUM_EPOCHS = 10
for epoch in range(NUM_EPOCHS):
    train_loss = 0
    valloss = 0
    accuracy = 0
    counter = 0
    for inputs, labels in trainloader:
        # Move to device
        outputs = labels.to(device)
        inputs = inputs.to(device)
        # Get loss
        optimizer.zero_grad()
        preds = torch.exp(model(inputs))
        loss = criterion(preds, outputs)
        # Backprop
        loss.backward()
        optimizer.step()
        train_loss += loss.item()*inputs.size(0)
    # Validation
    with torch.no_grad():
        for inputs, labels in validationloader:
            # Move to device
            inputs, labels = inputs.to(device), labels.to(device)
            output = model(inputs)
            valloss = criterion(output, labels)
            valloss += valloss.item()*inputs.size(0)
            output = torch.exp(output)
            # Calculate accuracy
            top_p, top_class = output.topk(1, dim=1)
            equals = top_class == labels.view(*top_class.shape)
            accuracy += torch.mean(equals.type(torch.FloatTensor)).item()
    # Calculate loss
    train_loss = train_loss/len(trainloader.dataset)
    valloss = valloss/len(validationloader.dataset)
    # Print info
    print('Accuracy: ', accuracy/len(validationloader))
    print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(epoch, train_loss, valloss))
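A side note on the validation loop in the code above, separate from the training problem: valloss is reassigned to the current batch loss on every iteration and then incremented by that same batch, so the value printed at the end of the epoch reflects only the last validation batch. The plain-Python sketch below reproduces that arithmetic next to a version that keeps a separate running total. The function names and the example numbers are hypothetical and only illustrate the accumulation pattern.

```python
def buggy_epoch_loss(batch_losses, batch_sizes):
    """Mirrors the posted loop: the total is reassigned on every batch."""
    valloss = 0.0
    for batch_loss, n in zip(batch_losses, batch_sizes):
        valloss = batch_loss              # overwrites the running total
        valloss += valloss * n            # adds only the current batch
    return valloss / sum(batch_sizes)


def fixed_epoch_loss(batch_losses, batch_sizes):
    """Keeps the batch loss and the running total in separate variables."""
    running_total = 0.0
    for batch_loss, n in zip(batch_losses, batch_sizes):
        running_total += batch_loss * n   # weight each batch mean by its size
    return running_total / sum(batch_sizes)


# Hypothetical per-batch mean losses and batch sizes.
batch_losses = [2.0, 1.0, 0.5]
batch_sizes = [4, 4, 2]

buggy = buggy_epoch_loss(batch_losses, batch_sizes)
fixed = fixed_epoch_loss(batch_losses, batch_sizes)
print(buggy, fixed)
```

In the PyTorch loop, the equivalent change would be to store the batch loss under its own name and add its .item() value, multiplied by the batch size, to a separate accumulator.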
mailcorahul
(Raghul Asokan)
September 14, 2019, 6:14am
2
What do you mean by “training doesn’t take place”?
Can you elaborate? Does it have anything to do with the training/validation loss?
aklagoo
(Archan Lagoo)
September 14, 2019, 6:20am
3
The loss and the accuracy remain the same. The loss, in fact, is negative. The accuracy is close to 1.15%.
mailcorahul
(Raghul Asokan)
September 14, 2019, 7:19am
4
May I know why you apply torch.exp() after the forward pass? Also, what criterion are you using?
aklagoo
(Archan Lagoo)
September 14, 2019, 7:20am
5
NLLLoss, with log_softmax inside the network.
mailcorahul
(Raghul Asokan)
September 14, 2019, 7:27am
6
I have not used NLL loss before, but the docs say that a log-softmax layer is needed as the last layer of the network in order to use NLL loss. Do you have one?
And why torch.exp() at the end?
aklagoo
(Archan Lagoo)
September 14, 2019, 9:07am
7
I’ve used a log_softmax layer inside my network. And from what I’ve seen in some tutorials, torch.exp is necessary to convert the log output to probabilities.
ptrblck
September 14, 2019, 10:53am
8
As @mailcorahul said, nn.NLLLoss expects the log probabilities, so you shouldn’t apply torch.exp on your output. Instead pass the F.log_softmax outputs directly to your criterion.
If you need to see the softmax probabilities, you can of course use torch.exp. Just don’t pass them to nn.NLLLoss.
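To make this concrete, here is a minimal plain-Python sketch of what nn.NLLLoss computes with its default mean reduction: the negative of the input value at the target index, averaged over the batch. The helpers log_softmax and nll_loss and the example logits are made up for illustration and are not PyTorch's implementation. The sketch shows why feeding probabilities instead of log probabilities yields a loss between -1 and 0, which is consistent with the negative loss reported earlier in the thread.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax of one row of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def nll_loss(rows, targets):
    """Mean of -row[target] over the batch (default NLLLoss behaviour)."""
    return sum(-row[t] for row, t in zip(rows, targets)) / len(targets)


# Hypothetical logits for a batch of 2 samples and 3 classes.
logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
targets = [0, 2]

log_probs = [log_softmax(row) for row in logits]
probs = [[math.exp(v) for v in row] for row in log_probs]

correct = nll_loss(log_probs, targets)  # log probabilities in: loss >= 0
wrong = nll_loss(probs, targets)        # probabilities in: loss in [-1, 0]

print(f"NLL on log probabilities: {correct:.4f}")
print(f"NLL on probabilities:     {wrong:.4f}")
```

When probabilities are passed in, the reported number is no longer a negative log-likelihood, so it cannot be compared with a loss computed from the log_softmax outputs.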
mailcorahul
(Raghul Asokan)
September 14, 2019, 11:07am
9
@aklagoo let us know if it works.
aklagoo
(Archan Lagoo)
September 14, 2019, 5:03pm
10
I changed the section and passed the predictions without torch.exp(). The model still doesn’t train, although I was wrong before. Even after multiple epochs, the loss and the accuracy haven’t changed.
mailcorahul
(Raghul Asokan)
September 14, 2019, 5:07pm
11
Can you post the complete code here (with network and training)?
mailcorahul
(Raghul Asokan)
September 14, 2019, 5:43pm
13
Can you try defining the optimizer after moving the model to the GPU?
model.to(device)
optimizer = optim.Adam(model.parameters())
aklagoo
(Archan Lagoo)
September 14, 2019, 5:47pm
14
The result is still the same. I checked and found out that the model weights are not being updated. I’ve modified the model and added a few ReLU activation layers. The weights are finally changing, although I’m not sure why that would be an issue.
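One way to run the kind of check described here is to snapshot the parameters, take a single optimizer step, and compare. The sketch below assumes a small stand-in model and random data (both hypothetical, chosen only so the example runs on its own); with the real model, the same comparison would be done around one training batch.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical stand-in model and batch; substitute the real ones.
model = nn.Sequential(
    nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3), nn.LogSoftmax(dim=1)
)
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters())

inputs = torch.randn(4, 8)
labels = torch.tensor([0, 1, 2, 1])

# Snapshot every parameter before the update.
before = {name: p.detach().clone() for name, p in model.named_parameters()}

optimizer.zero_grad()
loss = criterion(model(inputs), labels)
loss.backward()
optimizer.step()

# Largest absolute change per parameter tensor after one optimizer step.
changes = {
    name: (p.detach() - before[name]).abs().max().item()
    for name, p in model.named_parameters()
}
for name, delta in changes.items():
    print(f"{name}: max |delta| = {delta:.3e}")
```

A parameter tensor whose change stays at exactly zero across several batches is either receiving no gradient or is not registered with the optimizer.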
mailcorahul
(Raghul Asokan)
September 14, 2019, 6:08pm
15
Are you saying the weights were not updated before (creating the optimizer and then moving the model to the GPU), but are changing after swapping the statements?
aklagoo
(Archan Lagoo)
September 14, 2019, 6:25pm
16
No. Switching the statements did not work. I modified my model and now the weights are changing. The accuracy is still 3% but the loss is changing.
Are you sure you’re not experiencing an exploding or vanishing gradient problem?
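A quick way to look into that question is to print the gradient norm of each parameter tensor right after loss.backward(). The sketch below uses a small stand-in model and random data, both hypothetical, so that it runs on its own; the loop over named_parameters() is the part that would carry over to the real model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model and batch; substitute the real ones.
model = nn.Sequential(
    nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3), nn.LogSoftmax(dim=1)
)
criterion = nn.NLLLoss()
inputs = torch.randn(4, 8)
labels = torch.tensor([0, 1, 2, 1])

model.zero_grad()
loss = criterion(model(inputs), labels)
loss.backward()

# L2 norm of the gradient for each parameter tensor.
grad_norms = {
    name: p.grad.norm().item()
    for name, p in model.named_parameters()
    if p.grad is not None
}
for name, norm in grad_norms.items():
    print(f"{name}: grad norm = {norm:.3e}")
```

Norms that are many orders of magnitude smaller in the early layers than in the last layer would point towards vanishing gradients, while very large, inf, or nan values would point towards exploding gradients.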