This leads to ValueError: Expected input batch_size (48) to match target batch_size (12). It would really help if the calculations were also explained. Thanks!
import numpy as np
import torch

def train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path):
    """returns trained model"""
    # initialize tracker for minimum validation loss
    valid_loss_min = np.inf

    for epoch in range(1, n_epochs+1):
        # initialize variables to monitor training and validation loss
        train_loss = 0.0
        valid_loss = 0.0

        ###################
        # train the model #
        ###################
        model.train()
        for batch_idx, (data, target) in enumerate(loaders['train']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            ## find the loss and update the model parameters accordingly
            ## record the average training loss, using something like
            ## train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.data - train_loss))

            ## Clear the gradients of all optimized variables
            optimizer.zero_grad()
            ## Forward pass
            output = model(data)
            print(output.shape) # <<<<<< PRINTED HERE
            ## Calculate the batch loss
            loss = criterion(output, target)
            ## Backward pass
            loss.backward()
            ## Optimization step (one parameter update)
            optimizer.step()
            ## Training loss recalculation (running average, as above)
            train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.data - train_loss))

        ######################
        # validate the model #
        ######################
        model.eval()
        for batch_idx, (data, target) in enumerate(loaders['valid']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            ## update the average validation loss
            ## Forward pass
            output = model(data)
            ## Calculate the batch loss
            loss = criterion(output, target)
            ## Validation loss calculation (running average)
            valid_loss = valid_loss + ((1 / (batch_idx + 1)) * (loss.data - valid_loss))

        # print training/validation statistics
        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
            epoch,
            train_loss,
            valid_loss
            ))

        ## TODO: save the model if validation loss has decreased
        if valid_loss <= valid_loss_min:
            print('Validation loss decreased ({:.6f} --> {:.6f}). Saving model ...'.format(
                valid_loss_min,
                valid_loss))
            torch.save(model.state_dict(), save_path)
            valid_loss_min = valid_loss

    # return trained model
    return model

# train the model
model_scratch = train(12, loaders_scratch, model_scratch, optimizer_scratch,
                      criterion_scratch, use_cuda, 'model_scratch.pt')

# load the model that got the best validation accuracy
model_scratch.load_state_dict(torch.load('model_scratch.pt'))
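On the calculation itself: the train_loss update in train() is an incremental mean, so after the last batch it should equal the plain average of the per-batch losses. A minimal sketch to check this in isolation, using made-up loss values (they are not taken from any real run):

```python
# Hypothetical per-batch losses, for illustration only
losses = [0.9, 0.7, 0.4, 0.6]

running = 0.0
for batch_idx, loss in enumerate(losses):
    # same update rule as in train(): move the average toward the new loss
    running = running + (1 / (batch_idx + 1)) * (loss - running)

mean = sum(losses) / len(losses)
print(running, mean)

# the incremental mean matches the ordinary mean up to floating-point error
assert abs(running - mean) < 1e-12
```

Each step weights the new loss by 1 / (number of batches seen so far), which is why no list of past losses needs to be stored.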
The full stack trace:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-28-7e7db3f74a6e> in <module>
83
84 # train the model
---> 85 model_scratch = train(60, loaders_scratch, model_scratch, optimizer_scratch,
86 criterion_scratch, use_cuda, 'model_scratch.pt')
87
<ipython-input-28-7e7db3f74a6e> in train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path)
33
34 ## Calculate the Batch Loss
---> 35 loss = criterion(output, target)
36
37 ## Backward Pass
~\.conda\envs\deep-learning\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
~\.conda\envs\deep-learning\lib\site-packages\torch\nn\modules\loss.py in forward(self, input, target)
929
930 def forward(self, input, target):
--> 931 return F.cross_entropy(input, target, weight=self.weight,
932 ignore_index=self.ignore_index, reduction=self.reduction)
933
~\.conda\envs\deep-learning\lib\site-packages\torch\nn\functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction)
2315 if size_average is not None or reduce is not None:
2316 reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2317 return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
2318
2319
~\.conda\envs\deep-learning\lib\site-packages\torch\nn\functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
2110
2111 if input.size(0) != target.size(0):
-> 2112 raise ValueError('Expected input batch_size ({}) to match target batch_size ({}).'
2113 .format(input.size(0), target.size(0)))
2114 if dim == 2:
ValueError: Expected input batch_size (48) to match target batch_size (12).
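The last frame of the trace shows the check that fails: the loss compares input.size(0) with target.size(0), i.e. the first dimension of the model output against the number of labels. A minimal sketch that triggers the same kind of error with random tensors, assuming 10 classes (an arbitrary number, not taken from the post):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
num_classes = 10  # assumption for illustration only

target = torch.randint(0, num_classes, (12,))   # 12 labels
good_output = torch.randn(12, num_classes)      # 12 rows of class scores
bad_output = torch.randn(48, num_classes)       # 48 rows: does not match 12 labels

# matching first dimensions: the loss is a single scalar
ok_loss = criterion(good_output, target)
print(ok_loss.shape)

# mismatching first dimensions: the loss refuses to run
message = None
try:
    criterion(bad_output, target)
except (ValueError, RuntimeError) as err:  # exception type may vary by PyTorch version
    message = str(err)
print(message)
```

So the error says nothing about the loss function itself; it only reports that output and target disagree on how many samples are in the batch.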
As I said, the issue is not in the model. You have set batch_size=48, but strangely the target tensor has a different batch size of 12. The likely cause is the data loader; there is probably a mistake somewhere in it.
Try running one iteration over the dataloader so you can make sure the data and target tensors have the sizes you expect.
Something like this:
next(iter(train_loader))
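A slightly fuller sketch of that check, using a synthetic dataset as a stand-in for the real one (the image size, the 10 classes, and batch_size=12 are assumptions for illustration, not values from the original loaders):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in: 30 RGB images of 32x32 with integer class labels
images = torch.randn(30, 3, 32, 32)
labels = torch.randint(0, 10, (30,))
train_loader = DataLoader(TensorDataset(images, labels), batch_size=12, shuffle=True)

# pull a single batch and look at both tensors
data, target = next(iter(train_loader))
print(data.shape)    # torch.Size([12, 3, 32, 32])
print(target.shape)  # torch.Size([12])

# data and target must agree on the batch dimension
assert data.size(0) == target.size(0)
```

If the two first dimensions already agree here, the loader is delivering consistent batches, and the next place to look is wherever the first dimension of the data changes afterwards.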
PS. If you are doing Udacity’s assignment, I highly suggest working through all the issues yourself, as it helps you learn how to trace back errors and understand the common issues. I mention this because I have seen code very similar to yours in this forum, and I think that is not the real intention of learning. (possible implemented assignment)
Hi, you are correct - I am trying to do the assignment, but I am badly stuck at this model implementation :(. Nevertheless, thanks a lot for the pointer. I will research a bit more and try to fix it.
One more point: instead of writing the code for all sections and then running the model to find bugs, build small parts of the code, such as only one layer or just one Dataset/DataLoader, and interact with them on arbitrary inputs using basic Python/PyTorch functions. For instance, you should be able to extract a single batch of images and display them, or play with almost anything in your code. You really need to understand the mechanism of each module you use, such as model.train(), model.eval(), etc.
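As an illustration of testing one small piece on an arbitrary input, here is a minimal sketch with a single convolution and pooling layer fed a random batch. The layer sizes and the 32x32 input are assumptions chosen for the example; they do not come from the model in the question, and whether anything like the last step happens in that model is not known from this thread.

```python
import torch
import torch.nn as nn

# one small building block, tested on its own
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
pool = nn.MaxPool2d(2, 2)

x = torch.randn(12, 3, 32, 32)   # arbitrary batch of 12 fake images
features = pool(conv(x))
print(features.shape)            # torch.Size([12, 16, 16, 16])

# flattening that keeps the batch dimension fixed
flat = features.view(features.size(0), -1)
print(flat.shape)                # torch.Size([12, 4096])

# flattening with a hard-coded feature size that does not match (1024 instead of 4096):
# the extra elements are silently folded into the batch dimension
wrong = features.view(-1, 1024)
print(wrong.shape)               # torch.Size([48, 1024])
```

Printing shapes after each step like this makes it easy to see exactly where the first dimension stops being the batch size.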