Hi all,
I’m attempting to train a network on a large dataset of 64 x 64 images (~100,000 in total) using a simple fully connected model with ReLU activations. I’m very new to both neural networks and PyTorch in general, so please bear with me.
My network appears to initialize and evaluate properly – my error arises when I attempt to call the nll_loss function in my validation method.
Here is the structure of my network:
# create neural net
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(64, 100)
        self.fc2 = nn.Linear(100, 50)
        self.fc3 = nn.Linear(50, 10)

    def forward(self, x):
        # x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return F.log_softmax(x, dim=1)
And here is my validation method:
# validate net
def validation():
    network.eval()
    validation_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in validation_loader:
            output = network(data)
            validation_loss += F.nll_loss(output, target, size_average=False).item()  # ERROR IS HERE, ISSUE WITH TARGET DIMENSION
            pred = output.data.max(1, keepdim=True)[1]
            correct += pred.eq(target.data.view_as(pred)).sum()
    validation_loss /= len(validation_loader.dataset)
    validation_losses.append(validation_loss)
    print('\nValidation set: Avg. loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        validation_loss, correct, len(validation_loader.dataset),
        100. * correct / len(validation_loader.dataset)))
The precise details of the error I get are:
RuntimeError                              Traceback (most recent call last)
<ipython-input-17-fe7e33c67fb7> in <module>()
----> 1 validation()
      2 for epoch in range(n_epochs):
      3     train(epoch)
      4     validation()
      5 test()
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   1871         ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   1872     elif dim == 4:
-> 1873         ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   1874     else:
   1875         # dim == 3 or dim > 4

RuntimeError: invalid argument 3: only batches of spatial targets supported (3D tensors) but got targets of dimension: 1 at /pytorch/aten/src/THNN/generic/SpatialClassNLLCriterion.c:59
I checked the shapes of my data and target tensors, and they are [4096, 3, 64, 64] and [4096], respectively. I can see that the loss function expects my target to be 3D instead of 1D, but I’m at a loss for how to change it. The tutorials I followed for setting up this kind of loss computation did not reshape or otherwise transform the data or targets.
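For reference, here is a minimal shape check that may help narrow this down. It is only a sketch under stated assumptions: it uses random data, a batch of 8 in place of 4096, and 10 classes to match the size of fc3. It suggests that nn.Linear(64, 100) acts only on the last dimension, so an unflattened image batch stays 4D, which would explain why nll_loss takes the nll_loss2d path. The flattened variant (with a first layer of 3 * 64 * 64 inputs) is an assumption about what a fix could look like, not something I have confirmed on my real data.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumption: a batch of 8 random images stands in for the real [4096, 3, 64, 64] batch.
batch = 8
data = torch.randn(batch, 3, 64, 64)
target = torch.randint(0, 10, (batch,))  # 1D class indices, like the real target

# nn.Linear transforms only the last dimension, so the output is still 4D.
fc1 = nn.Linear(64, 100)
unflattened_shape = tuple(fc1(data).shape)
print(unflattened_shape)  # (8, 3, 64, 100)

# Hypothetical alternative: flatten each image to one vector of 3 * 64 * 64 values
# and size the first layer to match.
flat = data.view(data.size(0), -1)
flat_net = nn.Sequential(
    nn.Linear(3 * 64 * 64, 100), nn.ReLU(),
    nn.Linear(100, 50), nn.ReLU(),
    nn.Linear(50, 10),
)
log_probs = F.log_softmax(flat_net(flat), dim=1)

# With a 2D [batch, classes] input, nll_loss accepts the 1D target.
loss = F.nll_loss(log_probs, target, reduction="sum")
print(tuple(log_probs.shape), loss.dim())  # (8, 10) 0
```

With the flattened input, the log-probabilities have shape [batch, 10] and the 1D target is accepted without any reshaping.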
If any of you have any tips or ideas on what’s happening here, I would really appreciate it.
Thanks!