I am fairly new to deep learning and the PyTorch API. I am trying to build a ResNet-50 and train it on image attributes with binary labels (1 or -1), but the loss computation fails in nll_loss (called from CrossEntropyLoss) with the error "Target -1 is out of bounds". I built the ResNet-50 by following a tutorial and tested it with a random input; it gives me the expected output, a tensor with 2 values per sample.
Here is my training code:
def train_model(epoch):
    model_net.train()
    for batch_index, (input, labels) in enumerate(train_loader):
        labels = labels.view(batch_size)
        input, labels = input.to(device), labels.to(device)
        outputs = model_net(input)
        print(input.shape)
        print(outputs.shape)
        print(labels.shape)
        loss = lostFunction(outputs, labels)
        if batch_index % 2 == 0 or batch_index == len(train_loader) - 1:
            print('epoch {} batch {}/{} loss {:.3f}'.format(
                epoch, batch_index, len(train_loader) - 1, loss.item()))
        optimizer.zero_grad()  # Set gradients to zero
        loss.backward()  # From the loss we compute the new gradients
        optimizer.step()
This is the output:
torch.Size([20, 3, 218, 178])
torch.Size([20, 2])
torch.Size([20])
IndexError Traceback (most recent call last)
in ()
----> 1 train_model(0)
4 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
2216 .format(input.size(0), target.size(0)))
2217 if dim == 2:
-> 2218 ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
2219 elif dim == 4:
2220 ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target -1 is out of bounds.
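To narrow this down, here is a minimal sketch that I believe reproduces the same failure outside my training loop. The tensors are made-up stand-ins for one small batch (not my real data), `try_loss` is a hypothetical helper I wrote only for this post, and I am assuming that `lostFunction` in my training code is `nn.CrossEntropyLoss()`.

```python
import torch
import torch.nn as nn

# Made-up stand-ins for one small batch (not my real data).
torch.manual_seed(0)
outputs = torch.randn(4, 2)            # same layout as my model output: [batch, 2]
labels = torch.tensor([1, -1, -1, 1])  # labels stored as 1 / -1, as in my dataset

# Assumption: lostFunction in my training code is nn.CrossEntropyLoss().
criterion = nn.CrossEntropyLoss()


def try_loss(outputs, labels):
    """Hypothetical helper: return the loss as a float, or the error text if the call fails."""
    try:
        return criterion(outputs, labels).item()
    except (IndexError, RuntimeError) as err:
        return "{}: {}".format(type(err).__name__, err)


result = try_loss(outputs, labels)
print("output features per sample:", outputs.shape[1])
print("label values in the batch:", sorted(set(labels.tolist())))
print("result:", result)
```

This runs on the CPU with no model and no data loader, so the failure seems to depend only on the label values passed to the loss, not on the ResNet-50 itself.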
This is the format of one sample from the testing data:
(tensor([[[-0.6794, -0.7137, -0.8164, ..., -1.0048, -1.0048, -0.9705],
[-0.6794, -0.7137, -0.8164, ..., -1.0048, -1.0048, -0.9705],
[-0.6794, -0.7137, -0.8164, ..., -1.0048, -1.0048, -0.9705],
...,
[ 0.3994, 0.2111, 0.0741, ..., 1.6667, 1.6667, 1.4954],
[ 0.2453, 0.2453, 0.2967, ..., 1.5639, 1.6838, 1.7694],
[ 0.2624, 0.2624, 0.3652, ..., 1.5639, 1.6838, 1.7694]],
[[-0.6176, -0.6527, -0.7752, ..., -1.1604, -1.1604, -1.1253],
[-0.6176, -0.6527, -0.7752, ..., -1.1604, -1.1604, -1.1253],
[-0.6176, -0.6527, -0.7752, ..., -1.1604, -1.1604, -1.1253],
...,
[ 0.2052, 0.0126, -0.4601, ..., 1.8333, 1.8333, 1.6583],
[ 0.0476, 0.0476, -0.2500, ..., 1.7108, 1.8158, 1.9034],
[ 0.0476, 0.0476, -0.2150, ..., 1.7108, 1.8158, 1.9034]],
[[-0.5147, -0.5495, -0.6018, ..., -1.0550, -1.0550, -1.0201],
[-0.5147, -0.5495, -0.6018, ..., -1.0550, -1.0550, -1.0201],
[-0.5147, -0.5495, -0.6018, ..., -1.0550, -1.0550, -0.9853],
...,
[ 0.3219, 0.1302, -0.2881, ..., 2.1868, 2.2217, 2.0474],
[ 0.1476, 0.1476, -0.1138, ..., 2.0648, 2.2217, 2.3088],
[ 0.1476, 0.1476, -0.0615, ..., 2.0648, 2.2217, 2.3088]]]), tensor([-1]))
I really can't figure out why this happens. I have checked the training data, the label tensor, the batch size, and the number of output features, and they all look correct to me. Can someone help me debug this?