Cost function backward error

Is it possible to forward a batch of images, let's say 64 images, through a network and then backward image by image? Here is my code:

def train(epoch):
    global steps
    global s
    global optimizer
    epochLoss = 0
    for index, (images, labels) in enumerate(trainLoader):
        if s in steps:
            learning_rate = learning_rate * 0.1
            optimizer = optim.SGD(net.parameters(), lr=learning_rate, momentum=momentum, weight_decay=decay)
        if cuda:
            images = images.cuda()
        images = V(images)
        optimizer.zero_grad()
        output = net(images).cpu()  # 64*95*7*7
        loss = 0
        for ind in range(images.size()[0]):    # images.size()[0] = 64
            target = V(jsonToTensor(labels[ind]))
            cost = criterion(output[ind,:,:,:].unsqueeze(0), target)
            loss += cost.data[0]
            cost.backward(retain_variables=True)     # <---- Error occurs here!
        epochLoss += loss
        optimizer.step()
        print("(%d,%d) -> Current Batch Loss: %f" % (epoch, index, loss))
        s = s + 1
    losses.append(epochLoss)

In the above code, criterion is my customized cost function, which takes two tensors as input. I have tried the above code but received an error like this:

RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1490979338030/work/torch/lib/TH/generic/THTensorMath.c:827

Could you please tell me what the problem is? How can I solve it?

Yes, you should be able to do that. It looks like one of your sizes doesn't match up, but it's hard to tell where from your snippet. Can you post a link to a full working example?

Here’s a simple snippet showing multiple calls to backward:

import torch

a = torch.autograd.Variable(torch.randn(5, 5), requires_grad=True)
b = torch.autograd.Variable(torch.randn(5, 5), requires_grad=True)
c = a @ b

for i in range(5):
    # retain_variables=True keeps the intermediate buffers so the
    # graph can be backed through again on the next iteration
    cost = c[i, :].sum()
    cost.backward(retain_variables=True)

Thanks for your response @colesbury! Actually, I forward 64 images at one time, so the size of the output is 64×95×7×7. When I want to backward, I don’t backward the whole batch at once (because of the complexity of my cost function, it is a little hard to parallelize), but rather image by image. In other words, from my point of view, what the backward function expects is a gradient tensor of size 64×95×7×7, the same size as the output, so I think I should build such a tensor. At least, that is my belief. Am I right?

So in the backward function, I try to make a tensor with the above size and call backward with it!
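
Something like this is roughly what I have in mind (just a sketch: the 64×95×7×7 shape is from my setup, output here is a dummy leaf Variable instead of my network’s real output, and the ones() fill is only a placeholder for the gradient my customized cost function would actually produce):

import torch
from torch.autograd import Variable

# dummy stand-in for the network output: batch of 64, each 95x7x7
output = Variable(torch.randn(64, 95, 7, 7), requires_grad=True)

# build a gradient tensor with the same size as the output,
# filling in the slice for one image at a time
grad = torch.zeros(64, 95, 7, 7)
for ind in range(64):
    # placeholder: in my real code this would be the gradient of my
    # customized cost for image `ind` with respect to output[ind]
    grad[ind] = torch.ones(95, 7, 7)

# backward once, passing a gradient tensor the same size as the output
output.backward(grad)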
