RuntimeError: there are no graph nodes that require computing gradients (CosineEmbeddingLoss)

Can anyone help me with this loss error?
Why am I getting the following error when I use the CosineEmbeddingLoss loss?

Example:

import torch
import torch.nn.functional as F
from torch.autograd import Variable
a = torch.rand(1,1,10,10)
b = torch.rand(1,1,10,10)
c = torch.ones(1,1,10,10)
c.requires_grad = False
l = torch.nn.CosineEmbeddingLoss()
output = l(a, b, c)
print(output)

Variable containing:
1.00000e-07 *
  7.7486
[torch.FloatTensor of size 1]

output.backward()

RuntimeError: there are no graph nodes that require computing gradients

I'm using PyTorch 0.3.0.

I have two follow-up questions as well.
First, why can't I compute the loss if I have:

a = torch.rand(1,2,10,10)
b = torch.rand(1,2,10,10)
c = torch.ones(1,2,10,10)
c.requires_grad = False
l = torch.nn.CosineEmbeddingLoss()
output = l(a, b, c)
print(output)

RuntimeError: inconsistent tensor size, expected src [1 x 10 x 10] and mask [1 x 2 x 10 x 10] to have the same number of elements, but got 100 and 200 elements respectively at /opt/conda/conda-bld/pytorch_1512387374934/work/torch/lib/TH/generic/THTensorMath.c:197

And second, why can I do it when I have:

a = torch.rand(2,1,10,10)
b = torch.rand(2,1,10,10)
c = torch.ones(2,1,10,10)
c.requires_grad = False
l = torch.nn.CosineEmbeddingLoss()
output = l(a, b, c)
print(output)
1.00000e-07 *
  8.9407
[torch.FloatTensor of size 1]

In the last case I still get the

RuntimeError: there are no graph nodes that require computing gradients

error when I do the backward pass, though.

Hi,

I’m not sure I fully understand this loss function and how to use it either, but as far as I understand,
the target variable (in your snippet, c) is assumed to have shape (batch_size,), with each entry indicating whether the corresponding sample in x1 is similar to its counterpart in x2.
For example, in your second case, which failed to compute a loss, c was supposed to indicate whether the single sample of shape (2, 10, 10) in a is similar to its counterpart of shape (2, 10, 10) in b or not.
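
To make that shape convention concrete, here is a minimal sketch of the documented usage — two batches of embedding vectors of shape (N, D) and a 1-D target of 1/-1 labels of shape (N,). The sizes are made up for illustration, and it is written in 0.4+ style:

import torch
import torch.nn as nn

# a hypothetical batch of 4 samples, each a 16-dimensional embedding
x1 = torch.rand(4, 16)
x2 = torch.rand(4, 16)
# one label per sample: 1 = similar, -1 = dissimilar
y = torch.tensor([1., 1., -1., 1.])

loss_fn = nn.CosineEmbeddingLoss()
print(loss_fn(x1, x2, y))   # a scalar loss, averaged over the 4 samples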

Also, I guess the reason the backward computation is unavailable is that the inputs to the loss function did not go through any operations and do not require gradients themselves, so there is nothing in the graph to compute gradients for.
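
If that is the cause, then making the inputs part of the autograd graph (wrapping them in Variable(..., requires_grad=True) on 0.3, or setting requires_grad on the tensors in later versions) should make backward() run. A minimal sketch with the documented 2-D shapes, written in 0.4+ style:

import torch
import torch.nn.functional as F

x1 = torch.rand(1, 10)                 # requires_grad is False by default
x2 = torch.rand(1, 10)
y = torch.ones(1)                      # 1 = "these two should be similar"

loss = F.cosine_embedding_loss(x1, x2, y)
# loss.backward() would fail here: nothing in the graph requires gradients

x1.requires_grad_(True)                # make the inputs graph leaves
x2.requires_grad_(True)
loss = F.cosine_embedding_loss(x1, x2, y)
loss.backward()                        # now succeeds
print(x1.grad.shape)                   # torch.Size([1, 10])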

In a more end-to-end setting, the loss computed from two vectors produced by an embedding layer can also propagate gradients, like this:

In [6]: import torch

In [7]: import torch.nn as nn

In [8]: import torch.nn.functional as F

In [9]: l = nn.Embedding(10, 20)

In [10]: a = l(torch.LongTensor([0]))   # output of the embedding layer, so it is part of the graph

In [11]: a.size()
Out[11]: torch.Size([1, 20])

In [12]: b = l(torch.LongTensor([2]))

In [13]: loss = F.cosine_embedding_loss(a, b, torch.ones(1, requires_grad=False))

In [14]: loss.backward()   # succeeds: gradients flow back into the embedding weights

Hope this will help you a bit.

Thanks for your explanation, but this loss will be used in unsupervised or semi-supervised learning, so c is not a target in the usual sense.
Based on the documentation link, c is a tensor of labels 1 or -1, used for measuring whether the two inputs are similar or dissimilar; my understanding was that it should have a shape similar to a and b. However, if I change its size to

c = torch.ones(1,1,10,10)

the second case will apparently work.
But I still don't fully understand what is happening there.
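
One thing I can at least verify is that the cosine similarity itself is computed along dim 1, so for my (1, 2, 10, 10) inputs the per-sample similarity has shape (1, 10, 10), i.e. 100 elements, which seems to be the "src" size the error message is comparing the target against. A quick check (0.4+ style, and this only looks at the similarity, not at the loss internals):

import torch
import torch.nn.functional as F

a = torch.rand(1, 2, 10, 10)
b = torch.rand(1, 2, 10, 10)

# cosine similarity collapses dim 1
sim = F.cosine_similarity(a, b, dim=1)
print(sim.shape)   # torch.Size([1, 10, 10]) -> 100 elements, matching the "src" in the error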