backward() for custom loss function

Hi,
I created a loss function, and it returns a float.
I get the error 'float' object has no attribute 'backward',
which makes sense…

If I return tensor(some_float),
I get the error:
element 0 of tensors does not require grad and does not have a grad_fn
How can I handle it?

You have to return a FloatTensor that is the result of the operations in your loss function, so that your loss function doesn't break the gradient graph. To be more specific, can you provide example code of your loss function?
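Something like this minimal sketch (made-up shapes, not your actual function) is what I mean by a loss that stays in the graph because it only uses torch operations on the model output:

import torch

def my_loss(p, y):
    # p comes from the model, so it already carries a grad_fn;
    # using only torch ops keeps the result connected to the graph
    return ((p - y) ** 2).mean()

p = torch.rand(4, 3, requires_grad=True)  # stand-in for a model output
y = torch.rand(4, 3)                      # stand-in for targets
loss = my_loss(p, y)
print(loss.grad_fn)  # not None, so loss.backward() works
loss.backward()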

the function:

def loss_function(p, y):
    losses = 0
    grad_loss = []
    for i in range(len(p)):
        temp_p = torch.tensor(p[i])
        temp_y = torch.tensor(y[i])
        k = len(temp_y)
        loss1 = -tensor_mul(vec_minus(one_Vector(k), y[i]), list(np.log(vec_minus(one_Vector(k), p[i]))))
        l = y[i]*p[i].log()
        prod = vec_minus(vec_minus(one_Vector(k), y[i]), -l)
        loss2 = loss1 + sum(softmax(vec_minus(one_Vector(k), prod)))
        losses = losses + loss2
    return float(losses/(len(p)))

p - prediction
y - target

As I said, I think in this code you are returning a Python float object, and that doesn't have a .backward() method.
I'm not sure what tensor_mul and vec_minus are, but the operations you apply to the outputs should all be PyTorch operations; involving some numpy operations would probably break the graph (I'm not sure).
If you return a FloatTensor like you described, the tensor would be a "leaf variable" (I'm not sure what exactly it should be called), and that kind of variable has no gradient history, because you created it out of a float, which holds no gradient information.
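A standalone sketch (not your code) just to show the difference between a loss built from torch ops and one rebuilt from a plain float:

import torch

x = torch.rand(3, requires_grad=True)
loss_in_graph = (x * 2).sum()                      # built from torch ops
print(loss_in_graph.grad_fn)                       # has a grad_fn, backward() works

loss_rebuilt = torch.tensor(float(loss_in_graph))  # converting to float drops all graph info
print(loss_rebuilt.grad_fn)                        # None, this is a fresh leaf tensor
# loss_rebuilt.backward() raises "element 0 of tensors does not require grad and does not have a grad_fn"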


I'm sorry, it didn't work…

I changed the function, and I also tried to add Variable()

def loss_function(p, y, b):
    losses = 0
    for i in range(b):
        temp_p = torch.tensor(p[i])
        temp_y = torch.tensor(y[i])
        k = len(temp_y)
        loss1 = -(torch.ones(k).cuda()-temp_y)@((torch.ones(k).cuda()-temp_p).log())
        l = temp_y*(temp_p.log())
        prod = (torch.ones(k).cuda()-temp_y) - l
        loss2 = loss1 + torch.max(torch.ones(k).cuda()-prod)
        losses = loss1 + loss2
    f = (losses/(b))
    return Variable(torch.FloatTensor([f]))

I still get:

element 0 of tensors does not require grad and does not have a grad_fn

What is the type of "p" and "y"?

p,y - tensor
b - int (batch size)

Can you reformat the code? I'm having trouble distinguishing what is inside the loop.

Are you sure you have set tensor.requires_grad = True?

def loss_function(p, y, b):
    losses = 0
    for i in range(b):
        temp_p = torch.tensor(p[i])
        temp_y = torch.tensor(y[i])
        k = len(temp_y)
        loss1 = -(torch.ones(k).cuda()-temp_y)@((torch.ones(k).cuda()-temp_p).log())
        l = temp_y*(temp_p.log())
        prod = (torch.ones(k).cuda()-temp_y) - l
        loss2 = loss1 + torch.max(torch.ones(k).cuda()-prod)
        losses = loss1 + loss2
    f = (losses/(b))
    return Variable(torch.FloatTensor([f]))

Where should I define it?

I would say there:

temp_p = torch.tensor(p[i], requires_grad=True)
temp_y = torch.tensor(y[i], requires_grad=True)

By default, requires_grad is set to False. You need to activate it yourself. More on this here: https://pytorch.org/docs/stable/notes/autograd.html
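For example (a standalone sketch, unrelated to your loss function):

import torch

a = torch.rand(3)
print(a.requires_grad)   # False by default

b = torch.rand(3, requires_grad=True)
out = (b * 2).sum()
out.backward()
print(b.grad)            # tensor([2., 2., 2.])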

Maybe you should do it even earlier… At the beginning of your function, I would try:

p = torch.tensor(p, requires_grad=True)
y = torch.tensor(y, requires_grad=True)

I get:
Only Tensors of floating point dtype can require gradients

  1. Variable is now deprecated.
  2. When applying a linear transformation with scalars to a tensor, you can just use Python values. For example: a: tc.Tensor = 1 - b
  3. Using torch.tensor(OLD_TENSOR) will not pass the gradients of OLD_TENSOR to the new tensor (see the sketch below).

Note: you should read more of the PyTorch docs and the autograd mechanism.
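A small sketch of points 2 and 3 (hypothetical tensors, just to show the behaviour):

import torch

old = torch.rand(3, requires_grad=True) * 2   # part of a graph, has a grad_fn
print(old.grad_fn)                            # not None

copy = torch.tensor(old)                      # re-wrapping breaks the connection (point 3)
print(copy.grad_fn)                           # None, gradients won't flow back through copy

kept = 1 - old                                # plain Python scalars keep the graph (point 2)
print(kept.grad_fn)                           # not None, still connected to old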


OK,
thanks for your note, I will do it.

Right now, I have this:

def loss_function(p, y, b):
    losses = 0
    for i in range(b):
        k = len(y[i])
        loss1 = -(torch.ones(k).cuda()-y[i])@((torch.ones(k).cuda()-p[i]).log())
        l = y[i]*(p[i].log())
        prod = (torch.ones(k).cuda()-y[i]) - l
        loss2 = loss1 + torch.max(torch.ones(k).cuda()-prod)
        losses = loss1 + loss2
    return losses/(b)

So I'm not doing torch.tensor(OLD_TENSOR) anymore.

Just to note: you can change torch.ones(k).cuda() - TEN to 1 - TEN. Also, your loss function looks like cross entropy loss to me, and PyTorch has an implementation for that.
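For instance, the loop body above could be written as (same computation, just shorter, assuming p[i] and y[i] are already on the GPU):

loss1 = -(1 - y[i]) @ ((1 - p[i]).log())
l = y[i] * (p[i].log())
prod = (1 - y[i]) - l
loss2 = loss1 + torch.max(1 - prod)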

It is not cross entropy;
it is a loss for a multilabel problem, one that I want to minimize when the model is correct in one label out of all the labels of an object in my data.