I have a network that has two independent neurons at the output.

Each neuron has a `tanh` activation function.

I compute a separate loss for the first neuron and for the second.

If I call `backward()` twice, I get an error.

What is the right way to do this?

As the loss function, I use `L1Loss`.

# Call backward() twice

I also cannot figure out how to choose the right loss function.

I have two neurons. They work independently and output values from -1 to 1.

For example, one neuron outputs 0.3, and the correct answer is -0.4, so the network error is 0.7. I have to reduce the gradient. But if I call the `L1Loss` function, I will get 0.3 - (-0.4) = 0.7. In this case, my gradients will be increased. How do I tell the network to reduce the gradient?

Although maybe I’m wrong and the network will do everything right…

The first question remains: how do I calculate the gradients for two neurons (`loss1.backward()`, `loss2.backward()`)?

You can, for example, do `(loss1 + loss2).backward()`, or use the `torch.autograd.grad` function.
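Both options can be sketched like this (the network, input sizes, and targets here are hypothetical stand-ins for your setup):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Hypothetical stand-in for your network: one linear layer feeding two
# independent tanh outputs.
net = nn.Linear(3, 2)
x = torch.randn(1, 3)
target = torch.tensor([[0.3, -0.4]])
loss_fn = nn.L1Loss()

# Option 1: sum the two losses and call backward() once.
out = torch.tanh(net(x))
loss1 = loss_fn(out[:, 0], target[:, 0])
loss2 = loss_fn(out[:, 1], target[:, 1])
(loss1 + loss2).backward()

# Option 2: torch.autograd.grad returns the gradients directly instead of
# accumulating them into .grad (the graph has to be rebuilt, since the
# backward() call above released it).
out = torch.tanh(net(x))
loss1 = loss_fn(out[:, 0], target[:, 0])
loss2 = loss_fn(out[:, 1], target[:, 1])
grads = torch.autograd.grad(loss1 + loss2, net.parameters())
```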

Every time `loss.backward()` is called, the previous computational graph is released. So if you want to use the graph again, call `loss1.backward(retain_graph=True)` to prevent the graph from being released.
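A minimal, self-contained illustration of this (toy tensors, not your network):

```python
import torch

# Calling backward() twice on the same graph fails unless the first call
# keeps the graph alive with retain_graph=True.
w = torch.tensor([1.0], requires_grad=True)
y = w * 2
loss1 = (y - 1).abs().sum()
loss2 = (y + 1).abs().sum()

loss1.backward(retain_graph=True)  # graph is kept, so a second call works
loss2.backward()                   # graph is released after this call

# Gradients from both calls accumulate in w.grad: 2 + 2 = 4.
print(w.grad.item())  # 4.0
```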

And remember to reset the gradients with `optimizer.zero_grad()` before you call `optimizer.step()`.

`loss.backward()` will compute the gradients, and `optimizer.step()` will apply them and update the tensors.

Since you have two losses, you might need to be more careful about when to reset the gradients and when to update.
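The usual order for one step looks like this (a sketch with a made-up toy network and random data):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Linear(2, 2)                  # hypothetical toy network
opt = torch.optim.SGD(net.parameters(), lr=0.1)
x = torch.randn(4, 2)
target = torch.randn(4, 2)
w0 = net.weight.detach().clone()       # copy, to see that step() changes it

for _ in range(3):
    opt.zero_grad()                    # clear gradients from the last step
    out = torch.tanh(net(x))
    loss = nn.L1Loss()(out, target)
    loss.backward()                    # fills .grad on every parameter
    opt.step()                         # applies .grad and updates the weights
```

With two losses, both `backward()` calls (or a single `backward()` on their sum) would go between `zero_grad()` and `step()`.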

In the example you stated, you end up with an error of 0.7, and when you do a backward pass the network computes the gradients such that your error is reduced. I am not sure why you say *the gradients will increase* in your example.

Also, as smth said, you don’t need to call backward twice; you can simply add those losses and call backward directly on the total loss.
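A quick check with the example you stated (prediction 0.3, correct answer -0.4):

```python
import torch
import torch.nn as nn

pred = torch.tensor([0.3], requires_grad=True)
target = torch.tensor([-0.4])

loss = nn.L1Loss()(pred, target)
loss.backward()

# The loss is |0.3 - (-0.4)| = 0.7, and the gradient w.r.t. the prediction
# is +1; gradient descent *subtracts* the gradient, so the prediction moves
# down toward -0.4 and the error shrinks.
print(round(loss.item(), 4))  # 0.7
print(pred.grad.item())       # 1.0
```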

Assume the network output 0.7 on the first neuron and the correct answer is 0.2, so the error is 0.5. The second neuron produced 0.1 and the correct answer is 0.5, so the error is -0.4.

If you do `(loss1 + loss2).backward()` -> `(0.5 - 0.4 = 0.1).backward()`, how does the network know that the first neuron needs to decrease by 0.5 and the second to increase by 0.4?

At the moment I do it like this:

```python
optimizer.zero_grad()
loss1.backward(retain_graph=True)
loss2.backward()
optimizer.step()
```

Or maybe this would be right?

```python
optimizer.zero_grad()
loss1.backward(retain_graph=True)
optimizer.step()

optimizer.zero_grad()
loss2.backward()
optimizer.step()
```

You sum the absolute values of the errors, so your total loss would be 0.5 + 0.4. To decrease the total loss, the network needs to decrease the individual losses, so the gradients are computed such that both losses decrease simultaneously.

If the above is the scenario in your problem, then you can add the absolute values of the losses and call backward once.
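Checking this with the numbers from your example (outputs 0.7 and 0.1, correct answers 0.2 and 0.5):

```python
import torch

out = torch.tensor([0.7, 0.1], requires_grad=True)
target = torch.tensor([0.2, 0.5])

# Sum of *absolute* errors: 0.5 + 0.4 = 0.9, not 0.5 - 0.4 = 0.1.
loss = (out - target).abs().sum()
loss.backward()

# Each output gets its own gradient: +1 for the first (too high, so
# gradient descent lowers it) and -1 for the second (too low, so it rises).
print(out.grad.tolist())  # [1.0, -1.0]
```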

The weights for the first neuron need to be reduced by 0.5, and the weights for the second one increased by 0.4. If the errors are 0.5 and 0.4, does `backward()` do the right thing?

Yes. The error would be zero only if the first decreases and the second increases, and the model’s goal is always to reach zero loss.

I think you do not understand me correctly.

If the output were a single neuron and the error were 0.5, then `backward()` would do everything correctly.

But I need `backward()` to change each neuron correctly: first reduce the weights for the first neuron while the second neuron stays unchanged, then change the weights for the second neuron. Is that possible?

I cannot understand: are the errors 0.4 and -0.4 the same?

Surely in the first case the gradient needs to be reduced, and in the second it should be increased?

But if I do `abs(Loss)`, then the network will not change the gradients correctly…

What does it mean for “the previous computational graph to be released”?