Is it okay to convert a tensor to numpy, calculate the loss value, convert that value back to a tensor, and backpropagate?

That won't work, as calling numpy operations detaches the computation graph.

Autograd won't be able to keep a record of these operations, so you won't be able to backpropagate through them.

If you need the numpy functions, you would need to implement your own `backward` function, and it should work again. Have a look at this tutorial for more information.

Basically, autograd will track all operations as long as you stay in PyTorch land. Could you check if your numpy functions are available in PyTorch?
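A quick sketch of what "detaching the graph" means in practice (the tensor names here are made up for illustration): once you round-trip through numpy, the loss becomes a brand-new leaf tensor with no connection back to the inputs, so `backward` has nothing to propagate.

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x * 2  # tracked by autograd

# Leaving PyTorch land: numpy operations are invisible to autograd
loss_np = (y.detach().numpy() ** 2).sum()          # plain numpy scalar
loss = torch.tensor(loss_np, requires_grad=True)   # a NEW, unconnected leaf

loss.backward()
print(x.grad)  # None -> no gradient ever reaches x
```

Compare this with the pure-PyTorch version `loss = (y ** 2).sum()`, where `x.grad` would be populated after `backward()`.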

Thank you for the response.

I wanted to perform neural style transfer; however, instead of the content and style loss functions, I intended to use a loss that includes a distance transform. The numpy function is here: Distance transform, but it seems PyTorch doesn't have it yet.

I'll try the tutorial, though.

I also implemented a custom loss with numpy. The custom loss doesn't have a `backward` function, but my model works. Why?

Your custom loss function using numpy will detach the loss from the computation graph, so all PyTorch parameters used before the detaching won't get a gradient.

Could you check it by printing the `grad` attribute of some parameters after calling `backward`?
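Here is a minimal sketch of that check (the model and shapes are arbitrary placeholders): when the loss is derived from a numpy computation, the parameters' `.grad` attributes stay `None` even though `backward()` runs without an error.

```python
import torch
import numpy as np

model = torch.nn.Linear(4, 2)
x = torch.randn(8, 4)
out = model(x)

# numpy-based "loss": the graph ends at .detach()
loss = torch.tensor(np.mean(out.detach().numpy() ** 2), requires_grad=True)
loss.backward()

print(model.weight.grad)  # None -> the model never receives gradients
```

So "the model works" in the sense that training runs, but the optimizer is not actually updating anything from this loss term.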

Hi, have you solved the problem? I also want to use a numpy-to-tensor loss function for Neural Style Transfer. Thank you very much.

Hi, I've followed this tutorial https://pytorch.org/docs/master/notes/extending.html and somehow it worked.

Not sure if I should start a new thread… I'm calculating a weight map for a segmentation task that needs a distance transform (it'll weight a CE loss). Everything is in PyTorch land except the weight map. In the end, this weight map is just numbers. Will that backpropagate?

(How does the cross_entropy loss (torch._C._nn.nll_loss) use the weight argument, after all?)

It should work, since your weight map would just scale the loss and thus the gradients.
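A sketch of that setup, with made-up shapes (in the real use case the weight map would come from a distance transform rather than `np.random`): since the weight map only scales the per-pixel loss, it does not need gradients itself, and the gradient still flows through the model output.

```python
import torch
import torch.nn.functional as F
import numpy as np

logits = torch.randn(2, 3, 4, 4, requires_grad=True)  # N, C, H, W
target = torch.randint(0, 3, (2, 4, 4))

# weight map computed outside autograd (e.g. via a distance transform)
weight_map = torch.from_numpy(np.random.rand(2, 4, 4).astype(np.float32))

loss = F.cross_entropy(logits, target, reduction='none')  # per-pixel loss
loss = (loss * weight_map).mean()  # weights just scale loss and gradients
loss.backward()
print(logits.grad is not None)  # True: gradients flow through the logits
```

The key point: the weight map is a constant multiplier from autograd's perspective, so it scales the gradients without needing to be part of the graph.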

Hi, what if we are adding some components to the loss, and those components are computed after detaching?

L= pytorch loss + numpy loss

The resultant L is also a PyTorch tensor.

My network's performance is being affected, although it looks like the numpy loss part should have no effect during training.

What could be the reason for that effect, please?

There won't be any effect, as you are adding a constant value to the loss.

How reproducible is this effect? E.g., are you seeing the training consistently affected by it across 10 different runs?
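This is easy to verify in a toy example: adding a detached (numpy-derived) value to the loss leaves the gradients bit-for-bit unchanged, because the derivative of a constant is zero.

```python
import torch

x = torch.randn(5, requires_grad=True)

loss1 = (x ** 2).sum()
loss1.backward()
g1 = x.grad.clone()

x.grad = None
numpy_part = 3.14  # stands in for any detached / numpy-computed value
loss2 = (x ** 2).sum() + numpy_part
loss2.backward()

print(torch.allclose(g1, x.grad))  # True: the constant has no effect
```

If training behaves differently with the extra term, the cause is likely elsewhere (e.g. non-determinism between runs), not the detached loss component.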

I have some doubts about when we need to create our own backward function to include an external value in the loss function. For example, to solve this we need to create our own backward function, but I don't understand how I can do this: when we create this type of function we need to return tensors with gradients, and when we convert via numpy we don't get gradients, am I right?

```
import numpy as np
import torch

criteria = torch.nn.MSELoss()
outputs, latent_space = model(X)
latent_space = latent_space.cpu()
L_S_nump = latent_space.detach().numpy()  # graph is detached here
value, counts = np.unique(L_S_nump, return_counts=True)
norm_counts = counts / counts.sum()
entro = -(norm_counts * np.log2(norm_counts)).sum()
mse = criteria(O, E)  # O, E: criterion inputs defined elsewhere
loss = mse + entro
```

Yes, if you are using numpy operations, Autograd won't be able to track them and you would thus detach the computation graph. You could write a custom `autograd.Function` and define your `backward` method there, as described in this tutorial.

However, based on your code snippet, you could also replace the numpy functions (`np.unique`, `sum()`, `np.log2`) with their PyTorch equivalents.

Thank you for the reply, I will follow that tip.

```
# Backprop to compute gradients of a, b, c, d with respect to loss
grad_y_pred = 2.0 * (y_pred - y)
grad_a = grad_y_pred.sum()
grad_b = (grad_y_pred * x).sum()
grad_c = (grad_y_pred * x ** 2).sum()
grad_d = (grad_y_pred * x ** 3).sum()
```

I am following this example: Calculate backward and forward pass using numpy for a third-degree polynomial. I am confused: why do we multiply 2.0 with (y_pred - y)?

My second question: while updating the weight, why is the hyphen used here?

`a -= learning_rate * grad_a`

If I am not wrong, it's the square of (y_pred - y), so the 2.0 comes from the power; we could also write it as 2 * (y_pred - y), or compute the square with np.square(y_pred - y)? If this is correct, please answer the second question: why do we use the hyphen sign with the variable?

- The `2` is the factor from the derivative of the square.
- The "hyphen" is an in-place subtraction, which subtracts `learning_rate * grad_a` from `a`. The out-of-place version would be:

```
a = a - (learning_rate * grad_a)
```
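To see why the factor 2 is correct, here is a small check (with arbitrary data and random coefficients): the manual numpy-style gradients from the example above match what autograd computes for the same loss, since d/d(y_pred) of sum((y_pred - y)^2) is 2 * (y_pred - y), chained with each coefficient's partial derivative.

```python
import torch

x = torch.linspace(-1, 1, 50)
y = torch.sin(x)

a = torch.randn((), requires_grad=True)
b = torch.randn((), requires_grad=True)
c = torch.randn((), requires_grad=True)
d = torch.randn((), requires_grad=True)

y_pred = a + b * x + c * x ** 2 + d * x ** 3
loss = ((y_pred - y) ** 2).sum()
loss.backward()

# manual gradients: d/da sum((y_pred - y)^2) = sum(2 * (y_pred - y)), etc.
grad_y_pred = 2.0 * (y_pred - y)
print(torch.allclose(a.grad, grad_y_pred.sum()))        # True
print(torch.allclose(b.grad, (grad_y_pred * x).sum()))  # True
```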