Custom loss functions

That helps, thanks a bunch!


I have a CNN architecture as follows:


Conv1: (3, 32, 5, 1, 0)
Conv2: (32, 64, 5, 1, 0)
Conv3: (64, 128, 5, 1, 0)
Conv4: (128, 256, 5, 1, 0)
The output layer is itself a convolutional layer:
Conv5: (256, 10, *, 1, 0)
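Assuming the tuples follow PyTorch's (in_channels, out_channels, kernel_size, stride, padding) ordering, the feature extractor could be sketched like this (the layer arrangement, activations, and input size are illustrative; Conv5's kernel size is given as "*" above, so it is omitted):

```python
import torch
import torch.nn as nn

# Hypothetical reconstruction: each tuple is assumed to be
# (in_channels, out_channels, kernel_size, stride, padding).
features = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=5, stride=1, padding=0),    # Conv1
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=5, stride=1, padding=0),   # Conv2
    nn.ReLU(),
    nn.Conv2d(64, 128, kernel_size=5, stride=1, padding=0),  # Conv3
    nn.ReLU(),
    nn.Conv2d(128, 256, kernel_size=5, stride=1, padding=0), # Conv4
    nn.ReLU(),
)

x = torch.randn(1, 3, 64, 64)
out = features(x)
# each 5x5 conv with stride 1 and no padding shrinks H and W by 4
print(out.shape)
```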

All convolutional layers are customized via torch.autograd.Function, i.e. they have forward and backward defined in them.

I am using two loss functions:

class my_Loss_func(torch.nn.Module):
    def forward(self, features, labels):
        ...
        return loss1

In training loop:

loss1 = my_Loss_func()(conv4_output, labels)
loss2 = torch.nn.CrossEntropyLoss()(final_output, labels)  # the module must be instantiated before being called
loss = loss1 + loss2

In doing so, I think the backward pass would still execute, but incorrectly, because backpropagation happens twice through conv4: once for Loss1 and once for Loss2, since the losses are added. (So will the update take place twice as well?)

What I want is for conv4 to be updated first (only once, after backpropagating once), then conv3, and so on down to conv1.

Your code snippet is a bit unclear, so I’m not completely sure what your use case is.
However, the backward call will calculate gradients of both losses w.r.t. the parameters used to calculate these losses.
If some parameters were used in both loss calculations, the gradient will be accumulated for these parameters.

Actually, Loss1 is contrastive loss whose inputs are features from conv4 and labels. In addition to that, Loss2 is just the cross entropy loss whose inputs are output of the network after conv5 and labels.

Task: Image classification

So, is it correct if I say :

  1. The parameters of conv4 will get updated twice, once according to Loss1 and the second time according to Loss2 ?

  2. The parameters of layers other than conv4 will also get updated according to Loss1 ?

I am guessing 1. should happen and 2. shouldn’t.
What's your opinion?

  1. No, the parameters will get updated in the optimizer.step() call. The gradients of parameters of reused modules will get accumulated, if the corresponding computation graph uses them.

A small illustration of my last post:
Assuming your model architecture is:

input -> conv1 -> conv2 -> conv3 -> conv4 -> conv5 -> output -> loss2
                                           \-> conv4_output -> loss1

If this is the workflow of the loss calculations, then loss1.backward() will accumulate gradients for the parameters in conv1,2,3,4, while loss2.backward() will accumulate gradients for the parameters in conv1,2,3,4,5.
The same applies for the sum of both losses.
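A minimal sketch of this workflow (the tiny layer sizes and the stand-in contrastive loss are illustrative only):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

conv4 = nn.Conv2d(3, 4, kernel_size=3)   # stand-in for conv1..conv4
conv5 = nn.Conv2d(4, 10, kernel_size=1)  # stand-in output layer

x = torch.randn(2, 3, 8, 8)
labels = torch.randint(0, 10, (2,))

feat = conv4(x)                          # branch point: conv4 features
logits = conv5(feat).mean(dim=(2, 3))    # pool to (N, 10) logits

loss1 = feat.pow(2).mean()                     # stand-in for the contrastive loss
loss2 = nn.CrossEntropyLoss()(logits, labels)  # cross entropy on final output

(loss1 + loss2).backward()

# conv4's gradients receive contributions from both losses (summed),
# conv5's gradients only come from loss2.
print(conv4.weight.grad is not None, conv5.weight.grad is not None)
```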

Hello ptrblck,

I am trying to create a custom loss function for a CNN used for regression. The input is a binary image (600x600) whose background is black and foreground is white. The ground truth associated with each input is an image with values ranging from 0 to 255, normalized between 0 and 1.

x = input, ground truth = y, and predicted output = y_hat

I tried to penalize the foreground with the custom loss function below, but it didn't improve the result. I am wondering whether my idea is right, and if so, what's wrong with my custom function?

mse = nn.MSELoss(reduction='mean')

def criterion(y, y_hat, x, loss, weight=0.1):
    # x[:, 0:1] is the input mask
    y_hat_modified = torch.where(x[:, 0:1] == 1, weight * y_hat, y_hat)
    return loss(y, y_hat_modified)

I created a topic for it and you can see more detailed info there.
custom loss function for regression in cnn
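As an illustration of the idea, a common variant is to weight the squared error itself on foreground pixels, rather than scaling the prediction y_hat; the weight value and tensor shapes here are arbitrary:

```python
import torch

def weighted_mse(y, y_hat, x, weight=0.1):
    # per-pixel squared error
    se = (y - y_hat) ** 2
    # scale the *error* (not the prediction) where the input mask is foreground
    w = torch.where(x[:, 0:1] == 1, torch.full_like(se, weight), torch.ones_like(se))
    return (w * se).mean()

y = torch.rand(2, 1, 4, 4)
y_hat = torch.rand(2, 1, 4, 4)
x = torch.randint(0, 2, (2, 1, 4, 4)).float()
loss = weighted_mse(y, y_hat, x)
```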

Yes, that’s where I’m confused.

So, for all the parameters of conv1-4, will there be two gradient values stored in .grad, or only one value because of an override?

optimizer.step() updates all the parameters based on parameter.grad. So I'm unsure whether both gradients in .grad are used for the update, or whether they are added first, and then... I don't know.

There will be one .grad value containing the sum of the gradients calculated during the backward passes.
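A quick way to see the accumulation (a scalar example for clarity):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

loss1 = 3 * w   # d(loss1)/dw = 3
loss2 = 5 * w   # d(loss2)/dw = 5

loss1.backward()
print(w.grad)   # tensor(3.)

loss2.backward()
print(w.grad)   # tensor(8.) -> gradients are summed, not overwritten
```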

Alright , thanks for the explanation :hugs:

So how is the backward method inherited for custom functions? In case I have something more complicated like this:

import torch
from torch.nn.modules.loss import _Loss

class GaussianLoss(_Loss):
    def __init__(self, sigma=None, abs_loss=None):
        super(GaussianLoss, self).__init__()
        assert sigma is not None
        assert abs_loss is not None
        self.sigma = sigma        # forward uses self.sigma, so it has to be stored
        self.abs_loss = abs_loss

    def forward(self, d):
        gaussian_val = torch.exp((-d).div(self.sigma))
        return gaussian_val

In other words, does autograd know how to take the derivative of exp(-d/sigma) w.r.t. d (which is -(1/sigma)*exp(-d/sigma), btw)?

Yes, Autograd will be able to backpropagate all PyTorch functions, if they have a valid .grad_fn (the majority of PyTorch ops have it, unless the operation is not differentiable).
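For example, you can check this directly by comparing autograd's gradient of exp(-d/sigma) with the analytic derivative -(1/sigma)*exp(-d/sigma):

```python
import torch

sigma = 2.0
d = torch.tensor([0.5, 1.0, 3.0], requires_grad=True)

gaussian_val = torch.exp(-d / sigma)
gaussian_val.sum().backward()

# analytic derivative of exp(-d/sigma) w.r.t. d
analytic = -(1.0 / sigma) * torch.exp(-d / sigma)
print(torch.allclose(d.grad, analytic))  # True
```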


Thanks, so as long as I’m using torch.exp and _Loss base class, it should work fine?

You don’t need to use _Loss as the base class, but can use nn.Module instead.

Is this because 'reduce' was deprecated in favor of 'reduction'?

You are not using reduce or reduction anywhere in your code and just store the sigma value in the class, so I’m unsure why you would need the _Loss base class.

I guess I just assumed it'd be the base class that all loss functions must inherit from, since it's what I saw in torch.nn.modules.loss.

You could still derive from _Loss, if you want to set the reduction parameter using the legacy checks as seen here.
This would also mean that you would call:

super(MSELoss, self).__init__(size_average, reduce, reduction)

in the __init__ method and could use self.reduction in the forward.
However, if you don’t need the reduction argument, you can just use nn.Module.
In fact, you could even use a plain Python class, as no parameters or buffers are registered in your custom loss function.
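If you do want a reduction argument without the legacy _Loss machinery, a plain nn.Module works fine; this sketch reuses the GaussianLoss example from above (the sigma value is arbitrary):

```python
import torch
import torch.nn as nn

class GaussianLoss(nn.Module):
    def __init__(self, sigma=1.0, reduction='mean'):
        super().__init__()
        self.sigma = sigma
        self.reduction = reduction

    def forward(self, d):
        gaussian_val = torch.exp(-d / self.sigma)
        if self.reduction == 'mean':
            return gaussian_val.mean()
        elif self.reduction == 'sum':
            return gaussian_val.sum()
        return gaussian_val  # 'none'

criterion = GaussianLoss(sigma=2.0)
loss = criterion(torch.ones(4))
```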


Hi @ptrblck, could you answer some questions about a custom loss function? I use an autoencoder to reconstruct a signal (input: x, output: y). The autoencoder is made of CNNs, and I want to change its weights, i.e. the weights in autoencoder.parameters(). I made a custom loss function using numpy and scipy, but I don't know how to write the backward function with respect to the weights of the autoencoder. Here is my loss function. If you know how to write it, please tell me; it's a great matter to me. Thank you!

import numpy as np
import torch
import torch.nn as nn
from torch.autograd import Function

class autoencoderlossFuction(Function):
    @staticmethod
    def forward(ctx, x, y, M, T):
        x = x.detach().numpy()
        x_std = np.std(x)
        x_mean = np.mean(x)
        x = (x - x_mean) / x_std

        y = y.detach().numpy()
        y_std = np.std(y)
        y_mean = np.mean(y)
        y = (y - y_mean) / y_std

        N = len(x)
        XmT = np.zeros((N, M + 1))
        YmT = np.zeros((N, M + 1))
        for m in range(M + 1):
            XmT[m * T:, m] = x[0:N - m * T]
        for m in range(M + 1):
            YmT[m * T:, m] = y[0:N - m * T]

        ctx.save_for_backward(torch.from_numpy(x), torch.from_numpy(y))
        # NOTE: the arguments of np.multiply below were garbled in the
        # original post and are left as they appeared:
        ckx = np.sum(np.multiply(,1),,1)))/(np.sum(np.multiply(x,x)))**(M+1)
        cky = np.sum(np.multiply(,1),,1)))/(np.sum(np.multiply(y,y)))**(M+1)
        ckloss = 1 / (cky - ckx) ** 2
        return torch.tensor(ckloss)

    # question ???
    @staticmethod
    def backward(ctx, grad_output):
        grad_output = grad_output.detach().numpy()
        x, y = ctx.saved_tensors
        x = x.numpy()
        y = y.numpy()
        grad_input = ...  # this is the part I don't know how to write
        return torch.Tensor(grad_input), None, None, None

class autoencoderloss(nn.Module):
    def __init__(self, M, T):
        super().__init__()
        self.M = M
        self.T = T

    def forward(self, x, y):
        output = autoencoderlossFuction.apply(x, y, self.M, self.T)
        return output

Based on the provided code snippet I think you could replace all numpy operations with their PyTorch equivalent, which would automatically create the backward pass for you so that you don’t have to manually implement it.

I didn’t offer all the code,some codes are using scipy ,so I must wirte backward function