Forward and derivative functions for error criterion

I have some functions that require storing error values at the current layer as they are passed through. For example, if the nonlinearity of a layer is a sigmoid, the line of code looks like:

grad_output[0].detach()*(Value + sigmoidPrime(F.sigmoid(self.totalOut)))

and I have a defined

def sigmoidPrime(x):
    # expects x to already be the sigmoid output: s' = s * (1 - s)
    return x * (1-x)
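Just to sanity-check that definition: written this way, sigmoidPrime expects its argument to already be the sigmoid *output*, and it matches what autograd computes for the sigmoid (a minimal check, the variable names are mine):

```python
import torch

def sigmoidPrime(x):
    # expects x = sigmoid(z), i.e. the activation output
    return x * (1 - x)

z = torch.randn(5, requires_grad=True)
s = torch.sigmoid(z)
s.sum().backward()  # autograd's d sigmoid(z) / dz lands in z.grad

manual = sigmoidPrime(torch.sigmoid(z)).detach()
print(torch.allclose(z.grad, manual))  # True
```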

I am looking to do the same thing for the top layer, which would require functions for the actual forward and backward steps of what happens inside loss functions. Question 1 is whether there is any generic way to do this, or a place I can look within the PyTorch definitions for these functions. Question 2 is, if not, whether anyone can help me write what these might look like for some common loss functions like MSE and CrossEntropy?

def crossEntropyForward(x):
def crossEntropyPrime(x):


I’m not sure what sigmoidPrime is supposed to represent here?
Also not sure what the grad_output is in your formula above. Can you share more context?

Ah yes, sorry, this code is inside a function registered with register_backward_hook:

def saveAverageD(self, grad_input, grad_output):

sigmoidPrime is the derivative of the sigmoid function, used to calculate the gradient during the backward pass as if autograd weren't already doing it under the hood.

The module just has a Conv2d or Linear layer followed by a sigmoid for the earlier layers, and there it seems to be doing what I want. But at the top layer, where it is just a Linear going into the loss function, I do not do this step and just use grad_output, and it does not appear to be working.
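Roughly, the setup looks like this (a simplified sketch, not my exact model; grad_output[0] in the hook is the gradient of the loss with respect to the module's output):

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        self.totalOut = self.fc(x)          # pre-activation, saved for the hook
        return torch.sigmoid(self.totalOut)

def saveAverageD(module, grad_input, grad_output):
    # grad_output[0] is dLoss / d(module output)
    module.savedGrad = grad_output[0].detach()

net = Net()
net.register_full_backward_hook(saveAverageD)  # the "full" variant avoids the old hook's caveats
out = net(torch.randn(3, 4))
out.sum().backward()
print(net.savedGrad.shape)  # torch.Size([3, 2])
```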

I’m still unsure what your goal here is :smiley:

But if you want the forward and backward functions, you just need to differentiate the functions.
For example, you have mse that does this in the forward: mse(x, y) = (x - y).pow(2).mean()
Then you can differentiate with respect to x and get mse_backward_x(grad_out, x, y) = 2/N * grad_out.expand_as(x) * (x - y)
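To make that concrete, here is a sketch of forward/prime pairs for MSE and softmax cross-entropy, checked against autograd. The function names mirror the ones you asked about; this is one common derivation, not the actual PyTorch internals:

```python
import torch
import torch.nn.functional as F

def mseForward(x, y):
    return (x - y).pow(2).mean()

def msePrime(grad_out, x, y):
    # d/dx mean((x - y)^2) = 2/N * (x - y), scaled by the incoming gradient
    return 2.0 / x.numel() * grad_out * (x - y)

def crossEntropyForward(x, target):
    # x: raw logits of shape (N, C); target: class indices of shape (N,)
    return F.cross_entropy(x, target)

def crossEntropyPrime(grad_out, x, target):
    # d/dx of mean softmax cross-entropy: (softmax(x) - onehot(target)) / N
    p = F.softmax(x, dim=1)
    p[torch.arange(x.size(0)), target] -= 1.0
    return grad_out * p / x.size(0)

# quick check against autograd
x = torch.randn(4, 3, requires_grad=True)
t = torch.tensor([0, 2, 1, 2])
loss = crossEntropyForward(x, t)
loss.backward()
print(torch.allclose(x.grad, crossEntropyPrime(torch.tensor(1.0), x.detach(), t)))  # True
```

Note that crossEntropyPrime takes the raw logits, since F.cross_entropy applies log-softmax internally; that is also why the gradient has the simple softmax-minus-onehot form.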
