Backward of a custom layer crashes

Hi, I’m new to PyTorch. I implemented a custom function to perform the Hadamard product of matrices:

class HadamardProd(autograd.Function):
    #@staticmethod
    def forward(ctx, input, weight, bias=None):
        ctx.save_for_backward(input, weight, bias)
        output = torch.mul(input, weight)
        if bias is not None:
            output += bias
        return output

    #@staticmethod
    def backward(ctx, grad_output):
        input, weight, bias = ctx.saved_variables
        grad_input = grad_weight = grad_bias = None
        if ctx.needs_input_grad[0]:
            grad_input = torch.mul(grad_output, weight)
        if ctx.needs_input_grad[1]:
            grad_weight = torch.sum(grad_output * input, 0)
        if bias is not None and ctx.needs_input_grad[2]:
            grad_bias = torch.sum(grad_output, 0)

        if bias is not None:
            return grad_input, grad_weight, grad_bias
        else:
            return grad_input, grad_weight

I used autograd.gradcheck to check my gradient and it returned True. But when I used the corresponding layer of this function in my network, loss.backward() raised a TypeError saying that torch.mul received an invalid combination of arguments (torch.FloatTensor, Variable). I don’t know what’s wrong with the code.
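
Roughly, I checked it like this (the shapes and tolerances here are just placeholders for the example):

import torch
from torch.autograd import Variable, gradcheck

# double-precision Variables, as gradcheck expects
inp = Variable(torch.randn(2, 3, 4, 5).double(), requires_grad=True)
w = Variable(torch.randn(3, 4, 5).double(), requires_grad=True)
# fresh Function instance per call, bias left out
print(gradcheck(lambda i, wt: HadamardProd()(i, wt), (inp, w), eps=1e-6, atol=1e-4))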

And if I uncomment the line

#@staticmethod

I get an AttributeError:

'torch.FloatTensor' object has no attribute 'save_for_backward'

First, HadamardProd inherits from nn.Function, but the examples all inherit from either torch.nn.Module or torch.autograd.Function. Changing that might help.

It looks like your module is doing simple calculations using PyTorch operations. If that is the case, I would suggest leaving out the backward function; PyTorch will figure it out on its own. This should work. I also changed the in-place add to an ordinary add, though this might not be strictly necessary.

class HadamardProd(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input, weight, bias=None):
        output = torch.mul(input, weight)
        if bias is not None:
            output = output + bias
        return output

That should help you along even though it doesn’t answer your specific question.

Sorry, I typed something wrong. HadamardProd does inherit from torch.autograd.Function in my original code. I don’t know whether leaving out the backward function would work; I’ll try it. Thanks for your suggestions.

Unfortunately, leaving out the backward function doesn’t work. It raises a NotImplementedError.

Weird. My implementation of the inverse square root linear unit (ISRLU) works fine:

class ISRLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, tensor, alpha=1):
        negatives = torch.min(tensor, torch.Tensor([0]))
        nisr = torch.rsqrt(1. + alpha * (negatives ** 2))
        return tensor * nisr

Just guessing: are you using PyTorch 0.3.0?

I guess, yes. My torch version is 0.3.0.post4.

Well, I use this function to build a custom layer Hadamard:

class Hadamard(nn.Module):
    def __init__(self, in_channels, in_height, in_width, bias=True):
        super(Hadamard, self).__init__()

        self.in_channels = in_channels
        self.in_height = in_height
        self.in_width = in_width
        self.weight = nn.Parameter(torch.Tensor(in_channels, in_height, in_width))
        if bias:
            self.bias = nn.Parameter(torch.Tensor(in_channels, in_height, in_width))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        stdv = 0.05
        for weight in self.parameters():
            weight.data.uniform_(-stdv, stdv)

    def forward(self, input):
        return HadamardProd()(input, self.weight, self.bias)

And I use this layer in my network. The forward propagation goes well, but the backward pass crashes with the error above.

I have PyTorch 0.3.0 (without .post4) running on Python 3.6 on Ubuntu.

I can’t see why my ISRLU works without a backward function, and your HadamardProd doesn’t.

You might have some success replacing the call to HadamardProd in forward with the code from HadamardProd.forward.
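
Something like this is what I mean — just a sketch, doing the multiplication directly on the parameters inside Hadamard.forward so autograd can build the backward pass on its own:

# inside the Hadamard module, replacing the current forward
def forward(self, input):
    output = input * self.weight
    if self.bias is not None:
        output = output + self.bias
    return output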

Could it be that the trainable parameters weight and bias cause this crash? I don’t know how to fix it.

I have fixed this bug by modifying the last line in the Hadamard module to return HadamardProd.apply(input, self.weight, self.bias). Umm…, lucky me :wink:
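
In other words, the combination that works for me is: keep the @staticmethod decorators on forward and backward and call the function through .apply instead of instantiating it. A rough sketch of the whole thing (note that newer PyTorch versions use ctx.saved_tensors instead of ctx.saved_variables):

class HadamardProd(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input, weight, bias=None):
        ctx.save_for_backward(input, weight, bias)
        output = torch.mul(input, weight)
        if bias is not None:
            output = output + bias
        return output

    @staticmethod
    def backward(ctx, grad_output):
        input, weight, bias = ctx.saved_variables
        grad_input = grad_weight = grad_bias = None
        if ctx.needs_input_grad[0]:
            grad_input = grad_output * weight
        if ctx.needs_input_grad[1]:
            # sum over the batch dimension so grad_weight matches weight's shape
            grad_weight = torch.sum(grad_output * input, 0)
        if bias is not None and ctx.needs_input_grad[2]:
            grad_bias = torch.sum(grad_output, 0)
        # always return one gradient per forward input (None where not needed)
        return grad_input, grad_weight, grad_bias


# and in the Hadamard module:
def forward(self, input):
    return HadamardProd.apply(input, self.weight, self.bias)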

Now I feel stupid. I should have seen that. :grin:

Still, I appreciate your help. :grinning:
