Adding bias to convolution output

Hi,
This may be a very simple question, but it's troubling me.
I am trying to write convolution as a matrix multiplication using the fold and unfold methods.

My matrix multiplication output after reshaping is (1, 24, 26, 26). I have to add a bias to it, which is a 1-D tensor of shape (24).
I am assuming the bias tensor has to be expanded to the shape (1, 24, 26, 26) such that all 26*26 elements for each of the 24 channels share that channel's original bias value.
Can someone suggest the fastest way to add the bias to the CNN output using PyTorch functions?
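
For reference, here is a minimal sketch of my setup. The input size (1, 3, 28, 28) and the 3x3 kernel are made-up values; I only picked them because they produce the (1, 24, 26, 26) output shape above:

import torch
import torch.nn.functional as F

inp = torch.randn(1, 3, 28, 28)      # assumed input: 28 - 3 + 1 = 26
weight = torch.randn(24, 3, 3, 3)    # (out_channels, in_channels, kH, kW)

cols = F.unfold(inp, kernel_size=3)  # (1, 27, 676): one 3x3x3 patch per column
out = weight.reshape(24, -1) @ cols  # (1, 24, 676) via broadcasted matmul
out = out.reshape(1, 24, 26, 26)     # the output I need to add the bias to

# sanity check against the built-in convolution (no bias yet)
print(torch.allclose(out, F.conv2d(inp, weight), atol=1e-5))  # True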

Thanks!

For example:

import torch
import torch.nn as nn

bias = nn.Parameter(torch.zeros(24, 1, 1))

x += bias

The in-place op += is fine for autograd.
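
A quick way to convince yourself (a sketch; the plain matmul here just stands in for your unfold-based one):

import torch
import torch.nn as nn

w = torch.randn(24, 27, requires_grad=True)
cols = torch.randn(1, 27, 26 * 26)
x = (w @ cols).reshape(1, 24, 26, 26)  # non-leaf matmul output

bias = nn.Parameter(torch.zeros(24, 1, 1))
x += bias                              # broadcasts over all 26*26 positions
x.sum().backward()
print(bias.grad.shape)                 # torch.Size([24, 1, 1])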

Hi @googlebot, thanks for the reply.
I already have the bias value stored in a given tensor, so I can't initialize it again the way you suggested.
In my case, the bias comes from the state_dict of a pretrained network. I am trying to port it to the matrix-multiplication domain for inference purposes only.

Any comments on that?

Then just reshape it to (24, 1, 1) instead; this aligns the dimensions so that it broadcasts with the other summand.
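
Something along these lines (a sketch; 'conv1.bias' is a made-up key, substitute whatever your state_dict uses):

bias = state_dict['conv1.bias']     # pretrained bias, shape (24,)
out = out + bias.reshape(24, 1, 1)  # broadcasts against (1, 24, 26, 26)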

Hi @googlebot, I put together this sample test to illustrate what I want.

# this is like the output of the matmul, say
a = torch.tensor([[[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                   [[10, 11, 12], [13, 14, 15], [7, 8, 16]]]])

a.shape
# prints torch.Size([1, 2, 3, 3])

# this is like the bias, say
b = torch.tensor([4, 5])
print(b.shape)
# prints torch.Size([2])

b.reshape(1, 2, 1, 1)
# reshaping as you said

# adding
a + b

What I expect is

output = tensor([[[[ 1+4,  2+4,  3+4],
          [ 4+4,  5+4,  6+4],
          [ 7+4,  8+4,  9+4]],

         [[10+5, 11+5, 12+5],
          [13+5, 14+5, 15+5],
          [ 7+5,  8+5, 16+5]]]])

This is the way the bias should be added to the matmul output, right?

What I get is

RuntimeError                              Traceback 
----> 1 a+b

RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 3

Please help me with this.

In [4]: b = b.reshape(1,2,1,1)                                                                

In [5]: a + b                                                                                 
Out[5]: 
tensor([[[[ 5,  6,  7],
          [ 8,  9, 10],
          [11, 12, 13]],

         [[15, 16, 17],
          [18, 19, 20],
          [12, 13, 21]]]])

Does b.reshape return a copy? Seems like it’s not an in-place operation.

It is a new tensor, but it refers to the same memory (when possible). So, yes, it acts like a copy.
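
You can see both halves of that in a quick sketch (same b as above):

import torch

b = torch.tensor([4, 5])
c = b.reshape(1, 2, 1, 1)
print(c.data_ptr() == b.data_ptr())  # True: same underlying storage, no copy
c[0, 1, 0, 0] = 99
print(b)                             # tensor([ 4, 99]): the write shows through
print(b.shape)                       # torch.Size([2]): b's own shape is unchanged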

@dunefox your method worked. Apparently just calling b.reshape is not an in-place operation, so assigning the result back to b works in this case.
Thanks for the help @googlebot @dunefox!!
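
For anyone who finds this later, here is the full corrected version of the sample test:

import torch

a = torch.tensor([[[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                   [[10, 11, 12], [13, 14, 15], [7, 8, 16]]]])  # (1, 2, 3, 3)
b = torch.tensor([4, 5])                                        # (2,)
b = b.reshape(1, 2, 1, 1)  # assign the result back; reshape is not in-place
print(a + b)               # adds 4 to channel 0 and 5 to channel 1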
