Adding bias to convolution output

This may be a very simple question, but it's troubling me.
I am trying to implement convolution as matrix multiplication using the fold and unfold methods.

My matrix multiplication output after reshaping is (1, 24, 26, 26). I have to add a bias to it, which is a 1-D tensor of shape (24).
I assume the bias tensor has to be expanded to the shape (1, 24, 26, 26) so that all 26*26 spatial elements at each of the 24 channel indices share the original value.
Can someone suggest the fastest way to add the bias to the CNN output using PyTorch functions?


For example:

bias = nn.Parameter(torch.zeros(24, 1, 1))

x += bias

The in-place op += is fine for autograd.

Hi @googlebot, thanks for the reply.
I already have the bias value stored in a given tensor; I can't initialize it again the way you suggested.
In my case, the bias comes from the state_dict of a pretrained network. I am moving it to the matrix-multiplication domain for inference purposes only.

Any comments on that?

Then just reshape it to (24, 1, 1) instead; this aligns the dimensions so that it broadcasts with the other summand.
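A minimal sketch of that suggestion, assuming the pretrained bias has already been loaded from the state_dict into a plain tensor (the tensor values here are random stand-ins):

```python
import torch

# stand-in for the matmul-based convolution output: (N, C, H, W)
out = torch.randn(1, 24, 26, 26)

# stand-in for the pretrained bias, shape (24,)
bias = torch.randn(24)

# reshape to (24, 1, 1) so broadcasting expands it over the 26x26 spatial map
out = out + bias.reshape(24, 1, 1)  # equivalent: bias[:, None, None]

print(out.shape)  # torch.Size([1, 24, 26, 26])
```

Broadcasting aligns trailing dimensions, so (24, 1, 1) lines up with the (24, 26, 26) tail of the output and the leading batch dimension is filled in automatically.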

Hi @googlebot, I created this sample test to illustrate what I want.

# this is like the output of the matmul, say
a = torch.tensor([[[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                   [[10, 11, 12], [13, 14, 15], [7, 8, 16]]]])

# a.shape is torch.Size([1, 2, 3, 3])

# this is like the bias, say
b = torch.tensor([4, 5])
# b.shape is torch.Size([2])

# reshaping as you said


What I expect is

output = tensor([[[[ 1+4,  2+4,  3+4],
          [ 4+4,  5+4,  6+4],
          [ 7+4,  8+4,  9+4]],

         [[10+5, 11+5, 12+5],
          [13+5, 14+5, 15+5],
          [ 7+5,  8+5, 16+5]]]])

This is how the bias should be added to the matmul output, right?

What I get is

RuntimeError                              Traceback 
----> 1 a+b

RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 3

Please help me with this

In [4]: b = b.reshape(1,2,1,1)                                                                

In [5]: a + b                                                                                 
tensor([[[[ 5,  6,  7],
          [ 8,  9, 10],
          [11, 12, 13]],

         [[15, 16, 17],
          [18, 19, 20],
          [12, 13, 21]]]])

Does b.reshape return a copy? It seems like it's not an in-place operation.

It is a new tensor, but it refers to the same memory when possible (a view). So from the variable's point of view, yes, it acts like a copy: you have to assign the result back.
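A quick sketch showing both halves of that: `b` itself is untouched by the call, yet the result shares the same underlying storage:

```python
import torch

b = torch.tensor([4, 5])
r = b.reshape(1, 2, 1, 1)

# reshape returns a new tensor object, so `b` itself is unchanged...
print(b.shape)  # torch.Size([2])
print(r.shape)  # torch.Size([1, 2, 1, 1])

# ...but it is a view over the same storage when possible
print(r.data_ptr() == b.data_ptr())  # True
```

This is why `b.reshape(...)` on its own has no visible effect: the reshaped view must be assigned back (e.g. `b = b.reshape(1, 2, 1, 1)`) before using `b` in the addition.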

@dunefox your method worked. Apparently just calling reshape on b is not an in-place operation, so assigning the result back to b works in this case.
Thanks for the help @googlebot @dunefox!
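For completeness, here is a sketch of the full unfold + matmul + bias pipeline this thread is about, checked against F.conv2d. The weight and bias are random stand-ins for the pretrained state_dict values:

```python
import torch
import torch.nn.functional as F

N, C_in, H, W = 1, 3, 28, 28
C_out, K = 24, 3

x = torch.randn(N, C_in, H, W)
weight = torch.randn(C_out, C_in, K, K)  # stand-in for pretrained weight
bias = torch.randn(C_out)                # stand-in for pretrained bias

# convolution as matmul: unfold patches, multiply with flattened weight
cols = F.unfold(x, kernel_size=K)               # (N, C_in*K*K, L), L = 26*26
out = weight.view(C_out, -1) @ cols             # (N, C_out, L)
out = out.view(N, C_out, H - K + 1, W - K + 1)  # (1, 24, 26, 26)

# add the bias via broadcasting
out = out + bias.reshape(1, C_out, 1, 1)

# should match the built-in convolution up to floating-point error
ref = F.conv2d(x, weight, bias)
print(torch.allclose(out, ref, atol=1e-5))  # True
```

The bias reshape to (1, C_out, 1, 1) is equivalent to (C_out, 1, 1) here, since broadcasting adds the leading batch dimension either way.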
