Let’s say I have a tensor x with dimensions [batch, channels, H, W].
Then I have another tensor b that holds a bias value for each channel, so it has dims [channels], and I want
y = x + b
Is there a nice way to broadcast this over H and W, for each channel, for each sample in the batch, without using a loop?
If I’m convolving, I know I can use the bias argument of the conv function to achieve this, but I’m wondering whether it can be done just with primitive ops (no explicit looping).
If you reshape b to size 1 x channels x 1 x 1, it is possible. You can do something like below:
y = x + b[None, :, None, None]
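A minimal runnable sketch of this suggestion (the shapes and bias values below are just illustrative):

```python
import torch

# Illustrative shapes: batch=2, channels=3, H=4, W=5
x = torch.randn(2, 3, 4, 5)
b = torch.tensor([0.1, 0.2, 0.3])  # one bias per channel, shape [3]

# Indexing with None reshapes b to [1, channels, 1, 1],
# so it broadcasts over batch, H, and W
y = x + b[None, :, None, None]

# Equivalent alternatives
y2 = x + b.view(1, -1, 1, 1)
y3 = x + b.unsqueeze(0).unsqueeze(-1).unsqueeze(-1)

assert torch.equal(y, y2) and torch.equal(y, y3)
```

All three forms produce the same [1, channels, 1, 1] shape for b; which one to use is just a matter of taste.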
I tried the opposite, y = x[None, :, None, None] + b, and gave up haha. Basically the trick is to expand b to match the axis layout of x.
Thanks a lot for the answer
Basically, we use None in the array index to introduce a singleton dimension (similar to tensor.unsqueeze(dim=N) for some dimension N).
When you use x[None, :, None, None], three more singleton dimensions are introduced into x, which makes the operation an addition between a 7-dimensional tensor and the 1-dimensional b, which is not what you want.
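To see why the opposite indexing fails, here is a quick shape check (same illustrative shapes as before):

```python
import torch

x = torch.randn(2, 3, 4, 5)   # [batch, channels, H, W]
b = torch.randn(3)            # [channels]

# Each None inserts a new singleton axis at its position, so the
# already-4D x becomes 7-dimensional: [1, batch, 1, 1, channels, H, W]
x7 = x[None, :, None, None]
print(x7.shape)  # torch.Size([1, 2, 1, 1, 3, 4, 5])

# Broadcasting aligns trailing dimensions, so x7 + b tries to match
# b's length-3 axis against W=5 and raises a RuntimeError.
try:
    _ = x7 + b
except RuntimeError as e:
    print("broadcast failed:", e)
```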