Hey guys,
So I have a batch of convolutional filters and a batch of images. What is the best way to perform the per-element convolution so that it is executed in parallel, without iterating over the batch indices?
Thanks!
So, you have filters in a tensor w of shape
[batch_size, out_channels, in_channels, kernel_height, kernel_width],
and images in a tensor x of shape
[batch_size, in_channels, in_height, in_width],
and you want an output of shape
[batch_size, out_channels, out_height, out_width],
where the i-th output is the convolution of the i-th image with the i-th filter?
You can do it by merging the batch and channel dimensions together and using a grouped convolution with groups=batch_size:

o = torch.nn.functional.conv2d(
    x.view(1, batch_size * in_channels, x.size(2), x.size(3)),
    w.view(batch_size * out_channels, in_channels, w.size(3), w.size(4)),
    groups=batch_size)
o = o.view(batch_size, out_channels, o.size(2), o.size(3))
It works! But is it the best way to do it? I don’t know.
Let me know if this ends up being faster.
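For anyone who wants to check this end to end, here is a minimal self-contained sketch (the tensor sizes are made up for illustration) that runs the grouped-convolution trick and compares it against the naive per-sample loop:

```python
import torch
import torch.nn.functional as F

# Illustrative sizes (hypothetical, pick your own)
batch_size, in_channels, out_channels = 4, 3, 8
in_height, in_width = 16, 16
kh, kw = 3, 3

x = torch.randn(batch_size, in_channels, in_height, in_width)
w = torch.randn(batch_size, out_channels, in_channels, kh, kw)

# Grouped-convolution trick: fold the batch dimension into the
# channel dimension, then set groups=batch_size so each sample's
# channels are convolved only with that sample's filters.
o = F.conv2d(
    x.view(1, batch_size * in_channels, in_height, in_width),
    w.view(batch_size * out_channels, in_channels, kh, kw),
    groups=batch_size)
o = o.view(batch_size, out_channels, o.size(2), o.size(3))

# Reference: explicit loop over the batch dimension
ref = torch.stack([F.conv2d(x[i:i + 1], w[i]).squeeze(0)
                   for i in range(batch_size)])

print(o.shape)                                # torch.Size([4, 8, 14, 14])
print(torch.allclose(o, ref, atol=1e-5))      # True
```

The view on x is safe here because folding batch into channels only merges adjacent dimensions; the groups argument then prevents any cross-sample mixing.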
Hey phan_phan,
Thanks. Your solution is correct. It seems to be ever so slightly faster than simply iterating over the batch dimension.
Cheers