I want to perform convolutions with custom kernels, but in a manner different from standard convolution. In PyTorch, a standard convolution slides each kernel over all input channels, multiplies, and sums the results into one output channel. Instead, I need the number of input and output channels to be equal, with each kernel (one per channel) applied to only a single input channel, and the per-channel outputs concatenated together. My current code looks like this:
```python
import torch
import torch.nn.functional as F

in_channels = out_channels
filters = torch.zeros((batch_size, in_channels, out_channels, f_size, f_size))

for batch in range(batch_size):
    for chan in range(in_channels):
        filter_use = filters[batch, chan].unsqueeze(0)
        out_conv = F.conv2d(x[batch, chan].unsqueeze(0).unsqueeze(0), filter_use)
        ### Perform more operations
        ### Concatenate the results
```
While this gets the work done, it takes a huge amount of time, as you can guess from the nested loop structure. What would be a faster alternative?
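One alternative I have been considering is folding the batch into the channel dimension and running a single grouped convolution with `groups=batch_size * in_channels`, so each (sample, channel) pair gets its own kernel in one `F.conv2d` call. A minimal sketch with illustrative shapes (the names `B`, `C`, `f`, and the random `kernels` tensor here are placeholders, not my real data, and this does not yet include the "more operations" step):

```python
import torch
import torch.nn.functional as F

B, C, H, W, f = 2, 3, 8, 8, 3
x = torch.randn(B, C, H, W)
kernels = torch.randn(B, C, f, f)  # one kernel per (sample, channel) pair

# Fold the batch into the channel dimension so that a single grouped
# convolution applies a distinct kernel to every (sample, channel) pair.
x_flat = x.reshape(1, B * C, H, W)
w_flat = kernels.reshape(B * C, 1, f, f)

# With groups=B*C, each input channel is convolved only with its own kernel.
out = F.conv2d(x_flat, w_flat, groups=B * C)
out = out.reshape(B, C, out.shape[-2], out.shape[-1])
```

This should be equivalent to the nested loop (each output channel depends on exactly one input channel and one kernel), but I am not sure whether it is actually the idiomatic way to vectorize per-sample kernels.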