I need to run a depthwise convolution between a single feature map and multiple (N) filters, and I am wondering what the most efficient way to do this is. Here is a sequential version that works (with N=8), but I am hoping to parallelize it so that adding more filters doesn't add much more time.
import torch
import torch.nn.functional as F

feature_map = torch.rand((1, 1024, 50, 50))
filters = torch.rand((8, 1024, 1, 1))
feature_map, filters = feature_map.cuda(), filters.cuda()

all_correlations = []
for filter_ind in range(filters.shape[0]):
    # Depthwise conv of the feature map with one set of per-channel 1x1 kernels
    all_correlations.append(F.conv2d(feature_map,
                                     filters[filter_ind, :].unsqueeze(1),
                                     groups=feature_map.shape[1]))
all_correlations = torch.cat(all_correlations)  # (8, 1024, 50, 50)
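
For reference, this is the kind of batched version I have been considering (I haven't benchmarked it, so I'm not sure it is actually the fastest): since the kernels are 1x1, a broadcast multiply should reproduce the loop, and a single grouped conv2d call should give the same result while also covering larger kernels. Is there a better approach?

import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
feature_map = torch.rand((1, 1024, 50, 50), device=device)
filters = torch.rand((8, 1024, 1, 1), device=device)

# Option 1: with 1x1 kernels, the depthwise conv is just a per-channel scaling,
# so a broadcast multiply reproduces the whole loop in one shot:
# (1, 1024, 50, 50) * (8, 1024, 1, 1) -> (8, 1024, 50, 50)
by_broadcast = feature_map * filters

# Option 2: a single grouped conv2d call (would also generalize to larger kernels).
# Reorder the weights so the 8 filters for each channel sit in consecutive
# output channels of their group, then undo that ordering on the result.
n_filters, n_channels = filters.shape[0], filters.shape[1]
weight = filters.permute(1, 0, 2, 3).reshape(n_filters * n_channels, 1, 1, 1)
out = F.conv2d(feature_map, weight, groups=n_channels)      # (1, 8 * 1024, 50, 50)
by_grouped = out.view(n_channels, n_filters, *out.shape[-2:]).permute(1, 0, 2, 3)

print(torch.allclose(by_broadcast, by_grouped))             # should print True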