Hi! I am implementing a TDNN (time-delay neural network) with PyTorch, and its structure is similar to a 1D convolution. The only difference is that instead of a contiguous 1D window (for example 1 * 9), a TDNN uses a discontinuous 1D window (for example 1 * 3 + skip 3 + 1 * 3 + skip 3 + 1 * 3). Currently I am using index_select to implement it, and I wonder whether there is a more elegant way to do it. Thanks!
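For reference, a minimal sketch of the index_select approach described above might look like this (the tap offsets `[0,1,2, 6,7,8, 12,13,14]` are one illustrative choice of the 1 * 3 + skip 3 pattern; a full layer would gather one such window per output step):

```python
import torch
import torch.nn as nn

# Offsets of the non-skipped taps: 1*3 + skip 3 + 1*3 + skip 3 + 1*3
taps = torch.tensor([0, 1, 2, 6, 7, 8, 12, 13, 14])

x = torch.randn(4, 2, 50)             # (batch, channels, time)
window = x.index_select(2, taps)      # gather only the non-skipped frames
conv = nn.Conv1d(2, 3, kernel_size=len(taps))
y = conv(window)                      # one output frame for this window
print(y.shape)                        # torch.Size([4, 3, 1])
```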
You can initialize some of the conv weights to zero and make sure they don’t change. For example, if you want 1 * 3 + skip 3 + 1 * 3 + skip 3 + 1 * 3, you can use the mask [1, 1, 1, 0, 0, 0, 1, 1, 1].
Update: I came up with the solution below as a hack. A much simpler and better way is to use the PyTorch functional API and supply your kernels to F.conv1d directly, but then you would still need to stop the masked parameters from changing by managing the gradients yourself.
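A minimal sketch of that functional route, under the same mask as above: zero the skipped taps in the kernel, then re-apply the mask to the gradient with a tensor hook so the zeros never change.

```python
import torch
import torch.nn.functional as F

mask = torch.tensor([1., 1., 1., 0., 0., 0., 1., 1., 1.])

# (out_channels, in_channels, taps), with the skipped taps zeroed
weight = (torch.randn(3, 2, 9) * mask).requires_grad_()
weight.register_hook(lambda g: g * mask)  # zero grads at the skipped taps

x = torch.randn(4, 2, 50)
y = F.conv1d(x, weight)
y.sum().backward()
print(weight.grad[0, 0])  # gradient is exactly zero at positions 3..5
```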
```python
import torch
import torch.nn as nn


class TDLayer(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_mask,
                 stride=1, padding=0, bias=True):
        super(TDLayer, self).__init__()
        # pass stride/padding/bias as keywords so bias doesn't land in
        # Conv1d's dilation slot
        self.conv_layer = nn.Conv1d(in_channels, out_channels, len(kernel_mask),
                                    stride=stride, padding=padding, bias=bias)
        # not an nn.Parameter, so it won't be updated by the optimizer
        self.conv_mask = self._make_conv_mask(self.conv_layer, kernel_mask)
        self._init_kernels(self.conv_layer)
        # tensor hook that masks the gradient on every backward pass
        self.conv_layer.weight.register_hook(self._mask_hook)

    def forward(self, x):
        return self.conv_layer(x)

    def _init_kernels(self, conv_layer):
        # zero out the weights at the skipped taps
        conv_layer.weight.data = conv_layer.weight.data * self.conv_mask

    def _make_conv_mask(self, conv_layer, kernel_mask):
        return kernel_mask.expand_as(conv_layer.weight)

    def _mask_hook(self, grad):
        return grad * self.conv_mask
```
You can use this in place of a Conv1d layer with:
my_layer = TDLayer(2, 3, torch.Tensor([1,1,1,0,0,0,1,1,1]))
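As a quick sanity check of the masking recipe (shown here on a plain nn.Conv1d rather than the class above, so the snippet is self-contained): after an optimizer step, the weights at the skipped taps should still be exactly zero.

```python
import torch
import torch.nn as nn

mask = torch.tensor([1., 1., 1., 0., 0., 0., 1., 1., 1.])
layer = nn.Conv1d(2, 3, kernel_size=9)
with torch.no_grad():
    layer.weight.mul_(mask)                  # zero out the skipped taps
layer.weight.register_hook(lambda g: g * mask)

opt = torch.optim.SGD(layer.parameters(), lr=0.1)
loss = layer(torch.randn(4, 2, 50)).sum()
loss.backward()
opt.step()
print(layer.weight[:, :, 3:6].abs().max())   # still exactly 0
```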