Hi there.
I currently have the following setup: nn.Conv1d(IN_SIZE, IN_SIZE, WING * 2 + 1, padding=WING, groups=IN_SIZE)
Now I want to implement something I have coined a “partially dilated convolution”. In essence, I want a convolutional layer that has a “dense island” in the middle. Something like this, where 0 represents a gap due to dilation and 1 a kernel weight: 101010101111101010101. Inference is done on the centre element.
I was wondering if you could suggest the simplest way to implement this kind of layer?

If I understand your use case correctly, the simplest approach would probably
be to instantiate two Conv1ds – one conv_dense with dilation = 1, and a
second conv_dilated with, say, dilation = 2. Pass the input to your “partially
dilated convolution” layer through both convolutions and then sum the results.
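
A minimal sketch of this two-convolution approach, assuming a depthwise setup like the one in the question. The class name PartiallyDilatedConv1d and the sizes (a 5-wide dense island, 11 dilated taps, dilation 2) are illustrative assumptions chosen to reproduce the 21-wide pattern 101010101111101010101; they are not from the original post.

```python
import torch
import torch.nn as nn


class PartiallyDilatedConv1d(nn.Module):
    """Sum of a dense and a dilated depthwise Conv1d (hypothetical name)."""

    def __init__(self, channels, dense_size=5, dilated_size=11, dilation=2):
        super().__init__()
        # Dense island: ordinary convolution, padded so the length is preserved.
        self.conv_dense = nn.Conv1d(
            channels, channels, dense_size,
            padding=dense_size // 2, groups=channels,
        )
        # Dilated wings: the effective width is dilation * (dilated_size - 1) + 1,
        # so the padding is scaled by the dilation to preserve the length.
        # A second bias would be redundant after summing, so it is disabled here.
        self.conv_dilated = nn.Conv1d(
            channels, channels, dilated_size,
            padding=dilation * (dilated_size // 2), dilation=dilation,
            groups=channels, bias=False,
        )

    def forward(self, x):
        # Both branches see the same input; their outputs are summed.
        return self.conv_dense(x) + self.conv_dilated(x)


if __name__ == "__main__":
    layer = PartiallyDilatedConv1d(channels=4)
    x = torch.randn(2, 4, 50)  # (batch, channels, length)
    print(layer(x).shape)      # same shape as the input
```

With these sizes the dilated branch also places taps inside the island (at its centre and its two edges), which is the overlap between the two sets of weights.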

Some of the weights of conv_dilated will overlap with the “dense island”
implemented in conv_dense so there will be some redundancy between
the two sets of weight parameters. But that should be okay – most neural
networks have some (perhaps less obvious) redundancy and train and
predict just fine.

(You could also store the left dilated wing, the dense island, and the right
dilated wing in three separate trainable parameters, assemble your
“partially dilated” kernel from them, and then call the functional form, torch.nn.functional.conv1d(), with your assembled kernel. This
would avoid the redundancy and might prove to run faster, but would
be a bit of a nuisance to implement.)
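
One way the three-parameter variant could look, as a sketch only. The class name AssembledPartiallyDilatedConv1d, the helper names, the initialisation scale, and the sizes (4 taps per wing, a 5-wide island, dilation 2) are assumptions picked to match the 101010101111101010101 pattern for a depthwise layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AssembledPartiallyDilatedConv1d(nn.Module):
    """Depthwise conv whose kernel is assembled from three parameters (hypothetical)."""

    def __init__(self, channels, wing_taps=4, island_size=5, dilation=2):
        super().__init__()
        self.channels = channels
        self.dilation = dilation
        # One row of weights per channel, since the layer is depthwise.
        self.left = nn.Parameter(torch.randn(channels, wing_taps) * 0.1)
        self.island = nn.Parameter(torch.randn(channels, island_size) * 0.1)
        self.right = nn.Parameter(torch.randn(channels, wing_taps) * 0.1)
        self.bias = nn.Parameter(torch.zeros(channels))

    def _dilate(self, taps, zeros_first):
        # Insert (dilation - 1) zeros next to every tap:
        # [a, b] -> [a, 0, b, 0] or, with zeros_first, [0, a, 0, b].
        gap = self.dilation - 1
        pad = (gap, 0) if zeros_first else (0, gap)
        return F.pad(taps.unsqueeze(-1), pad).flatten(1)

    def assemble_kernel(self):
        kernel = torch.cat(
            [
                self._dilate(self.left, zeros_first=False),
                self.island,
                self._dilate(self.right, zeros_first=True),
            ],
            dim=1,
        )
        # conv1d expects (out_channels, in_channels / groups, width).
        return kernel.unsqueeze(1)

    def forward(self, x):
        kernel = self.assemble_kernel()
        return F.conv1d(
            x, kernel, self.bias,
            padding=kernel.shape[-1] // 2, groups=self.channels,
        )
```

The kernel is rebuilt on every forward pass, so the zeros are never trained, and each channel carries 13 weights (4 + 5 + 4) with no overlap between them.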

Hi K. Frank,
thank you for your answer
I will probably just go for the first approach, since I’m short on time, but I will give the second one a try as well, simply because I think it would be a good exercise.

Just to make sure I understand that approach correctly: one way to achieve this would be to apply a mask to the wings’ gradients, which zeros every 2nd gradient for the wings (similar to here). Essentially doing manual dilation, right?

I wasn’t imagining doing anything with the gradients (like masking them after
they’ve been computed). Maybe you could make such a scheme work, but it
would seem quite roundabout to me.

I was suggesting encoding, say, the “left wing” in a trainable parameter, torch.tensor([a, b, c, d], requires_grad=True), and then
“dilating” it into the relevant portion of the kernel you then pass to conv1d(), torch.tensor([a, 0, b, 0, c, 0, d, 0, ...]). Autograd will track
gradients back through your assembled kernel to your trainable parameter
and its a, b, c, d elements.
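
A small sketch of that gradient flow, with made-up values for a, b, c, d. The dilated tensor is built with torch.stack() rather than by wrapping the elements in a fresh torch.tensor() call, because torch.tensor() copies its data and would not keep the autograd connection; the arange weighting is just an arbitrary stand-in for a loss.

```python
import torch

# Trainable "left wing" with illustrative values for a, b, c, d.
left_wing = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)

# Interleave with zeros: [a, b, c, d] -> [a, 0, b, 0, c, 0, d, 0].
dilated = torch.stack([left_wing, torch.zeros_like(left_wing)], dim=1).flatten()

# Any scalar function of the assembled kernel will do as a stand-in loss.
loss = (dilated * torch.arange(8.0)).sum()
loss.backward()

print(dilated.tolist())  # [1.0, 0.0, 2.0, 0.0, 3.0, 0.0, 4.0, 0.0]
print(left_wing.grad)    # tensor([0., 2., 4., 6.])
```

Only the positions holding a, b, c, d contribute to the gradient; the interleaved zeros are constants and receive no updates.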