While I do see some murmurs here and there about sharing weights between modules, it seems to be pretty frowned upon.
I believe the better idea is to register the trainable parameters yourself and make a functional call to F.conv2d using those parameters.
Something like this would probably work for you:
import torch
import torch.nn as nn
import torch.nn.functional as F
class Conv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()
        # Register parameters that are trainable
        self.weight = nn.Parameter(torch.randn(out_channels, in_channels, kernel_size, kernel_size))
        self.bias = nn.Parameter(torch.randn(out_channels))

    def forward(self, x, stride, padding, dilation):
        # Do a functional call so we can use the same weights but different arguments
        return F.conv2d(
            x, self.weight, bias=self.bias, stride=stride,
            padding=padding, dilation=dilation
        )
# Example creation of module
conv = Conv(2, 16, 5)
# Example input
x = torch.randn((8, 2, 16, 16))
# Example usage with different dilation values
y1 = conv(x, 1, 2, 1)
y2 = conv(x, 1, 2, 2)
y3 = conv(x, 1, 2, 3)
print(y1.shape, y2.shape, y3.shape)
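With this input, the three calls print torch.Size([8, 16, 16, 16]), torch.Size([8, 16, 12, 12]) and torch.Size([8, 16, 8, 8]): the channel count stays the same, but the spatial size shrinks as the dilation grows beyond what padding=2 compensates for.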
You probably want to consider initializing the weights differently, however; plain torch.randn is not a great default initialization for convolution weights.
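If it helps, here is a minimal sketch of an initialization closer to what nn.Conv2d uses by default (Kaiming-uniform weights, with the bias bounded by the fan-in). The reset_parameters helper is just an illustrative name, not an existing API:

import math
import torch.nn as nn

def reset_parameters(module):
    # Kaiming-uniform init for the weight, roughly mirroring nn.Conv2d's default
    nn.init.kaiming_uniform_(module.weight, a=math.sqrt(5))
    if module.bias is not None:
        # Bound the bias by the fan-in of the weight (in_channels * kH * kW)
        fan_in = module.weight.shape[1] * module.weight.shape[2] * module.weight.shape[3]
        bound = 1 / math.sqrt(fan_in) if fan_in > 0 else 0
        nn.init.uniform_(module.bias, -bound, bound)

# Example: re-initialize the Conv module created above
reset_parameters(conv)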
Hey! Thanks for the answer! It does make sense! Thank you very much! But I was wondering how the gradients are computed. Is it like a normal convolutional layer?
Yes, it should behave exactly like a normal convolution layer. In fact, if you peek at the source code, you’ll notice that the PyTorch convolution modules end up calling their functional counterpart anyway.
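If you want to convince yourself, here is a small sanity check (just a sketch, reusing the Conv module from the answer above): copy the same weights into a regular nn.Conv2d and compare the gradients after a backward pass.

import torch
import torch.nn as nn

conv = Conv(2, 16, 5)
ref = nn.Conv2d(2, 16, 5, stride=1, padding=2, dilation=1)
with torch.no_grad():
    ref.weight.copy_(conv.weight)
    ref.bias.copy_(conv.bias)

x = torch.randn(8, 2, 16, 16)
conv(x, 1, 2, 1).sum().backward()
ref(x).sum().backward()
# The gradients should match up to floating-point tolerance
print(torch.allclose(conv.weight.grad, ref.weight.grad, atol=1e-5))  # expected: True
print(torch.allclose(conv.bias.grad, ref.bias.grad, atol=1e-5))      # expected: True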