Hello! I am currently working on a custom module inside my PyTorch model. This module creates Gaussian kernels that are convolved across the model's input, and the parameters of each Gaussian kernel are to be learned. There are to be 256 x 256 Gaussian kernels generated. My __init__ function looks like this:
def __init__(self, channel_num = 512, size = 32, ksizeDN = 12):
    super().__init__()
    self.channel_num = channel_num
    self.size = size
    self.ksizeDN = ksizeDN
    self.thetaD = torch.nn.Parameter(uniform.Uniform(0, torch.pi).sample([self.channel_num, self.channel_num]),
                                     requires_grad=True)
    self.p = torch.nn.Parameter(uniform.Uniform(2, 6).sample([self.channel_num, self.channel_num]),
                                requires_grad=True)
    self.sig = torch.nn.Parameter(uniform.Uniform(2, 6).sample([self.channel_num, self.channel_num]),
                                  requires_grad=True)
    self.a = torch.nn.Parameter(
        torch.abs(torch.randn(self.channel_num, self.channel_num, requires_grad=True)))
    #self.nU = torch.nn.Parameter(torch.abs(torch.randn(1, self.channel_num, 1, 1, requires_grad=True)))
    self.gaussian_bank = torch.nn.Parameter(torch.zeros(self.channel_num, self.channel_num, self.ksizeDN * 2 + 1,
                                                        self.ksizeDN * 2 + 1), requires_grad=False)
    self.x = torch.linspace(-self.ksizeDN, self.ksizeDN, self.ksizeDN * 2 + 1)
    self.y = torch.linspace(-self.ksizeDN, self.ksizeDN, self.ksizeDN * 2 + 1)
    self.xv, self.yv = torch.meshgrid(self.x, self.y)
    for i in range(self.channel_num):
        for u in range(self.channel_num):
            self.gaussian_bank[i, u, :, :] = self.get_gaussian(i, u)
The first four parameters (thetaD, p, sig and a) are the variables in the Gaussian kernel equation (self.get_gaussian), which is called when constructing the ‘gaussian bank’ in the for loop. The gaussian bank is 256 x 256 x 25 x 25. For reference, the Gaussian kernel equation is defined as:
def get_gaussian(self, cc, oc):
    xrot = (self.xv * torch.cos(self.thetaD[cc, oc]) + self.yv * torch.sin(self.thetaD[cc, oc]))
    yrot = (-self.xv * torch.sin(self.thetaD[cc, oc]) + self.yv * torch.cos(self.thetaD[cc, oc]))
    g_kernel = torch.tensor((abs(self.a[cc, oc]) /
                             (2 * torch.pi * self.p[cc, oc] * self.sig[cc, oc])) * \
                            torch.exp(-0.5 * ((((xrot) ** 2) / self.p[cc, oc] ** 2) +
                                              (((yrot) ** 2) / self.sig[cc, oc] ** 2))))
    return g_kernel
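Written out from the code above, the kernel for one (cc, oc) pair reads as follows. Here (x, y) are the grid coordinates self.xv and self.yv, and the symbols \theta, p, \sigma and a are shorthand (not names used in the code) for thetaD[cc, oc], p[cc, oc], sig[cc, oc] and a[cc, oc]:

```latex
\begin{aligned}
x' &= x\cos\theta + y\sin\theta, \\
y' &= -x\sin\theta + y\cos\theta, \\
g(x, y) &= \frac{|a|}{2\pi\, p\, \sigma}
           \exp\!\left(-\frac{1}{2}\left(\frac{x'^2}{p^2} + \frac{y'^2}{\sigma^2}\right)\right).
\end{aligned}
```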
After training with this, I notice that even though the parameters are changing, the ‘gaussian bank’ parameter never changes. In fact, all of the parameters (thetaD, a, sig and p) seem to go to zero over multiple epochs, which does not seem right. How can I define the gaussian bank so that the model updates it whenever the parameters are adjusted? I am a bit new to PyTorch, so apologies for any confusion. I am happy to answer anything I missed, and I am very appreciative of any help!
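One possible direction, sketched under assumptions rather than offered as a definitive fix: the bank above is filled once in __init__, and torch.tensor(...) copies the values and detaches them from the autograd graph, so later parameter updates cannot reach it. The sketch below instead rebuilds the bank from the parameters on every forward pass. The class name GaussianBank, the build_bank helper, the small default sizes, the use of F.conv2d with the first bank axis as output channels, and the torch.rand-based initialisation (standing in for the uniform.Uniform sampling) are all hypothetical choices for illustration.

```python
import math
import torch
import torch.nn.functional as F


class GaussianBank(torch.nn.Module):  # hypothetical name
    def __init__(self, channel_num=4, ksizeDN=2):
        super().__init__()
        n = channel_num
        self.ksizeDN = ksizeDN
        # Learnable parameters, with the same ranges as in the post.
        self.thetaD = torch.nn.Parameter(torch.rand(n, n) * math.pi)
        self.p = torch.nn.Parameter(torch.rand(n, n) * 4 + 2)
        self.sig = torch.nn.Parameter(torch.rand(n, n) * 4 + 2)
        self.a = torch.nn.Parameter(torch.randn(n, n).abs())
        # Fixed coordinate grid, stored as buffers: moved by .to(device), never trained.
        coords = torch.linspace(-ksizeDN, ksizeDN, 2 * ksizeDN + 1)
        xv, yv = torch.meshgrid(coords, coords, indexing="ij")
        self.register_buffer("xv", xv)
        self.register_buffer("yv", yv)

    def build_bank(self):
        # Reshape parameters to (n, n, 1, 1) so they broadcast over the (k, k) grid.
        theta = self.thetaD[:, :, None, None]
        p = self.p[:, :, None, None]
        sig = self.sig[:, :, None, None]
        a = self.a[:, :, None, None]
        xrot = self.xv * torch.cos(theta) + self.yv * torch.sin(theta)
        yrot = -self.xv * torch.sin(theta) + self.yv * torch.cos(theta)
        # No torch.tensor(...) wrapper: the result stays attached to the graph.
        return (a.abs() / (2 * math.pi * p * sig)) * torch.exp(
            -0.5 * (xrot ** 2 / p ** 2 + yrot ** 2 / sig ** 2))

    def forward(self, x):
        # Recompute the bank from the current parameter values on each call.
        bank = self.build_bank()  # shape (n, n, 2k+1, 2k+1)
        return F.conv2d(x, bank, padding=self.ksizeDN)


torch.manual_seed(0)
module = GaussianBank()
x = torch.randn(1, 4, 8, 8)
out = module(x)
out.sum().backward()  # gradients now flow back to thetaD, p, sig and a
```

Because the bank is derived inside forward, it is not a Parameter at all in this sketch; only thetaD, p, sig and a are handed to the optimizer. Broadcasting also replaces the double Python loop, which may matter at 256 x 256 kernels.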