Hi everyone,
I’ve implemented a crude version of a locally connected layer, which can be thought of as a grid of Conv layers, each applied to a patch of the image the same size as the kernel. I implemented it with ModuleList as follows:
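import torch
import torch.nn as nn
import torch.nn.functional as F


class LocallyConnected2d(nn.Module):
    """
    Implementation of the LocallyConnected layer described in
    https://www.cs.toronto.edu/~ranzato/publications/taigman_cvpr14.pdf
    """

    # The constructor arguments were omitted from the snippet as posted;
    # output_size, kernel_size and stride are assumed to be (height, width) tuples.
    def __init__(self, input_channels, num_channels, output_size, kernel_size, stride, device=None):
        super().__init__()
        self.output_size = output_size
        self.kernel_size = kernel_size
        self.stride = stride
        # H_out x W_out grid of BatchNorm + Conv blocks, one per output position
        self.convs = nn.ModuleList([
            nn.ModuleList([
                nn.Sequential(
                    nn.BatchNorm2d(input_channels),
                    nn.Conv2d(in_channels=input_channels,
                              out_channels=num_channels,
                              kernel_size=kernel_size,
                              stride=(1, 1)),
                )
                for _ in range(output_size[1])
            ])
            for _ in range(output_size[0])
        ])
        # Moving the whole module once replaces the per-layer .to(device) calls
        if device is not None:
            self.to(device)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.leaky_relu(x)
        # Apply each block to its own kernel-sized patch of the input
        y = [[self.convs[i][j](x[:, :, (i * self.stride[0]):(i * self.stride[0] + self.kernel_size[0]),
                                       (j * self.stride[1]):(j * self.stride[1] + self.kernel_size[1])])
              for j in range(self.output_size[1])]
             for i in range(self.output_size[0])]
        # Concatenate the activations along width (dim=3), then along height (dim=2)
        y = torch.cat([torch.cat(y[i], dim=3) for i in range(self.output_size[0])], dim=2)
        return y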
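The layer relies on `self.output_size`, the number of patch positions along each spatial dimension. A minimal sketch of how it could be derived, assuming no padding and no dilation (the helper name `locally_connected_output_size` is hypothetical and not part of the code above):

```python
def locally_connected_output_size(input_size, kernel_size, stride):
    """Number of patch positions per spatial dimension, assuming no padding or dilation."""
    return tuple((n - k) // s + 1 for n, k, s in zip(input_size, kernel_size, stride))


# A 32x32 input with a 3x3 kernel and stride 1 gives a 30x30 grid of patches.
print(locally_connected_output_size((32, 32), (3, 3), (1, 1)))  # (30, 30)
```

With this formula the last patch ends at index `(out - 1) * stride + kernel`, which never exceeds the input size, so the slicing in `forward` stays in bounds.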
The problem is that the gradients of the Conv layers are always 0. Where is the error?
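One way to narrow this down is to check the gradient norms on a stripped-down version of the same idea. The sketch below is only a diagnostic under stated assumptions: it uses a hypothetical 2x2 grid of plain `nn.Conv2d` layers with stride 1, leaves out the BatchNorm and leaky_relu steps, and uses a made-up squared-mean loss. The helper `grad_norms` is also hypothetical.

```python
import torch
import torch.nn as nn


def grad_norms(module: nn.Module) -> dict:
    """Return the L2 norm of each parameter's gradient (None if no gradient reached it)."""
    return {name: (None if p.grad is None else p.grad.norm().item())
            for name, p in module.named_parameters()}


torch.manual_seed(0)

# Hypothetical toy setup: a 2x2 grid of untied 3x3 convolutions on a 4x4 input.
kernel, grid = 3, 2
convs = nn.ModuleList([
    nn.ModuleList([nn.Conv2d(1, 4, kernel) for _ in range(grid)])
    for _ in range(grid)
])

x = torch.randn(8, 1, 4, 4)
# Each convolution sees only its own kernel-sized patch (stride 1 here).
rows = [torch.cat([convs[i][j](x[:, :, i:i + kernel, j:j + kernel]) for j in range(grid)], dim=3)
        for i in range(grid)]
y = torch.cat(rows, dim=2)  # shape: (8, 4, 2, 2)

# Made-up loss, used only to push a gradient back through every convolution.
loss = y.pow(2).mean()
loss.backward()

norms = grad_norms(convs)
for name, value in norms.items():
    print(name, value)
```

If the norms printed for a toy case like this are non-zero but the same check on the full model still shows zeros, the cause is more likely outside the patch-and-concatenate logic, for example in the loss or in whatever consumes the layer's output.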