ModuleList gradients are None

Hi everyone,
I've implemented a crude version of locally connected layers, which can be trivially thought of as a matrix of Conv layers, each applied to a portion of the image the same size as its kernel. I've implemented it with ModuleList as follows:

import torch
import torch.nn as nn
import torch.nn.functional as F


class LocallyConnected2d(nn.Module):
    """
    Implementation of the LocallyConnected layer described in
    https://www.cs.toronto.edu/~ranzato/publications/taigman_cvpr14.pdf
    """

    def __init__(self, input_channels, num_channels, kernel_size, stride,
                 output_size, device):
        super().__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        self.output_size = output_size
        # Matrix of convolutional layers, W_out x H_out
        self.convs = nn.ModuleList([
            nn.ModuleList([
                nn.Sequential(
                    nn.BatchNorm2d(input_channels),
                    nn.Conv2d(in_channels=input_channels,
                              out_channels=num_channels,
                              kernel_size=kernel_size,
                              stride=(1, 1)),
                ).to(device)
                for _ in range(self.output_size[1])
            ])
            for _ in range(self.output_size[0])
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.leaky_relu(x)
        # Apply each convolutional layer to its own patch of the input
        y = [[self.convs[i][j](
                  x[:, :,
                    (i * self.stride[0]):(i * self.stride[0] + self.kernel_size[0]),
                    (j * self.stride[1]):(j * self.stride[1] + self.kernel_size[1])])
              for j in range(self.output_size[1])]
             for i in range(self.output_size[0])]
        # Concatenate the per-patch activations back into a single feature map
        y = torch.cat([torch.cat(row, dim=3) for row in y], dim=2)

        return y
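
For context, here is a minimal usage sketch. The constructor arguments and tensor shapes below are illustrative, not the exact values from my script:

# Illustrative setup: 3-channel 12x12 input, 5x5 kernels, stride 1,
# which yields an 8x8 grid of local convolutions.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
layer = LocallyConnected2d(input_channels=3, num_channels=16,
                           kernel_size=(5, 5), stride=(1, 1),
                           output_size=(8, 8), device=device)
x = torch.randn(4, 3, 12, 12, device=device)
out = layer(x)  # shape: (4, 16, 8, 8), one output pixel per local conv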

The problem is that the gradients of the Conv layers are always None. Where is the error?

Self-fix: I had missed the call to loss.backward(retain_graph=True).
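
For anyone who hits the same issue: the gradients stay None until backward() is actually called. A minimal sanity check, where the mean() loss is just a placeholder to get a scalar to backprop through:

out = layer(x)
loss = out.mean()  # placeholder loss, only to have a scalar to call backward on
loss.backward(retain_graph=True)
# convs[0][0] is a Sequential(BatchNorm2d, Conv2d); index 1 is the Conv2d
print(layer.convs[0][0][1].weight.grad.abs().sum())  # populated after backward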