Hello. I have defined the following ConvLSTM class; the code is borrowed from here (GitHub - sladewinter/ConvLSTM), and I've skipped some boilerplate initializations.
```python
class ConvLSTM(nn.Module):

    def __init__(self, in_channels, out_channels, kernel_size, padding,
                 lstm_activation, frame_size):
        super(ConvLSTM, self).__init__()
        self.out_channels = out_channels
        # We will unroll this over time steps
        self.convLSTMcell = ConvLSTMCell(in_channels, out_channels,
                                         kernel_size, padding,
                                         lstm_activation, frame_size)

    def forward(self, X):
        # Get the dimensions
        batch_size, _, seq_len, height, width = X.size()
        ...
        # Unroll over time steps
        for time_step in range(seq_len):
            H, C = self.convLSTMcell(X[:, :, time_step], H, C)
            output[:, :, time_step] = H

        return output, H, C
```
If I view the summary of this model using torchinfo, it shows trainable parameters only on the first ConvLSTMCell and '(recursive)' for the rest. However, if I print the total parameter count with
```python
print(sum(p.numel() for p in t.parameters() if p.requires_grad))
```
it reports a different number of trainable parameters.
How many trainable ConvLSTM cells does writing the layer in this form instantiate? Is it a single cell whose weights are reused (recursively) at every time step, or seq_len cells, each with its own trainable parameters?
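For anyone wanting to reproduce the question with a self-contained example, here is a minimal sketch of the same pattern. `ToyCell` and `ToyConvLSTM` are hypothetical stand-ins (a single `nn.Conv2d` instead of a full gated ConvLSTMCell), but parameter registration works the same way: the cell is constructed once in `__init__`, so its parameters appear once no matter how many time steps `forward` unrolls.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for ConvLSTMCell: a single conv layer reused
# each step. The real cell has gates, but parameter registration
# behaves identically.
class ToyCell(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

class ToyConvLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        # One cell instance; forward() unrolls it over time
        self.cell = ToyCell()

    def forward(self, X):  # X: (batch, channels, seq_len, H, W)
        outputs = []
        for t in range(X.size(2)):
            outputs.append(self.cell(X[:, :, t]))
        return torch.stack(outputs, dim=2)

model = ToyConvLSTM()
# Parameter count is independent of seq_len:
# conv weight 3*3*3*3 = 81, plus 3 biases -> 84
print(sum(p.numel() for p in model.parameters() if p.requires_grad))  # 84
# Only two registered tensors: cell.conv.weight and cell.conv.bias
print(len(list(model.named_parameters())))  # 2
```

Running this shows the count stays the same whether the input has 5 or 50 time steps, which is what I would expect if only one cell is instantiated.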