Hello. I have defined the following ConvLSTM class, with code borrowed from here (GitHub - sladewinter/ConvLSTM), and I’ve skipped some boilerplate initializations.
import torch
import torch.nn as nn

class ConvLSTM(nn.Module):

    def __init__(self, in_channels, out_channels,
                 kernel_size, padding, lstm_activation, frame_size):
        super(ConvLSTM, self).__init__()

        self.out_channels = out_channels

        # We will unroll this single cell over the time steps
        self.convLSTMcell = ConvLSTMCell(in_channels, out_channels, kernel_size,
                                         padding, lstm_activation, frame_size)

    def forward(self, X):
        # Get the dimensions; X is (batch, channels, time, height, width)
        batch_size, _, seq_len, height, width = X.size()

        ...  # boilerplate initializations of output, H, and C skipped

        # Unroll over time steps, feeding each frame to the same cell
        for time_step in range(seq_len):
            H, C = self.convLSTMcell(X[:, :, time_step], H, C)
            output[:, :, time_step] = H

        return output, H, C
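For completeness, the skipped initializations amount to zero tensors for the stacked outputs and the initial hidden/cell states; a minimal sketch (the exact code in the linked repo may differ, e.g. in how the device is chosen):

    # Buffer that collects the hidden state H at every time step
    output = torch.zeros(batch_size, self.out_channels, seq_len,
                         height, width, device=X.device)
    # Hidden state and cell state start at zero
    H = torch.zeros(batch_size, self.out_channels, height, width, device=X.device)
    C = torch.zeros(batch_size, self.out_channels, height, width, device=X.device)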
If I view the summary of this model using torchinfo, it shows trainable parameters only for the first ConvLSTMCell call and ‘(recursive)’ for the rest. However, if I print the total number of trainable parameters using
print(sum(p.numel() for p in t.parameters() if p.requires_grad))
(where t is the model instance), it reports a different number of trainable parameters.
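One way to check what actually exists is to list the parameters by name; if only a single set of convolution weights shows up, there is just one cell (a sketch, assuming t is the instantiated model as in the count above):

    for name, p in t.named_parameters():
        print(name, tuple(p.shape), p.requires_grad)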
How many trainable ConvLSTM cells are instantiated by writing the layer in this form? Is it one cell whose output is obtained by applying it recursively, or seq_len cells, each with its own trainable parameters?
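For contrast, the second interpretation would correspond to something like this hypothetical sketch, where a separate, independently trainable cell is created per time step and stored in an nn.ModuleList (illustrative only, not the code above):

    # Hypothetical alternative: one independent cell per time step
    self.cells = nn.ModuleList(
        ConvLSTMCell(in_channels, out_channels, kernel_size,
                     padding, lstm_activation, frame_size)
        for _ in range(seq_len)
    )
    # ...and in forward():
    # H, C = self.cells[time_step](X[:, :, time_step], H, C)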