I have created this model:

import torch.nn as nn

a = nn.Module()
a.l1 = nn.Conv1d(90000, 300, 2, padding=0)

and I count its trainable parameters with:

sum(p.numel() for p in a.parameters() if p.requires_grad)
This gives 54000300, but I think it should give 53999700. My reasoning: I am not padding the sequence, and because the filter length is 2, the model should ignore the last input position, since there is no padded input node after the last node. So the parameter count should be 89999 × 2 × 300 (plus 300 bias terms) = 53999700, but instead it is 90000 × 2 × 300 (plus 300) = 54000300. Padding makes no difference: even with padding=1 or padding=2 the answer stays 54000300. Also, the formula in the documentation for the output dimension does not seem to take the padding into account here; it always counts from 0 to Cin − 1.
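To spell out the arithmetic behind the number I observe (plain integers, no torch needed):

```python
# Conv1d parameter count as PyTorch reports it for my layer.
in_channels, out_channels, kernel_size = 90000, 300, 2

weight_params = out_channels * in_channels * kernel_size  # 54000000
bias_params = out_channels                                # 300
total = weight_params + bias_params

print(total)  # 54000300 -- matches the output I get, regardless of padding
```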
The count should take the filter size into account: if it is greater than 1, it should drop the corresponding number of trailing columns. For example, with a filter size of 2 it cannot take the last column for convolution, since the sequence is not padded. Similarly, with a filter size of 3 it should ignore the last two columns for the same reason.
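Spelling out the count I would expect under this reasoning, next to what PyTorch actually reports:

```python
# Under my reasoning: with no padding and filter length 2, the last input
# column cannot be convolved, so only 89999 positions should contribute.
expected_weights = 89999 * 2 * 300       # positions * filter_size * out_channels
expected_total = expected_weights + 300  # plus 300 bias terms

actual_total = 90000 * 2 * 300 + 300     # what PyTorch reports

print(expected_total)               # 53999700
print(actual_total)                 # 54000300
print(actual_total - expected_total)  # a difference of 600 parameters
```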