In Keras / TensorFlow, I believe you have to specify the size of the sequence, so it works well for fixed-length sequences. But the PyTorch implementation seems to handle arbitrarily sized inputs. Is that correct?
Since convolutions use a sliding window approach, you only need to specify the number of input channels. Defining the sequence length is not necessary, as the kernel can just be applied to whatever temporal shape is used.
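Here's a minimal sketch illustrating this with `nn.Conv1d` (the layer names and shapes below are just example values): the same layer is applied to two inputs with different sequence lengths, and only the channel dimension has to match.

```python
import torch
import torch.nn as nn

# Conv1d only needs in_channels, out_channels, and kernel_size;
# the sequence length is never specified at construction time.
conv = nn.Conv1d(in_channels=3, out_channels=8, kernel_size=5)

# Two inputs with the same channel dim but different temporal shapes.
x_short = torch.randn(1, 3, 50)   # [batch, channels, seq_len=50]
x_long = torch.randn(1, 3, 200)   # [batch, channels, seq_len=200]

# The same kernel slides over whatever length is passed in.
out_short = conv(x_short)  # [1, 8, 46]  since 50 - 5 + 1 = 46
out_long = conv(x_long)    # [1, 8, 196] since 200 - 5 + 1 = 196
print(out_short.shape, out_long.shape)
```

With stride 1 and no padding, the output length is simply `seq_len - kernel_size + 1`, so the temporal dimension can vary freely between forward passes.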
I’m not sure about TF and would be surprised if you had to set the sequence length beforehand.