Hi apytorch,
First, you should make sure C (the number of channels) is the same for all samples. You can normalize the samples either by resampling during preprocessing, or by adding an extra pooling/convolution layer with stride 2 specifically for the high-sample-rate input.
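To make the stride-2 idea concrete, here is a minimal sketch in plain Python, assuming the high-rate input is exactly twice the low rate and is stored as a nested `[C][T]` list. The helper name `avg_pool_stride2` is made up for illustration; in a real model you would more likely put something like `torch.nn.AvgPool1d(kernel_size=2, stride=2)` (or a strided `Conv1d`) in front of the shared layers.

```python
def avg_pool_stride2(channels):
    """Halve the sample rate of a [C][T] nested list by averaging adjacent pairs.

    Hypothetical helper: mimics average pooling with kernel 2 and stride 2.
    """
    pooled = []
    for ch in channels:
        # Drop a trailing odd sample, as a stride-2 pool without padding would.
        usable = len(ch) - len(ch) % 2
        pooled.append([(ch[i] + ch[i + 1]) / 2 for i in range(0, usable, 2)])
    return pooled


# Two channels, five time steps each (assumed to be the high-rate input).
high_rate = [[1.0, 3.0, 5.0, 7.0, 9.0], [0.0, 2.0, 4.0, 6.0, 8.0]]
low_rate = avg_pool_stride2(high_rate)
print(low_rate)  # [[2.0, 6.0], [1.0, 5.0]]
```

The channel count stays the same and only the time axis shrinks, so the output can go into the same layers as the low-rate samples.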
For padding, you can sort the samples by length and then shuffle only locally (within groups of similar length). That keeps the total amount of padding small, and you may get faster convergence as well.
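A rough sketch of the sort-then-local-shuffle idea, using only the standard library. The names `bucketed_batches` and `total_padding`, and the `bucket_size` parameter, are assumptions for illustration and not part of any PyTorch API; the resulting index lists could be handed to a `DataLoader` through its `batch_sampler` argument.

```python
import random


def bucketed_batches(lengths, batch_size, bucket_size, seed=0):
    """Return batches of sample indices grouped by similar length."""
    rng = random.Random(seed)
    # Sort sample indices by length so neighbours have similar lengths.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    batches = []
    for start in range(0, len(order), bucket_size):
        bucket = order[start:start + bucket_size]
        rng.shuffle(bucket)  # the "local random": shuffle inside one bucket only
        for b in range(0, len(bucket), batch_size):
            batches.append(bucket[b:b + batch_size])
    # Shuffle the batch order so training does not always run short to long.
    rng.shuffle(batches)
    return batches


def total_padding(lengths, batches):
    """Count padded positions if every batch is padded to its longest sample."""
    padding = 0
    for batch in batches:
        longest = max(lengths[j] for j in batch)
        padding += sum(longest - lengths[i] for i in batch)
    return padding


lengths = [10, 3, 8, 2, 9, 4, 7, 1]
naive = [[0, 1], [2, 3], [4, 5], [6, 7]]  # batches in dataset order
bucketed = bucketed_batches(lengths, batch_size=2, bucket_size=4)
print(total_padding(lengths, naive), total_padding(lengths, bucketed))
```

A larger `bucket_size` gives more randomness per epoch but more padding; setting it to a few times the batch size is one common compromise.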