Can someone please provide an example PyTorch transformer that consumes variable-length sequences?

I have audio sequences with 6 channels, and I want to predict one class for each input sequence.
The input has shape [batch_size, 6, Length], where “Length” varies from 20 to 200 between sequences.
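
To make the data layout concrete, here is a minimal sketch of how such a batch might be assembled, assuming each sequence starts as a separate [6, Length] tensor and shorter ones are zero-padded to the longest in the batch. The function name `collate_variable_length` and the random example data are hypothetical; the boolean mask uses True for padded positions, which is the convention `src_key_padding_mask` in `torch.nn.TransformerEncoder` expects.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def collate_variable_length(sequences):
    """Pad a list of [6, length] tensors into one batch plus a padding mask."""
    lengths = torch.tensor([seq.shape[1] for seq in sequences])
    # pad_sequence pads along the first dimension, so put time first: [length, 6]
    time_first = [seq.transpose(0, 1) for seq in sequences]
    padded = pad_sequence(time_first, batch_first=True)  # [batch, max_len, 6]
    # True marks padded time steps, False marks real data
    positions = torch.arange(padded.shape[1]).unsqueeze(0)  # [1, max_len]
    padding_mask = positions >= lengths.unsqueeze(1)  # [batch, max_len]
    # Return the batch in the [batch_size, 6, Length] layout described above
    return padded.transpose(1, 2), padding_mask


# Hypothetical example: three random 6-channel sequences of lengths 20, 75, 200
sequences = [torch.randn(6, n) for n in (20, 75, 200)]
batch, padding_mask = collate_variable_length(sequences)
```

Here `batch` has shape [3, 6, 200] and `padding_mask` has shape [3, 200]; a model would need the mask to ignore the padded time steps.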

I am new to transformers. Could someone please provide an example PyTorch definition of a transformer that consumes such data? Thanks in advance.