The height of each tensor varies because it corresponds to the number of frames in each sample.
I have decided to use 8 frames for each sample. I understand that I have to pad the shorter tensors and truncate the ones with a height above 8, but somehow doing only the padding worked, or so it seems. I would like to understand why my code worked.
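For reference, here is a minimal sketch of what "pad or truncate to 8 frames" could look like, assuming PyTorch tensors of shape (frames, features). The helper name `fix_frames`, the feature size of 3, and zero padding at the end are my own assumptions for illustration, not taken from the code in question.

```python
import torch
import torch.nn.functional as F

TARGET_FRAMES = 8  # number of frames to keep per sample


def fix_frames(x: torch.Tensor, target: int = TARGET_FRAMES) -> torch.Tensor:
    """Zero-pad or truncate along dim 0 so the result has `target` rows.

    Hypothetical helper: assumes x has shape (frames, features).
    """
    # Slicing keeps indices 0..target-1; it is a no-op if x is already shorter.
    x = x[:target]
    missing = target - x.size(0)
    if missing > 0:
        # F.pad lists the last dimension first: (left, right, top, bottom).
        # Only the bottom of dim 0 is padded here.
        x = F.pad(x, (0, 0, 0, missing))
    return x


short = torch.ones(5, 3)                                       # 5 frames: padded
long = torch.arange(30, dtype=torch.float32).reshape(10, 3)    # 10 frames: truncated

padded = fix_frames(short)
truncated = fix_frames(long)
print(padded.shape, truncated.shape)
```

Both results have a height of 8: the short tensor gets three rows of zeros appended, and the long one keeps only rows 0 to 7.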
I’m unsure whether our definitions of “truncating” a tensor are the same, but by “slicing” in this case I take it you mean indexing the dimension from 0 to 7 (inclusive), as seen in my comparison between the original tensor and the output: