What is the use of returning lengths?

I think it can be better understood with an example. Suppose we have a problem where we need to classify whether a video shows a teaching class or a person practicing a sport. A video is composed of a sequence of frames, each of which is an image, so one way to go is to pass the sequence of frames through an LSTM model. Now suppose every frame of our video has shape `torch.Size([C, H, W])`, where C is the number of RGB channels, H is the height, and W is the width of the image. We also have a set of videos, and every video might have a different length, and therefore a different number of total frames. For example, **video**_{1} has 5350 frames while **video**_{2} has 3323 frames, so you can model **video**_{1} and **video**_{2} with tensors of shape `torch.Size([5350, C, H, W])` and `torch.Size([3323, C, H, W])` respectively. As you can see, the two tensors have different sizes in the first dimension, which prevents us from *stacking* them into a single tensor. To make this possible, we can save a tensor `lengths = [5350, 3323]` and then pad all video tensors with zeros so they have **equal length**, *i.e.*, both have the size of the biggest length, which is 5350, resulting in two tensors of shape `torch.Size([5350, C, H, W])`. After that, we can *stack* both tensors to obtain a single tensor of shape `torch.Size([2, 5350, C, H, W])`, where 2 is the `batch_size` (you can stack them with this function). But, as you can see, stacking both tensors loses the information about the original sequences: for the tensor of **video**_{2}, every frame in `video2_tensor[3323:, ...]` has 0 as its values. To remedy this, we need the `lengths` vector to get the original sequence back, and not a bunch of zeros.
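As a minimal sketch of the steps above (using tiny dummy tensors in place of real video frames, and hypothetical lengths 5 and 3 instead of 5350 and 3323), including how `lengths` lets an LSTM skip the padding via `pack_padded_sequence`:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Two "videos" with different numbers of frames (tiny C, H, W for illustration)
C, H, W = 3, 4, 4
video1 = torch.randn(5, C, H, W)   # 5 frames
video2 = torch.randn(3, C, H, W)   # 3 frames
lengths = torch.tensor([5, 3])     # the lengths vector described above

# Pad the shorter video with zero frames so both have the longest length (5),
# then stack into one batch tensor of shape [2, 5, C, H, W]
pad = torch.zeros(5 - 3, C, H, W)
video2_padded = torch.cat([video2, pad], dim=0)
batch = torch.stack([video1, video2_padded], dim=0)
print(batch.shape)              # torch.Size([2, 5, 3, 4, 4])

# The padded positions of the second video are all zeros
print(batch[1, 3:].abs().sum())  # tensor(0.)

# `lengths` lets the RNN ignore the padding: flatten each frame into a
# feature vector and pack the sequences before feeding them to an LSTM
features = batch.flatten(2)      # [2, 5, C*H*W]
packed = pack_padded_sequence(features, lengths, batch_first=True)
lstm = torch.nn.LSTM(input_size=C * H * W, hidden_size=8, batch_first=True)
out, (h, c) = lstm(packed)
print(h.shape)                  # torch.Size([1, 2, 8])
```

Note that `pack_padded_sequence` expects the sequences sorted by length in descending order unless you pass `enforce_sorted=False`.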

Also, how would you use the built-in `torch.nn.utils.rnn.pad_sequence` in your example?
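For reference, a minimal sketch of what `pad_sequence` does (using small dummy tensors in place of real frames): it zero-pads every sequence up to the longest one and stacks them, so the manual padding and stacking collapse into one call, while you still keep the lengths yourself.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

C, H, W = 3, 4, 4
videos = [torch.randn(5, C, H, W), torch.randn(3, C, H, W)]
lengths = torch.tensor([v.shape[0] for v in videos])

# pad_sequence pads with zeros (padding_value=0 by default) up to the
# longest sequence and stacks: shape [batch_size, max_len, C, H, W]
batch = pad_sequence(videos, batch_first=True)
print(batch.shape)   # torch.Size([2, 5, 3, 4, 4])
print(lengths)       # tensor([5, 3])
```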

Yes! You could use it, and your code seems fine to me. But why the `mask = (batch != 0).to(device)` line?
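For context, a line like that presumably builds a boolean padding mask from the zero padding itself. A minimal sketch with a hypothetical 2D `batch` (no `device` involved):

```python
import torch

# Hypothetical padded batch: two sequences of lengths 2 and 3, padded to 4
batch = torch.tensor([[1.0, 2.0, 0.0, 0.0],
                      [3.0, 4.0, 5.0, 0.0]])

# True wherever the value is nonzero; padded positions become False.
# Caveat: a genuine 0.0 inside a real sequence would also be masked out,
# which is one reason carrying an explicit `lengths` vector is safer.
mask = (batch != 0)
print(mask)
# tensor([[ True,  True, False, False],
#         [ True,  True,  True, False]])
```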