I'm a little confused about the format of the video tensor expected by the function add_video.
The documentation states:
vid_tensor: (N, T, C, H, W). The values should lie in [0, 255] for type uint8 or [0, 1] for type float
I would imagine that N = number of frames, C = channels, H = height, W = width.
What is T? Is it the frame rate?
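
For reference, here is a minimal sketch of one way to read the five axes. It assumes N is the batch size (number of clips) and T is the number of frames per clip, with the frame rate passed separately through the fps argument of add_video. The helper describe_video_shape is hypothetical, written only for illustration, and is not part of any library.

```python
# Axis order taken from the documented shape (N, T, C, H, W).
# Assumption: N = clips in the batch, T = frames (time steps) per clip.
AXIS_NAMES = ("N", "T", "C", "H", "W")


def describe_video_shape(shape):
    """Hypothetical helper: map a 5-D video shape onto its named axes."""
    if len(shape) != len(AXIS_NAMES):
        raise ValueError(f"expected 5 dimensions (N, T, C, H, W), got {len(shape)}")
    return dict(zip(AXIS_NAMES, shape))


# One clip of 16 RGB frames, each 64 pixels high and 48 pixels wide.
layout = describe_video_shape((1, 16, 3, 64, 48))
print(layout)
```

Under this reading, a single video of 16 frames would use N=1 and T=16 rather than N=16.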