Hello,

## From the documentation on 3D convolution,

How should I understand the D in (N, C, D, H, W)?

Let's say, for example, that I have five video frames and I stack them along the channel dimension, giving me:

a (1, 15, H, W) tensor, assuming RGB frames. How do I reshape this tensor to (N, C, D, H, W)?
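
For concreteness, here is a minimal sketch of one possible reshape. It treats D as the frame (time) axis and assumes the frames were concatenated one after another, so the 15 channels are ordered as frame 0 RGB, frame 1 RGB, and so on. The values of H and W and the `Conv3d` settings are placeholders chosen only for illustration. If the channels were interleaved differently, the `view` arguments would need to change.

```python
import torch

# Placeholder sizes for illustration; H and W are not given in the question.
N, T, C, H, W = 1, 5, 3, 4, 6

# Five RGB frames, each (N, C, H, W), concatenated along the channel dim.
# Assumed channel order: [f0_R, f0_G, f0_B, f1_R, f1_G, f1_B, ...].
frames = [torch.randn(N, C, H, W) for _ in range(T)]
stacked = torch.cat(frames, dim=1)  # shape (1, 15, H, W)

# Split the 15 channels into (T, C), then move C in front of T so that
# the result is (N, C, D, H, W) with D = T (the frame index).
video = stacked.view(N, T, C, H, W).permute(0, 2, 1, 3, 4).contiguous()

# Slicing along D recovers the original frames.
assert video.shape == (N, C, T, H, W)
assert torch.equal(video[:, :, 2], frames[2])

# The tensor can now be fed to a 3D convolution (arbitrary example settings).
conv = torch.nn.Conv3d(in_channels=C, out_channels=8, kernel_size=3, padding=1)
out = conv(video)  # shape (1, 8, 5, H, W)
```

Note that calling `stacked.view(N, C, T, H, W)` directly would also give a tensor of the target shape, but under the channel order assumed above it would mix colour channels and frames, which is why the sketch goes through `(N, T, C, H, W)` first.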