Conv2d with 5D tensor


I have a list of frames (X_train) stored as a tensor of shape (5,150,200), that is, 5 frames of size 150x200.

I want to use X_train in a conv2d layer, which I believe takes a 3D (unbatched) or 4D (batched) tensor. My tensor is 5D (batch,channel,dim,height,width) because it holds multiple frames. I tried reshaping it to (batch,channel,height,(dim*width)), but then I get the error: ValueError: Expected input batch_size (1) to match target batch_size (5).

How could I solve this? I am very new to PyTorch, and despite reading the documentation I have a hard time figuring it out.

Any help is appreciated!

Hi Vilma!

Speculating as to the details of your use case, I assume that you want to
apply your Conv2d to each of the five frames independently, that is, you
don’t want the five frames to mix together. (Let me further assume that
you do want the channels to mix together as they normally would when a
Conv2d is applied to a [batch, channel, height, width] 4d tensor.)

If this is the case, you would use transpose() to bring the dim (frames)
and batch dimensions together, and then use reshape() to merge those
two dimensions into a single “batch” dimension:

input4d = input5d.transpose (1, 2).reshape (batch * dim, channel, height, width)
result4d = conv2d (input4d)

(You could then, if appropriate, convert result4d back into your 5d format
by calling reshape() and transpose() to undo the original transpose()
and reshape().)
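
Here is a minimal sketch of the full round trip, in case it helps. The sizes (batch of 2, 3 channels, 8 output channels) and the Conv2d settings are made up for illustration; only the 5 frames of 150x200 come from your post. The last lines check that a frame run through Conv2d on its own gives the same result as the merged-batch version, which is what "the frames don't mix" means here.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumed, not from the original post, except 5 frames of 150x200)
batch, channel, dim, height, width = 2, 3, 5, 150, 200
input5d = torch.randn(batch, channel, dim, height, width)

# Hypothetical layer; padding=1 with kernel_size=3 keeps height and width unchanged
conv2d = nn.Conv2d(in_channels=channel, out_channels=8, kernel_size=3, padding=1)

# Move dim (frames) next to batch, then merge the two into one "batch" dimension
input4d = input5d.transpose(1, 2).reshape(batch * dim, channel, height, width)
result4d = conv2d(input4d)

# Undo the reshape() and then the transpose() to get back to the 5d layout
out_channel, out_h, out_w = result4d.shape[1:]
result5d = result4d.reshape(batch, dim, out_channel, out_h, out_w).transpose(1, 2)

# Sanity check: convolving one frame alone matches the merged-batch result
frame = input5d[0, :, 0].unsqueeze(0)  # shape [1, channel, height, width]
frames_independent = torch.allclose(conv2d(frame)[0], result5d[0, :, 0], atol=1e-4)

print(result5d.shape, frames_independent)
```

Note that result5d has out_channel in place of channel, and its height and width depend on the kernel size, stride, and padding you choose, so read them off result4d.shape rather than reusing the input sizes.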


K. Frank