Hello,
I have sequences of 2D images, where each image is a single frame from a video.
Given the length of each sequence, the shape of the input is [batch_size, seq_length, c, h, w], so I want to process them with nn.Conv3d().
However, I ran into an unexpected problem: I had to transpose the input data into the shape [batch_size, c, seq_length, h, w], and in the end the performance is poor (the task is segmentation of sequences of medical images).
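To make the transpose concrete, here is a minimal sketch of the reshaping step. The sizes, the single input channel, and out_channels=16 are made-up values for illustration only, not my real settings:

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
batch_size, seq_length, c, h, w = 2, 8, 1, 64, 64

# Input as loaded: [batch_size, seq_length, c, h, w]
x = torch.randn(batch_size, seq_length, c, h, w)

# nn.Conv3d expects [N, C, D, H, W], so the channel axis has to come
# before the sequence axis: [batch_size, c, seq_length, h, w].
x = x.permute(0, 2, 1, 3, 4).contiguous()

# kernel_size=3 with padding=1 keeps seq_length, h and w unchanged.
conv = nn.Conv3d(in_channels=c, out_channels=16, kernel_size=3, padding=1)
out = conv(x)
print(out.shape)  # torch.Size([2, 16, 8, 64, 64])
```

Here seq_length plays the role of the depth dimension, so the 3D kernel mixes information across neighbouring frames as well as across pixels.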
So, can I treat the sequences of images as a 3D task and process them with nn.Conv3d()?
Thanks in advance!