I have a ResNet 3D that takes an input of shape (batch, number_video_frames, height=112, width=112),
and my dataset returns (BATCH x FRAMES x CHANNELS x HEIGHT x WIDTH).
How do I reshape the dataset outputs to make them work with ResNet 3D?
A ResNet 3D network takes a 5-dimensional input, including the batch size ("3D" refers to the spatiotemporal 3D convolutions it applies over frames and space). Are you sure you are using ResNet3D?
Yeah, so it means mini-batches of 3-channel RGB clips. The batch size is the 1st dimension, the 3 channels are the 2nd dimension, T (the number of frames) is the 3rd dimension, and the spatial size of the frames makes up the last two dimensions.
For example, (16, 3, 16, 112, 112) → here the first 16 is the batch size. This should be the input to your 3D model.
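Since the dataset described above yields (BATCH x FRAMES x CHANNELS x HEIGHT x WIDTH), the frame and channel axes need to be swapped to reach this layout. A minimal sketch, assuming a batch of 32 clips of 15 RGB frames (the concrete sizes are illustrative):

```python
import torch

# Hypothetical dataset batch in (B, F, C, H, W) order, as in the question
batch = torch.randn(32, 15, 3, 112, 112)

# Swap the frame and channel axes to get (B, C, T, H, W),
# the layout a 3D ResNet expects
clips = batch.permute(0, 2, 1, 3, 4).contiguous()
print(clips.shape)  # torch.Size([32, 3, 15, 112, 112])
```

`permute` only reorders the view, so `.contiguous()` is added to materialize the new memory layout before feeding the tensor to the model.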
Thanks for the help, but I have torch.Size([3, 15, 112, 112]). How do I add the batch size so it becomes torch.Size([32, 3, 15, 112, 112])?
torch.stack — PyTorch 1.8.1 documentation should do the trick – just take your 32 tensors and stack them into the desired batch.
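A short sketch of the stacking step, assuming 32 clips of the shape given in the question:

```python
import torch

# 32 individual clips, each (C=3, T=15, H=112, W=112) as in the question;
# the random data here is just a stand-in
clips = [torch.randn(3, 15, 112, 112) for _ in range(32)]

# stack along a new leading dimension to form the batch (32, 3, 15, 112, 112)
batch = torch.stack(clips, dim=0)
print(batch.shape)  # torch.Size([32, 3, 15, 112, 112])

# for a single clip, unsqueeze adds the batch dimension instead
one = clips[0].unsqueeze(0)  # (1, 3, 15, 112, 112)
```

If the clips come from a `Dataset`, a `DataLoader` with `batch_size=32` does this stacking for you in its default collate function.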