Batch learning 4D Data

I have 4D data of shape (N, 3, 64, 64) – basically N 64x64 RGB images, where N is the number of rotations of one image – and I would like to use DataLoader. The tensor shape I want per batch is (batch_size, N, 3, 64, 64). However, it appears I can only feed DataLoader a tensor of shape (batch_size, A, B, C). So far I could only get DataLoader to work by setting batch_size = 1 and letting N act as the batch dimension. This approach works, but training takes a very long time. It would be nice to process several (N x 3 x 64 x 64) sets at once using DataLoader. Is there a way to do this? If so, could you please provide an example of how to implement it?

I do not think DataLoader has such a restriction. Maybe your network cannot use 5-D data because it expects 4-D?
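To illustrate the point above, here is a minimal sketch of DataLoader batching 5-D data. The sizes (num_sets, N, batch_size) and the random tensors are placeholders, and wrapping the data in a TensorDataset is an assumption about how the data is stored, not something stated in the question.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical sizes: 10 image sets, each holding N = 4 rotations.
num_sets, N = 10, 4

# One dataset item is a whole set of rotations: (N, 3, 64, 64).
data = torch.randn(num_sets, N, 3, 64, 64)
labels = torch.randint(0, N, (num_sets,))  # placeholder targets

loader = DataLoader(TensorDataset(data, labels), batch_size=5, shuffle=True)

# DataLoader stacks items along a new leading batch dimension.
batch, target = next(iter(loader))
print(batch.shape)   # torch.Size([5, 4, 3, 64, 64])
print(target.shape)  # torch.Size([5])
```

Shuffling here reorders whole sets, since each dataset item is one set of N rotations.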

Personally, when dealing with sets of different rotations, I have been able to “flatten” (using reshape) those into individual images in a batch so my final mini-batch would have the shape (batch_size x N, 3, 64, 64). Does this help?

When you flatten using reshape, is it correct to assume that the N samples are grouped together during training? That is, samples from one set of N don’t get mixed with samples from another set of N. How does one ascertain this? I suppose I may need to set shuffle=False during training to keep the N samples individually grouped after flattening, but that seems odd to do during training as shuffle is usually True during training and False during test.

This reshaping is done on the minibatch (5-D) before you input it to the network. shuffle can still be True.
I’m not 100% sure this is what you were asking, but because of how reshape works, the first N elements of the batch_size x N flattened batch are the different rotations of the first image, the next N elements belong to the second image, and so on.
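The ordering described above can be checked directly. This is a small sketch with made-up sizes and random data standing in for a real mini-batch:

```python
import torch

batch_size, N = 2, 3  # hypothetical sizes
batch = torch.randn(batch_size, N, 3, 64, 64)  # 5-D mini-batch from the loader

# Merge the first two dimensions just before calling the network.
flat = batch.reshape(batch_size * N, 3, 64, 64)

# flat[i * N + j] should be rotation j of set i.
same_order = all(
    torch.equal(flat[i * N + j], batch[i, j])
    for i in range(batch_size)
    for j in range(N)
)
print(flat.shape)  # torch.Size([6, 3, 64, 64])
print(same_order)  # True
```

Since the reshape happens after the loader has already built the batch, shuffle=True only changes which sets end up in a batch, not the grouping of rotations within a set.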

It is good to know that the order is kept.

Originally, the network output was an N x 1 tensor, and I would take a softmax over it. Now, with an input of shape (batch_size x N, 3, 64, 64), the output has size (batch_size x N) x 1, and I must take a softmax over each group of N elements (batch_size times). I can only think of doing this with a loop. Is there a better way, since looping like this is slow? Even with the loop, training is faster than before, because batch_size x N samples are loaded at once instead of just N. But with typical batch training and DataLoader, no looping is required at all.

You can reshape the output back to (batch_size, N) and use a softmax over dim=1.
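A short sketch of that suggestion, compared against the loop version. The sizes are hypothetical and the random scores tensor only stands in for the real network output:

```python
import torch

batch_size, N = 5, 4  # hypothetical sizes
scores = torch.randn(batch_size * N, 1)  # stand-in for the network output

# One softmax per set of N rotations, with no Python loop.
probs = torch.softmax(scores.reshape(batch_size, N), dim=1)

# Loop version, kept only to confirm both give the same result.
looped = torch.stack([
    torch.softmax(scores[i * N:(i + 1) * N, 0], dim=0)
    for i in range(batch_size)
])
matches = torch.allclose(probs, looped)

print(probs.shape)       # torch.Size([5, 4])
print(probs.sum(dim=1))  # each row sums to 1 (up to rounding)
```

If the loss is cross-entropy, the reshaped (batch_size, N) scores could probably be passed to torch.nn.functional.cross_entropy directly, which applies log-softmax internally; whether that fits depends on the training setup, which the thread does not describe.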

Thanks! That works great.