Batch learning with 4D data

I have 4D data of shape (N, 3, 64, 64) – basically N 64x64 RGB images, where N is the number of rotations of an image – and I would like to use DataLoader. The tensor shape I want per minibatch is (batch_size, N, 3, 64, 64). However, it appears I can only get DataLoader to produce tensors of shape (batch_size, A, B, C), so the only way I could make it work was to set batch_size = 1 and treat N itself as the batch dimension. This approach does work, but training takes a very long time. It would be nice to run multiple (N x 3 x 64 x 64) sets per batch through DataLoader. Is there a way to do this? If so, could you please provide an example of how to implement it?

I do not think there is such a restriction in torch.utils.data.DataLoader. Maybe your network cannot use 5-D data when it expects 4-D?
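For what it's worth, the default collate function just stacks whatever your Dataset returns, so a dataset whose items have shape (N, 3, 64, 64) already gives 5-D minibatches. A minimal sketch with made-up sizes and random placeholder data:

```python
import torch
from torch.utils.data import Dataset, DataLoader

# Hypothetical dataset: each item is the stack of N rotations of one RGB image.
class RotationSetDataset(Dataset):
    def __init__(self, num_items=100, n_rotations=8):
        self.data = torch.randn(num_items, n_rotations, 3, 64, 64)  # placeholder data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]  # shape (N, 3, 64, 64)

loader = DataLoader(RotationSetDataset(), batch_size=4, shuffle=True)
batch = next(iter(loader))
print(batch.shape)  # torch.Size([4, 8, 3, 64, 64]) -> (batch_size, N, 3, 64, 64)
```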

Personally, when dealing with sets of different rotations, I have been able to “flatten” (using reshape) those into individual images in a batch so my final mini-batch would have the shape (batch_size x N, 3, 64, 64). Does this help?
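Roughly like this, assuming the minibatch coming out of the DataLoader already has shape (batch_size, N, 3, 64, 64):

```python
import torch

batch = torch.randn(4, 8, 3, 64, 64)      # (batch_size, N, 3, 64, 64) from the DataLoader
flat = batch.reshape(-1, 3, 64, 64)       # (batch_size * N, 3, 64, 64)
print(flat.shape)                         # torch.Size([32, 3, 64, 64])
# feed `flat` to the usual 4-D image network
```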

When you flatten using reshape, is it correct to assume that the N samples stay grouped together during training? That is, samples from one set of N don't get mixed with samples from another set of N. How does one ascertain this? I suppose I may need to set shuffle=False during training to keep the N samples grouped after flattening, but that seems odd, since shuffle is usually True during training and False during testing.

This reshaping is done on the (5-D) minibatch before you feed it to the network, so shuffle can still be True.
I'm not 100% sure this is what you were asking, but just by how reshape works, the first N elements of the batch_size x N would be the different rotations of the first image.
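A small sketch (with toy values) of why the grouping survives the reshape:

```python
import torch

batch_size, N = 2, 3
# Tag every "image" with a unique id so the ordering is visible after the reshape.
batch = torch.arange(batch_size * N).reshape(batch_size, N, 1, 1, 1).expand(-1, -1, 3, 64, 64)
flat = batch.reshape(-1, 3, 64, 64)           # (batch_size * N, 3, 64, 64)
print(flat[:N, 0, 0, 0])   # tensor([0, 1, 2]) -> the N rotations of the first set
print(flat[N:, 0, 0, 0])   # tensor([3, 4, 5]) -> the N rotations of the second set
```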

It is good to know that the order is kept.

Originally, I would get an N x 1 tensor as the output of the network and take a softmax over it. Now, with an input of (batch_size x N, 3, 64, 64), the output is a tensor of size (batch_size*N) x 1, and I must take a softmax over every group of N elements (batch_size times). I can only think of accomplishing this with a loop, which is slow. Is there a better way? Even with the loop it is faster than before, because more data is loaded at once (batch_size x N samples instead of just N), but with typical batch training no looping is required at all with DataLoader.

You can reshape the output back to (batch_size, N) and take the softmax over dim=1.
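For example, assuming the network emits one score per flattened image:

```python
import torch
import torch.nn.functional as F

batch_size, N = 4, 8
# Hypothetical network output: one score per image in the flattened batch.
scores = torch.randn(batch_size * N, 1)
# Group the scores back by set and softmax across the N rotations of each set.
probs = F.softmax(scores.reshape(batch_size, N), dim=1)
print(probs.shape)        # torch.Size([4, 8])
print(probs.sum(dim=1))   # each row sums to 1
```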

Thanks! That works great.