For each forward pass, my data consists of 6 images and some other values, so right now I am using the batch dimension as 6 to feed the data into the model.
However, since I use DataParallel, my batch of 6 images is split into 2 batches of 3 images (my GPU only has 10GB of RAM), which is not what my model expects.
Is there any way I can keep 6 as my batch dimension?
nn.DataParallel splits the input batch across the available GPUs. Assuming you are using 2 GPUs, you can increase the batch size to 12 so that each GPU processes 12/2 = 6 samples.
It depends on where these calculations are done. If they happen outside the model, the "global" batch size of 12 will be used; inside the model's forward method, the "local" batch size of 6 will be used.
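As a small illustration of this split: nn.DataParallel's scatter step chunks the input along dim 0, so a "global" batch of 12 becomes two "local" batches of 6 inside each replica's forward call. The sketch below mimics that chunking on the CPU with torch.chunk (the tensor shape and GPU count are just example values):

```python
import torch

# Hypothetical "global" batch of 12 RGB images, as suggested in the answer above.
global_batch = torch.randn(12, 3, 224, 224)
num_gpus = 2  # assumed device count

# nn.DataParallel scatters the batch by chunking along dim 0,
# so each replica's forward sees a "local" batch of 12 / 2 = 6.
local_batches = torch.chunk(global_batch, num_gpus, dim=0)
print([b.size(0) for b in local_batches])  # [6, 6]
```

So any per-batch logic written inside forward will see 6 samples, while code outside the model still sees 12.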
We generally recommend using DistributedDataParallel instead, as it avoids the memory imbalance on the default device associated with DataParallel and does not create copies of the model in each iteration.
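With DistributedDataParallel each process keeps its own full batch instead of having it scattered, so a batch of 6 stays a batch of 6 inside forward. A minimal single-process CPU sketch using the gloo backend (the toy Linear model, port, and sizes are placeholders; a real multi-GPU run would launch one process per GPU, e.g. via torchrun, typically with the nccl backend):

```python
import os
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process setup for illustration only; normally rank/world_size
# come from the launcher (torchrun sets the env vars for you).
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

model = DDP(nn.Linear(16, 4))  # hypothetical toy model
x = torch.randn(6, 16)         # each process feeds its own full batch of 6
out = model(x)
print(out.shape)               # the batch dimension stays at 6

dist.destroy_process_group()
```

Unlike DataParallel, DDP never splits this tensor: gradient synchronization across processes happens during backward instead.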