For example, say I have a dataset with 10 rows and I want to train an MLP/RNN/CNN on it using mini-batches.
Let's say I take 2 rows at a time to train. Since 2 x 5 = 10, I train my model on 5 batches, each containing 2 rows. So the number of batches is 5 and the number of rows per batch is 2.
Is my batch_size 2, or is it 5?
In the dataloader, should my batch_size = 2 or batch_size = 5?
In your example, batch_size=2. Once you give the batch_size, DataLoader will take care of splitting the dataset into batches (5 in your case) and provide them to you.
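To make that concrete, here is a minimal sketch, assuming PyTorch is installed. The toy tensors and the names `features`, `targets`, and `loader` are made up for illustration; only the 10 rows and batch_size=2 come from the question.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy data: 10 rows with 3 features each, plus one target per row.
features = torch.arange(30, dtype=torch.float32).reshape(10, 3)
targets = torch.arange(10, dtype=torch.float32)

dataset = TensorDataset(features, targets)

# batch_size is the number of rows per batch, not the number of batches.
loader = DataLoader(dataset, batch_size=2, shuffle=False)

# len(loader) is the number of batches the DataLoader will yield.
num_batches = len(loader)

# Collect the shape of the feature tensor in every batch.
batch_shapes = [tuple(x.shape) for x, y in loader]

print(num_batches)   # number of batches
print(batch_shapes)  # one (rows, features) shape per batch
```

With 10 rows and batch_size=2, the loader yields 5 batches, each holding 2 rows. If the dataset size were not divisible by batch_size, the last batch would be smaller unless drop_last=True is passed.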
Thank you for clarifying!
Does the same definition of batch_size hold for an RNN as well?
Yes, the same definition of batch_size applies to RNNs as well.
The extra time-step dimension can make things a bit tricky, though. An RNN takes a 3-D input of shape batch x time x dim when batch_first=True, or time x batch x dim when batch_first=False (the default in PyTorch), assuming all the data instances in the batch are padded to the same number of time steps.
So take care that the batch_first=True/False option of the RNN matches the layout of your data.
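A rough sketch of how the shapes work out, assuming PyTorch. The sequence lengths, feature size, and hidden size below are arbitrary values chosen for illustration, not anything from the thread.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pad_sequence

# Hypothetical batch of 2 sequences with different lengths, 4 features per step.
seqs = [torch.randn(5, 4), torch.randn(3, 4)]

# Pad to the longest sequence so the batch is one tensor: (batch=2, time=5, dim=4).
padded = pad_sequence(seqs, batch_first=True)

# With batch_first=True the RNN reads input as (batch, time, dim).
rnn_bf = nn.RNN(input_size=4, hidden_size=8, batch_first=True)
out_bf, h_bf = rnn_bf(padded)

# With the default batch_first=False the RNN reads input as (time, batch, dim),
# so the same data has to be transposed first.
rnn_tf = nn.RNN(input_size=4, hidden_size=8)
out_tf, h_tf = rnn_tf(padded.transpose(0, 1))

print(padded.shape)  # batch x time x dim
print(out_bf.shape)  # batch x time x hidden
print(out_tf.shape)  # time x batch x hidden
print(h_bf.shape)    # layers x batch x hidden, regardless of batch_first
```

In both cases batch_size is still 2; only the position of the batch axis changes. Note that the padded time steps are processed like real data here; torch.nn.utils.rnn.pack_padded_sequence is one way to have the RNN skip them.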