For example, say I have a dataset with 10 rows and I want to train an MLP/RNN/CNN on it using mini-batches.
Let's say I take 2 rows at a time to train, so 2 x 5 = 10. I train my model with batches where each batch contains 2 rows, so the number of batches is 5 and the number of rows per batch is 2.
Is my batch_size 2, or is it 5?
In the dataloader, should I set batch_size = 2 or batch_size = 5?
In your example, batch_size is 2: it is the number of rows per batch, not the number of batches.
Once you give the DataLoader a batch_size, it will take care of splitting the dataset into batches (5 in your case) and provide them to you.
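To make the 10-row example concrete, here is a minimal sketch assuming PyTorch's DataLoader with a TensorDataset. The toy tensors (10 rows, 3 features each) and the variable names are made up for illustration and are not from the original question.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy dataset: 10 rows with 3 features each, plus one label per row.
features = torch.arange(30, dtype=torch.float32).reshape(10, 3)
labels = torch.arange(10)
dataset = TensorDataset(features, labels)

# batch_size is the number of rows per batch, not the number of batches.
loader = DataLoader(dataset, batch_size=2, shuffle=False)

# The number of batches follows from the dataset size: 10 rows / 2 rows per batch.
num_batches = len(loader)

# Collect the shape of the feature tensor in every batch.
batch_shapes = [tuple(x.shape) for x, _ in loader]

print(num_batches)      # 5
print(batch_shapes[0])  # (2, 3)
```

If the dataset size is not a multiple of batch_size, the last batch is smaller unless drop_last=True is passed to the DataLoader.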
Thank you for clarifying!
Does the same definition of batch_size hold in the case of an RNN as well?
Yes. The same definition of batch_size applies to the RNN as well.
But the addition of time steps might make things a bit tricky. With batch_first=True, RNNs take input of shape batch x time x dim, assuming all the data instances in the batch are padded to have the same number of time steps.
Also, take care of the batch_first=True/False option in RNNs: the default in PyTorch is batch_first=False, which expects time x batch x dim instead.
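A small sketch of how the batch_first option changes the expected input layout, assuming PyTorch's nn.RNN. The sizes (2 sequences per batch, 4 time steps, 3 input features, 8 hidden units) are arbitrary illustration values.

```python
import torch
import torch.nn as nn

# Illustrative sizes only: 2 sequences per batch, 4 time steps, 3 features, 8 hidden units.
batch, time, dim, hidden = 2, 4, 3, 8

# A padded batch laid out as batch x time x dim.
x = torch.randn(batch, time, dim)

# batch_first=True: the RNN reads the input as batch x time x dim.
rnn_bf = nn.RNN(input_size=dim, hidden_size=hidden, batch_first=True)
out_bf, h_bf = rnn_bf(x)

# Default (batch_first=False): the RNN expects time x batch x dim,
# so the first two dimensions are swapped before the call.
rnn_tf = nn.RNN(input_size=dim, hidden_size=hidden)
out_tf, h_tf = rnn_tf(x.transpose(0, 1))

print(out_bf.shape)  # torch.Size([2, 4, 8])
print(out_tf.shape)  # torch.Size([4, 2, 8])
print(h_bf.shape)    # torch.Size([1, 2, 8])
```

In both cases batch_size is still 2. Only the position of the batch dimension in the input and output tensors changes, while the final hidden state keeps the layout num_layers x batch x hidden.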