When training a model, there are different ways to feed the dataset to the training loop (a runnable sketch of both follows the list):
1. The whole dataset is split into consecutive mini-batches:
       num_batches = total_samples // batch_size
       for epoch in range(num_epochs):    # here one epoch covers the entire dataset
           for i in range(num_batches):
               batch = dataset[i * batch_size : (i + 1) * batch_size]
               train_step(batch)          # training step
2. Sample a batch at random on every step:
       for step in range(num_steps):      # here one "epoch" is just one randomly sampled batch
           batch = random_sample(dataset, batch_size)
           train_step(batch)              # training step
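Below is a minimal runnable sketch of both loops in NumPy-flavored Python. Note that `train_step`, `num_epochs`, and the toy `dataset` are placeholder assumptions for illustration, not part of any specific framework:

    import numpy as np

    rng = np.random.default_rng(0)
    dataset = np.arange(1000)                  # stand-in for real (x, y) samples
    batch_size = 32
    num_epochs = 3
    num_batches = len(dataset) // batch_size

    def train_step(batch):
        pass                                   # placeholder for the actual parameter update

    # Way 1: consecutive slices, every sample is seen exactly once per epoch
    for epoch in range(num_epochs):
        for i in range(num_batches):
            batch = dataset[i * batch_size : (i + 1) * batch_size]
            train_step(batch)

    # Way 2: an independent random draw at every step
    for step in range(num_epochs * num_batches):
        batch = rng.choice(dataset, size=batch_size, replace=False)
        train_step(batch)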
What is the difference between these two approaches?
Assume the training is supervised, so each sample in the dataset is a pair of features x and a label y.
Publicly available datasets such as MNIST and CIFAR-10/100 often store samples grouped by label.
So, if you build mini-batches the first way described above, some mini-batches will contain samples from only one class and will therefore be heavily biased.
On the other hand, sampling mini-batches at random keeps each mini-batch diverse.
In general, the latter makes the trained model generalize better.
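To make the bias concrete, here is a small illustration (my own toy example, not taken from any actual dataset loader) with a label-sorted array:

    import numpy as np

    rng = np.random.default_rng(0)
    labels = np.repeat(np.arange(10), 100)     # 1000 samples sorted by label, 10 classes
    batch_size = 32

    # Way 1: a consecutive slice of a label-sorted dataset holds a single class
    print(np.unique(labels[:batch_size]))      # -> [0]

    # Way 2: a random sample mixes many classes
    idx = rng.choice(len(labels), size=batch_size, replace=False)
    print(np.unique(labels[idx]))              # -> several different classes

In practice, a common compromise is to shuffle the whole dataset once at the start of each epoch and then slice it sequentially, which keeps the first approach's guarantee of visiting every sample while recovering the diversity of the second.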