What's the difference between the different ways to use the training data?

When training a model, there are different ways to draw batches from the dataset:

  1. The total dataset is split into many small parts called mini-batches:

         num_batches = total_samples / batch_size
         for epoch:  # here one epoch means one pass over the entire dataset
             for i in range(num_batches):
                 batch = dataset[i * batch_size : (i + 1) * batch_size]

  2. Sample a batch at random each step:

         for epoch:  # here one "epoch" means one randomly sampled batch
             batch = random_sample(dataset, batch_size)
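The two schemes above can be sketched concretely in NumPy. The toy dataset, its shape, and the batch size here are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 100 samples with 4 features each (hypothetical values).
dataset = rng.normal(size=(100, 4))
batch_size = 10

# Way 1: split the whole dataset into sequential mini-batches.
num_batches = len(dataset) // batch_size
sequential_batches = [
    dataset[i * batch_size : (i + 1) * batch_size] for i in range(num_batches)
]

# Way 2: draw a "batch" by sampling indices uniformly at random.
random_idx = rng.choice(len(dataset), size=batch_size, replace=False)
random_batch = dataset[random_idx]

print(len(sequential_batches))       # number of sequential mini-batches
print(sequential_batches[0].shape)   # shape of one mini-batch
print(random_batch.shape)            # shape of one randomly sampled batch
```

Note that way 1 visits every sample exactly once per epoch, while way 2 gives no such guarantee: some samples may be drawn repeatedly across steps and others not at all.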

What is the difference between these two approaches?


Assume the training is supervised.
In that case, each sample in the dataset has a feature x and a label y.
Publicly available datasets such as MNIST and CIFAR-10/100 are sometimes stored with all samples grouped by label.

So, if you take mini-batches the first way (sequential slices, without shuffling), some mini-batches contain samples from only one class and are therefore heavily biased.

On the other hand, sampling mini-batches at random makes them diverse.
In general, the latter yields a model that generalizes better.
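The bias can be demonstrated with a small sketch. The labels below are a made-up stand-in for a dataset stored grouped by class: 50 samples of class 0 followed by 50 of class 1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical label-sorted dataset: 50 zeros then 50 ones.
labels = np.repeat([0, 1], 50)
batch_size = 10

# Sequential slicing from the sorted data: the first batch is all class 0.
sequential_batch = labels[:batch_size]
print(np.unique(sequential_batch))  # a single class: biased batch

# Random sampling: the batch will usually mix both classes.
random_idx = rng.choice(len(labels), size=batch_size, replace=False)
random_batch = labels[random_idx]
print(np.unique(random_batch))
```

In practice, most frameworks combine the two ideas (e.g. PyTorch's DataLoader with shuffle=True): reshuffle the dataset at the start of each epoch, then take sequential slices. That way every sample is seen exactly once per epoch while the batches stay diverse.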