Does it make sense to mix the labels in each batch?

When using a binary classification model:

  • When training a deep model, at each training step the model receives a batch (e.g. a batch of 32 samples).
  • Let’s assume that each training batch always contains 16 samples with label ‘0’ and 16 samples with label ‘1’.
  1. Does it matter how those samples are arranged within the batch?
  2. Is there a difference between the 16 samples labeled “0” appearing first, followed by the 16 labeled “1”, versus all 32 samples being mixed together?

No and no. The sample order inside the batch won’t matter unless your model treats the batch as a sequence (i.e. as a sequence dimension rather than a batch dimension), because the loss and gradients are averaged over the batch regardless of order. Shuffling the dataset as a whole still matters: the batches in each epoch should be formed by random sampling.
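
As a minimal sketch (using PyTorch purely as an illustration; the question does not name a framework, and the tiny model below is hypothetical), the snippet compares a batch where all the ‘0’ labels come first against the same batch with the samples randomly permuted, and checks that the loss and gradients match:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical tiny binary classifier, just for the demonstration.
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.BCEWithLogitsLoss()

# One batch: 16 samples labeled 0 followed by 16 samples labeled 1.
x = torch.randn(32, 10)
y = torch.cat([torch.zeros(16), torch.ones(16)]).unsqueeze(1)

def batch_loss_and_grads(xb, yb):
    """Compute the mean loss over the batch and the resulting gradients."""
    model.zero_grad()
    loss = loss_fn(model(xb), yb)
    loss.backward()
    return loss.item(), [p.grad.clone() for p in model.parameters()]

# Same batch, but with the samples shuffled inside it.
perm = torch.randperm(32)
loss_sorted, grads_sorted = batch_loss_and_grads(x, y)
loss_mixed, grads_mixed = batch_loss_and_grads(x[perm], y[perm])

# Identical up to floating-point rounding: the mean over the batch
# does not depend on the order of its elements.
print(abs(loss_sorted - loss_mixed) < 1e-6)
print(all(torch.allclose(a, b, atol=1e-6)
          for a, b in zip(grads_sorted, grads_mixed)))

# Epoch-level shuffling, by contrast, does matter; with torch.utils.data it is
# typically handled by DataLoader(dataset, batch_size=32, shuffle=True).
```

The within-batch order is irrelevant here only because the batch dimension is reduced by a mean; a model with a sequence dimension (e.g. an RNN or Transformer reading the 32 items as one sequence) would be a different story.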