I’m wondering whether we can control the class distribution in the DataLoader so that every minibatch has the same class distribution. I understand that we can apply class weights to the loss function. That is fine for full-batch training, but I’m not sure how the model’s learning behaves when fixed class weights are applied in minibatch training, where each minibatch can have a different class distribution.

Or … should we apply fixed class weights to the loss function and also multiply the loss for each minibatch directly by sample weights?

You could use a WeightedRandomSampler as described here. Note that it’s a random sampling process and thus a perfectly balanced batch is not guaranteed.
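As a rough illustration of the sampler approach, here is a minimal sketch that assumes a made-up two-class dataset (900 vs. 100 samples) and inverse-frequency weights. The toy tensors, the batch size, and the variable names are assumptions for the example and are not taken from the linked post.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

torch.manual_seed(0)

# Hypothetical imbalanced toy data: 900 samples of class 0, 100 of class 1.
targets = torch.cat([torch.zeros(900, dtype=torch.long),
                     torch.ones(100, dtype=torch.long)])
features = torch.randn(len(targets), 8)
dataset = TensorDataset(features, targets)

# Assumed weighting scheme: each sample gets the inverse frequency of its class,
# so both classes carry the same total sampling probability.
class_counts = torch.bincount(targets)
class_weights = 1.0 / class_counts.float()
sample_weights = class_weights[targets]

# Draw as many samples per epoch as the dataset holds, with replacement.
sampler = WeightedRandomSampler(weights=sample_weights,
                                num_samples=len(sample_weights),
                                replacement=True)

# The sampler replaces shuffling, so shuffle is left at its default (False).
loader = DataLoader(dataset, batch_size=100, sampler=sampler)

# Collect the per-batch class counts to see how balanced the batches are.
batch_class_counts = [torch.bincount(y, minlength=2).tolist() for _, y in loader]
for counts in batch_class_counts[:3]:
    print(counts)
```

The printed counts should hover around an even split but will differ from batch to batch, since the sampling is random. With replacement enabled, minority-class samples are drawn repeatedly within an epoch.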

Do I understand correctly that it is a valid approach to define just a cross-entropy loss (for example) without class weights, and instead compute sample weights in each minibatch and multiply the loss by them directly?

You are not multiplying the loss by anything. The WeightedRandomSampler uses the passed weights to draw samples from the dataset, ideally creating a balanced batch. The loss calculation is then a standard one without any modifications or weighting.
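To make that concrete, here is a small sketch of one training epoch, assuming the same kind of made-up imbalanced data and a placeholder linear model (both are assumptions for illustration). The weights only go into the sampler; the criterion is created without a weight argument.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

torch.manual_seed(0)

# Hypothetical imbalanced toy data: 900 samples of class 0, 100 of class 1.
targets = torch.cat([torch.zeros(900, dtype=torch.long),
                     torch.ones(100, dtype=torch.long)])
features = torch.randn(len(targets), 8)

# The weights are used only here, to decide which samples get drawn.
sample_weights = (1.0 / torch.bincount(targets).float())[targets]
sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(sample_weights),
                                replacement=True)
loader = DataLoader(TensorDataset(features, targets),
                    batch_size=100, sampler=sampler)

model = nn.Linear(8, 2)            # placeholder model for the sketch
criterion = nn.CrossEntropyLoss()  # standard loss, no class weights
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

epoch_losses = []
for x, y in loader:
    optimizer.zero_grad()
    loss = criterion(model(x), y)  # plain mean loss over the batch
    loss.backward()
    optimizer.step()
    epoch_losses.append(loss.item())
```

The rebalancing happens entirely in which samples reach the model, so nothing in the loop needs to know about the class frequencies.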