In each of my sentences the 0 labels are much rarer than the 1's (for token-level classification). I use batches and compute the loss (CrossEntropy) after each batch. How should I create the class weights vector and use it in the loss calculation? Please suggest!
You could initialize the weights with e.g. the inverse class frequency, which would then add more weight to the rare classes.
Thanks @ptrblck, so it would be something like class_freq_zeros = 1 / <no of 0 tokens in whole batch of 8 sentences>, and similarly for the 1's?
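A minimal sketch of that per-batch scheme, assuming a binary token-classification setup with dummy labels and logits (the tensors here are illustrative, not from the original thread):

```python
import torch
import torch.nn as nn

# Hypothetical flattened token labels for one batch of 8 sentences;
# class 0 is the rare class here
batch_labels = torch.tensor([1, 1, 0, 1, 1, 1, 0, 1])

# Inverse-frequency weights: 1 / count per class, so the rare
# class 0 gets the larger weight
counts = torch.bincount(batch_labels, minlength=2).float()
weights = 1.0 / counts

loss_fn = nn.CrossEntropyLoss(weight=weights)

# Dummy model outputs: one logit pair per token
logits = torch.randn(batch_labels.size(0), 2)
loss = loss_fn(logits, batch_labels)
```

Note that a class absent from a batch would give a zero count and an infinite weight, so per-batch weighting needs a guard (e.g. clamping counts to a minimum of 1) in practice.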
Yes, a per-batch weighting would work, but you could also check the class distribution of the entire dataset and set the weights once before training, then compare which approach works better.
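The dataset-level alternative can be sketched like this, again with hypothetical labels standing in for the real dataset; the weights are computed once and reused for every batch:

```python
import torch
import torch.nn as nn

# Hypothetical token labels collected over the whole dataset
all_labels = torch.tensor([1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1])

# Compute inverse-frequency weights once before training;
# scaling by the total count keeps the values in a moderate range
counts = torch.bincount(all_labels, minlength=2).float()
weights = counts.sum() / counts

# Reuse the same criterion for every batch during training
criterion = nn.CrossEntropyLoss(weight=weights)

# Example batch: dummy logits and the matching label slice
batch_labels = all_labels[:4]
batch_logits = torch.randn(4, 2)
loss = criterion(batch_logits, batch_labels)
```

This avoids the empty-class problem of per-batch counts and keeps the loss scale consistent across batches.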