Training BERT for multi-class classification: ValueError: Expected input batch_size (1) to match target batch_size (512)

Will this help? See the related question "Loss functions for batches".
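In case it is useful, here is a minimal sketch of how this kind of mismatch can arise. It assumes PyTorch's `torch.nn.functional.cross_entropy`; the sizes and tensor names (`num_classes`, `logits`, `bad_target`, `good_target`) are made up for illustration and are not taken from your code. Since the question does not show the model or the data pipeline, this is only one possible cause: the logits hold one row per example, but the target holds one entry per token position.

```python
import torch
import torch.nn.functional as F

num_classes = 4   # hypothetical number of classes
seq_len = 512     # chosen to match the 512 in the error message

# One example in the batch: logits have shape (batch_size, num_classes) = (1, 4).
logits = torch.randn(1, num_classes)

# Mismatch: the target has one entry per token position, shape (512,),
# so its first dimension (512) disagrees with the logits' first dimension (1).
bad_target = torch.zeros(seq_len, dtype=torch.long)
try:
    F.cross_entropy(logits, bad_target)
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)

# Match: one class index per example, shape (1,).
good_target = torch.tensor([2])
loss = F.cross_entropy(logits, good_target)
print(loss.item())
```

If this matches your situation, printing `logits.shape` and `labels.shape` just before the loss call should show which of the two has the unexpected first dimension.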