TypeError: forward() got an unexpected keyword argument 'return_dict' BERT CLASSIFICATION HUGGINGFACE with ray tuning

I think you are hitting this issue again.

Based on your last statement in the linked topic, I guess your model output has the shape [batch_size=2, seq_len=512, nb_classes=1024], while the target only contains class indices in the shape [batch_size=2].
This doesn’t work, since:

  • the target should contain a class index for every sample in the batch dimension as well as for every step in the temporal dimension (seq_len), i.e. it should have the shape [batch_size, seq_len],
  • once this is fixed, you would have to .permute(0, 2, 1) the model output, as the class dimension is expected in dim1, which gives the shape [batch_size, nb_classes, seq_len].
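The two points above can be sketched as follows. This is only a minimal illustration, assuming the criterion is nn.CrossEntropyLoss (the loss is not named in your post) and using random tensors as stand-ins for your model output and target:

```python
import torch
import torch.nn as nn

batch_size, seq_len, nb_classes = 2, 512, 1024

# Stand-in for the model output: [batch_size, seq_len, nb_classes]
output = torch.randn(batch_size, seq_len, nb_classes)

# One class index per sample and per time step: [batch_size, seq_len]
target = torch.randint(0, nb_classes, (batch_size, seq_len))

# Assumption: the loss in use is nn.CrossEntropyLoss
criterion = nn.CrossEntropyLoss()

# Move the class dimension to dim1: [batch_size, nb_classes, seq_len]
permuted = output.permute(0, 2, 1)

# With the default reduction the loss is averaged over all batch and time steps
loss = criterion(permuted, target)
print(permuted.shape, target.shape, loss.shape)
```

Passing the unpermuted output together with a [batch_size] target raises a shape error instead, which is the mismatch described above.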

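If you don’t have a target value for each temporal step in the seq_len dimension, you would have to reduce this dimension inside your model (for example by pooling over it or selecting a single step), so that the output has the shape [batch_size, nb_classes] and matches the [batch_size] target.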
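As a rough sketch of what reducing the seq_len dimension could look like, again assuming nn.CrossEntropyLoss and random stand-in tensors: mean pooling and picking the first position are just two illustrative options, and which reduction makes sense depends on your model. In practice this step would live inside the model's forward method.

```python
import torch
import torch.nn as nn

batch_size, seq_len, nb_classes = 2, 512, 1024

# Stand-in for the per-step model output: [batch_size, seq_len, nb_classes]
output = torch.randn(batch_size, seq_len, nb_classes)

# One class index per sample only: [batch_size]
target = torch.randint(0, nb_classes, (batch_size,))

# Assumption: the loss in use is nn.CrossEntropyLoss
criterion = nn.CrossEntropyLoss()

# Option 1 (illustrative): average the logits over the temporal dimension
pooled_mean = output.mean(dim=1)   # [batch_size, nb_classes]

# Option 2 (illustrative): keep only the first time step
pooled_first = output[:, 0]        # [batch_size, nb_classes]

loss_mean = criterion(pooled_mean, target)
loss_first = criterion(pooled_first, target)
print(pooled_mean.shape, pooled_first.shape)
```

Either way, the class dimension already sits in dim1 after the reduction, so no permute is needed.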

