I’m new to NLP and transformers. I have read the documentation on this (BERT, Attention Is All You Need, …).
I’m trying to implement a BERT model that I want to fine-tune to compute two different classifications:
- first classifier: 2 classes
- second classifier: 5 classes
My labels look like this:
y : tensor([[[, ]], [[, ]]])
The output of model(x) is:
Output : [tensor([[ 0.1207, -0.2359], [ 0.4030, -0.0475]], grad_fn=<AddmmBackward>), tensor([[ 0.1071, 0.1679, 0.0090, -0.7056, -0.1793], [ 0.1295, -0.0781, 0.3041, -0.6385, 0.1090]], grad_fn=<AddmmBackward>)]
As you can see, I have 2 classifiers, with 2 and 5 outputs respectively, and each label is the index of the correct class in the corresponding classifier's output.
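The setup described above can be sketched roughly as follows. This is only a minimal illustration under assumptions, not the actual model: `TwoHeadClassifier`, `hidden_size=768`, and the random `pooled` tensor standing in for the BERT encoder output are all hypothetical names and values chosen for the example.

```python
import torch
import torch.nn as nn


class TwoHeadClassifier(nn.Module):
    """Hypothetical stand-in: two linear heads on top of a shared encoding."""

    def __init__(self, hidden_size=768, n_classes=(2, 5)):
        super().__init__()
        # One linear classification head per task (2 classes and 5 classes).
        self.heads = nn.ModuleList([nn.Linear(hidden_size, n) for n in n_classes])

    def forward(self, pooled):
        # pooled: (batch_size, hidden_size), e.g. the encoder's pooled output.
        # Returns a list with one logits tensor of shape (batch_size, n) per head.
        return [head(pooled) for head in self.heads]


torch.manual_seed(0)
model = TwoHeadClassifier()
pooled = torch.randn(2, 768)  # random stand-in for the encoder output, batch of 2
output = model(pooled)
print([tuple(o.shape) for o in output])  # [(2, 2), (2, 5)]
```

Each element of `output` has the batch dimension first, so `output[i]` picks classifier `i`, not sample `i`.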
If I apply the criterion like this:
for i in range(2):  # for each label
    loss += criterion(output[i], y[i])
For i = 0, I printed output[i] and y[i]:
output[i] : tensor([[ 0.4083, -0.1396], [ 0.4059, -0.3079]], grad_fn=<AddmmBackward>)
y[i] : tensor([[, ]])
I get this error:
ValueError: Expected input batch_size (2) to match target batch_size (1).
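The shapes behind this error can be inspected with a small sketch. It assumes that `criterion` is `nn.CrossEntropyLoss` and that `y` has shape `(batch_size, 1, 2)`; the label values and the random logits are made up, and none of this is confirmed by the post itself. Under those assumptions, `output[i]` selects classifier `i` while `y[i]` selects sample `i`, so their batch dimensions disagree. Slicing with `y[:, 0, i]` is one possible way to get a target of shape `(batch_size,)` for classifier `i`.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()  # assumed loss function

# Random logits with the same shapes as the model outputs: (batch, 2) and (batch, 5).
torch.manual_seed(0)
output = [
    torch.randn(2, 2, requires_grad=True),
    torch.randn(2, 5, requires_grad=True),
]

# Assumed label layout (batch_size, 1, n_labels), with made-up values.
y = torch.tensor([[[0, 3]], [[1, 4]]])

# output[0] covers the whole batch, but y[0] is only the first sample's labels.
print(output[0].shape, y[0].shape)  # torch.Size([2, 2]) torch.Size([1, 2])

loss = 0.0
for i in range(2):  # for each classifier
    target = y[:, 0, i]  # shape (batch_size,): label i for every sample
    loss = loss + criterion(output[i], target)

loss.backward()
print(loss.item())
```

If the labels are actually stored in a different layout, the slice has to change accordingly; the point is only that the target passed to the criterion needs one class index per sample in the batch.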
I’m stuck on this. I suspect I’m getting something wrong about the format of y, but I’m not sure. Could someone help me with this? Thank you.