Hi!
I'm new to NLP and transformers. I have read some documentation on the topic (BERT, "Attention Is All You Need", …).
I'm trying to implement a BERT model that I want to fine-tune, and which computes two different classifications:
- first classifier: 2 classes
- second classifier: 5 classes
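For context, here is a rough sketch of how the two heads are set up on top of the encoder. This is my own stand-in, not my exact code: the BERT encoder is replaced by a random pooled output, and hidden_size=768 (bert-base) is an assumption.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: two classification heads over a shared encoder output.
hidden_size = 768                        # assumed bert-base hidden size
classifier1 = nn.Linear(hidden_size, 2)  # first task: 2 classes
classifier2 = nn.Linear(hidden_size, 5)  # second task: 5 classes

pooled = torch.randn(2, hidden_size)     # stand-in for BERT's pooled output, batch of 2
output = [classifier1(pooled), classifier2(pooled)]
print(output[0].shape, output[1].shape)  # (2, 2) and (2, 5)
```

So each forward pass returns a list of two logit tensors, one per task.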
My labels look like this:
y:
tensor([[[[1],
[0]]],
[[[1],
[0]]]])
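For what it's worth, printing the shape of that label tensor shows the extra singleton dimensions (a minimal check, assuming PyTorch):

```python
import torch

# Labels exactly as shown above
y = torch.tensor([[[[1], [0]]],
                  [[[1], [0]]]])
print(y.shape)  # torch.Size([2, 1, 2, 1])
```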
The output of model(x) is:
[tensor([[ 0.1207, -0.2359],
[ 0.4030, -0.0475]], grad_fn=<AddmmBackward>), tensor([[ 0.1071, 0.1679, 0.0090, -0.7056, -0.1793],
[ 0.1295, -0.0781, 0.3041, -0.6385, 0.1090]],
grad_fn=<AddmmBackward>)]
As you can see, I have two classifiers with output sizes 2 and 5, and each label corresponds to an index into the corresponding classifier's output.
If I apply the criterion as:
loss = 0
for i in range(2):  # one pass per task
    loss += criterion(output[i], y[i])
For i = 0, these are output[0] and y[0]:
output[i] :
tensor([[ 0.4083, -0.1396],
[ 0.4059, -0.3079]], grad_fn=<AddmmBackward>)
y[i] :
tensor([[[1],
[1]]])
I get this error:
ValueError: Expected input batch_size (2) to match target batch_size (1).
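For reference, this standalone snippet reproduces the same failure with nn.CrossEntropyLoss (the random logits are my own stand-in for the model output, and the exact exception type/message can vary by PyTorch version):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
output0 = torch.randn(2, 2)          # stand-in logits for the 2-class head, batch of 2
y = torch.tensor([[[[1], [0]]],
                  [[[1], [0]]]])     # labels as shown above, shape (2, 1, 2, 1)

try:
    loss = criterion(output0, y[0])  # y[0] has shape (1, 2, 1), not (2,)
    failed = False
except Exception as e:               # shape mismatch between input and target
    failed = True
    print(type(e).__name__, e)

print(failed)
```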
I'm stuck on this; I have the impression I'm getting the format of y wrong, but I'm not sure. Could someone help me with this one? Thank you!