Is cross_entropy built into the transformer, or can I choose MSELoss?
If you are talking about this training, I think you can just change it here:
Each of my input sequences has length 50, and each element of a sequence is a vector of dimension 6. This takes the place of src_vocab. In place of tgt_vocab I have a single number between 0 and 1 for each sequence.
I have a lot of such sequences, and each of them corresponds to a number between 0 and 1.
I feed the network a two-dimensional 50x6 input, and at the output I get a number between 0 and 1.
I need to compute the loss with MSELoss.
Compared with translation: my input plays the role of a sentence, and the "translation" I get at the output is a single number.
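For what it's worth, here is a minimal sketch of how the setup described above could look in plain PyTorch. It is not the code from this thread's training script. The class name `SequenceRegressor`, the layer sizes, the learned positional embedding, the mean pooling, and the sigmoid on the output are all assumptions made for illustration. The point is only that the loss lives outside the transformer layers, so `nn.MSELoss` can be used in place of cross-entropy once the model outputs one number per sequence.

```python
import torch
import torch.nn as nn


class SequenceRegressor(nn.Module):
    """Hypothetical encoder-only model: (batch, 50, 6) -> (batch,) in [0, 1]."""

    def __init__(self, input_dim=6, seq_len=50, d_model=32, nhead=4, num_layers=2):
        super().__init__()
        # Inputs are continuous vectors, so a linear projection replaces
        # the token embedding that a translation model would use.
        self.input_proj = nn.Linear(input_dim, d_model)
        # Learned positional embedding (an assumption; sinusoidal also works).
        self.pos_emb = nn.Parameter(torch.zeros(1, seq_len, d_model))
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, dim_feedforward=64, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        # One scalar per sequence instead of a distribution over a vocabulary.
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):
        h = self.input_proj(x) + self.pos_emb   # (batch, seq_len, d_model)
        h = self.encoder(h)                     # (batch, seq_len, d_model)
        pooled = h.mean(dim=1)                  # average over the 50 positions
        # Sigmoid keeps the prediction between 0 and 1.
        return torch.sigmoid(self.head(pooled)).squeeze(-1)


torch.manual_seed(0)
model = SequenceRegressor()
criterion = nn.MSELoss()  # used instead of cross-entropy
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random stand-in data: 8 sequences of shape 50x6, one target in [0, 1] each.
x = torch.rand(8, 50, 6)
y = torch.rand(8)

# One training step.
pred = model(x)
loss = criterion(pred, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Since the target is a single number rather than a sequence, no decoder is needed in this sketch. If you keep an encoder-decoder model instead, the same idea applies: replace the final vocabulary projection with a one-unit output and pass that to `nn.MSELoss`.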