Transformer train_model

Is cross_entropy built into the Transformer, or can I choose MSELoss?

If you are talking about this training code, I think you can just change the loss function there.
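As a minimal sketch (not the original training code), assuming a typical PyTorch setup: the loss is created and applied in the training loop, not inside the model, so swapping cross-entropy for MSE is a one-line change. The model and optimizer below are placeholders.

```python
import torch
import torch.nn as nn

model = nn.Linear(6, 1)                # stand-in for the actual Transformer
criterion = nn.MSELoss()               # instead of nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

src = torch.randn(8, 6)                # dummy batch
target = torch.rand(8, 1)              # regression targets in [0, 1]

pred = model(src)
loss = criterion(pred, target)         # same loop, different criterion
loss.backward()
optimizer.step()
```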

I have a sequence of 50 inputs. Each input is a vector of dimension 6; this is my src_vocab. tgt_vocab is a number between 0 and 1 that corresponds to each sequence.

I have a lot of such sequences, and each of them corresponds to a number between 0 and 1.

I feed the network a two-dimensional 50×6 tensor, and at the output I get a number between 0 and 1.

I need to compute the loss as MSELoss.
Compared with translation: I have a sentence as input, and instead of a translation I get a single number as output.
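A minimal sketch of what that setup could look like, assuming an encoder-only Transformer with a pooled regression head and MSELoss; the class name, layer sizes, and hyperparameters here are illustrative, not taken from your code.

```python
import torch
import torch.nn as nn

class SeqRegressor(nn.Module):
    """Hypothetical model: a 50x6 input sequence mapped to one value in [0, 1]."""
    def __init__(self, feat_dim=6, d_model=32, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Linear(feat_dim, d_model)            # 6 -> d_model
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, 1)                     # pooled features -> scalar

    def forward(self, x):                                     # x: (batch, 50, 6)
        h = self.encoder(self.embed(x))                       # (batch, 50, d_model)
        return torch.sigmoid(self.head(h.mean(dim=1)))        # (batch, 1), in [0, 1]

model = SeqRegressor()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(16, 50, 6)        # a batch of 16 sequences of shape 50x6
y = torch.rand(16, 1)             # target numbers between 0 and 1

for _ in range(3):                # a few toy training steps
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```

The sigmoid on the head is just one way to keep the prediction in [0, 1]; you could also leave the output unconstrained and rely on MSELoss alone.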