I was wondering how I should shift the output for a Transformer when I also want to use the <sos> token encoding produced by the encoder for additional classification. I am training reconstruction and classification at the same time.
I know I have to shift the decoder input to the right. But in this case I would also need the <sos> token of the source sequence.
This would be my idea, but I am not sure if it is correct:
src = <sos> I am fine <eos>
trg = <sos> I am fine
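
To make the idea concrete, here is a minimal sketch in plain Python of the split I have in mind. The helper name `make_training_triplet` and the token strings are just placeholders for illustration, and I am assuming the usual teacher-forcing setup where the decoder labels are the same sequence shifted one step to the left (so they end with <eos>).

```python
# Hypothetical special tokens and helper; the names are illustrative only.
SOS, EOS = "<sos>", "<eos>"

def make_training_triplet(words):
    """Build encoder input, decoder input, and decoder labels from a word list."""
    full = [SOS] + list(words) + [EOS]
    src = full          # encoder sees the whole sequence, including <sos>
    trg_in = full[:-1]  # decoder input: drops <eos>, starts with <sos>
    trg_out = full[1:]  # decoder labels: drops <sos>, ends with <eos>
    return src, trg_in, trg_out

src, trg_in, trg_out = make_training_triplet(["I", "am", "fine"])
print("src    =", " ".join(src))
print("trg_in =", " ".join(trg_in))
print("labels =", " ".join(trg_out))
```

With this split the encoder output at position 0 (the <sos> position) would still be available for the classification head, while the decoder is trained to predict `trg_out` from `trg_in`.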