Not understanding target in Transformer

Hi everyone, first post here, so apologies if I get any forum conventions wrong. I was reading this article: A detailed guide to PyTorch’s nn.Transformer() module. | by Daniel Melchor | Towards Data Science. As an example, the author trains a transformer to predict the continuation of a sequence. What I’d like to do instead is reverse a sequence, feeding the model the input either as char-to-int dictionary conversions or as a one-hot-encoded array. I don’t really understand why he passes a different target to the model (the transformer) than to the loss function. I tried passing the same thing (the reversed sequence) to both, but the model doesn’t seem to learn anything. Thanks in advance.

This tutorial might provide better clarity:

A mask is applied in the transformer, but it has nothing to do with the targets, which is where I think the confusion may be coming in. In fact, the mask is based only on the sequence length.

The mask just blocks the model from seeing the next tokens at each “time step”.
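To make that concrete, here is a minimal sketch of the causal (subsequent) mask. It is built purely from the sequence length, with no reference to the target values; `-inf` entries are the positions each token is blocked from attending to:

```python
import torch

seq_len = 4  # only the length matters, not the tokens themselves

# Upper-triangular -inf mask: position i may attend to positions <= i.
# This matches the mask produced by nn.Transformer.generate_square_subsequent_mask.
mask = torch.triu(torch.full((seq_len, seq_len), float('-inf')), diagonal=1)
print(mask)
# tensor([[0., -inf, -inf, -inf],
#         [0., 0., -inf, -inf],
#         [0., 0., 0., -inf],
#         [0., 0., 0., 0.]])
```

The `-inf` values become zeros after the softmax inside attention, so each "time step" simply cannot see the tokens that come after it.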

Thanks for the answer, I had already looked at that PyTorch tutorial. Your answer doesn’t really solve my problem, because I still don’t understand what target the tutorial is feeding to the model (other than the mask, as you say), which is apparently different from the one used for the loss function.

model_input = ["Thanks", "for", "the"]
model_target = ["for", "the", "answer"]  # offset by one

The target in the example above is the word “answer”. The target is not fed to the model. Very simple example, but I’m not sure how else I can explain it.
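A minimal sketch of the same idea with tensors, assuming the target sequence is a batch of token ids (the names `tgt_input` and `tgt_expected` are illustrative, not from the article). The decoder input and the loss target are both slices of the same sequence, offset by one:

```python
import torch

# Batch of 1: a reversed sequence of token ids (illustrative values).
tgt = torch.tensor([[5, 4, 3, 2, 1]])

tgt_input = tgt[:, :-1]     # fed to the model (decoder input): [5, 4, 3, 2]
tgt_expected = tgt[:, 1:]   # fed to the loss function:         [4, 3, 2, 1]

# In the training loop this would look roughly like:
#   output = model(src, tgt_input, tgt_mask=...)
#   loss = criterion(output, tgt_expected)
```

So the model never sees the token it is being asked to predict at each position; the mask plus the one-step offset is what makes next-token training work.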