I followed the tutorial given here. However, the implementation of Transformer in the PyTorch codebase is significantly different, the latter being closer to the approach proposed by the authors.
Can someone guide me on how to use the PyTorch Transformer for a sequence-to-sequence translation task? I have described the problem in some detail below.
Transformer(src, tgt) parameters:
src: the sequence to the encoder (required), tgt: the sequence to the decoder (required).
EDIT: For example, for an English-language dataset,
src: the dataset has shape
[32, 5, 256] where
32 is the total number of sentences in the dataset,
5 is the number of words in every sentence and
256 is the embedding dimension for each word.
tgt: I don’t know what to provide for this argument to the Transformer.
EDIT: I have a similar dataset for French; say its shape is [32, 7, 256].
Assume that the positional encodings have already been added to the src above.
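For what it's worth, here is a minimal sketch of how I imagine the call would look, using the shapes above. Two assumptions: (1) during training, tgt is the embedded French sequence fed in with teacher forcing, together with a causal mask so the decoder cannot peek at future words; (2) by default nn.Transformer expects inputs shaped (seq_len, batch, d_model), so the batch-first tensors above need a permute. The random tensors below just stand in for the embedded-and-positionally-encoded datasets.

```python
import torch
import torch.nn as nn

# Stand-ins for the embedded datasets from the question (batch, seq, embed)
src = torch.rand(32, 5, 256)  # English: 32 sentences, 5 words, 256-dim embeddings
tgt = torch.rand(32, 7, 256)  # French: 32 sentences, 7 words, 256-dim embeddings

model = nn.Transformer(d_model=256, nhead=8)

# nn.Transformer defaults to (seq_len, batch, d_model), so swap the
# batch and sequence dimensions before the forward pass.
src_t = src.permute(1, 0, 2)  # -> (5, 32, 256)
tgt_t = tgt.permute(1, 0, 2)  # -> (7, 32, 256)

# Causal mask: position i in the decoder may only attend to positions <= i.
tgt_mask = model.generate_square_subsequent_mask(tgt_t.size(0))

out = model(src_t, tgt_t, tgt_mask=tgt_mask)
print(out.shape)  # same shape as tgt_t: (7, 32, 256)
```

The output has one vector per target position, which would then go through a final linear layer over the French vocabulary to get word logits. (Recent PyTorch versions also accept batch_first=True in the nn.Transformer constructor, which would remove the permutes.)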