I am trying to set up a very simple Transformer (mostly to understand it). I want to take an input sequence (let’s say 50 vectors, each 512-dimensional) and, via the magic of the Transformer, generate 50 output vectors of the same dimension, where each output is basically double the corresponding input vector.

Very simple, I know. But I can’t find a single example of how to do this. Every example I find relies on token embeddings and vocabulary sizes, which seems more complex than I need when the inputs are already continuous vectors. Any help would be greatly appreciated.
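
To make the goal concrete, here is a rough sketch of the kind of thing I have in mind. It assumes PyTorch and its `nn.TransformerEncoder`, fed directly with continuous vectors, so there is no embedding layer and no vocabulary. The class name `DoublingTransformer`, the hyperparameters (`nhead=8`, two layers, `dim_feedforward=1024`, learning rate) and the random training data are all my own placeholders, and positional encoding is left out on the assumption that doubling each vector does not depend on position. I am not sure this is the right way to do it.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

SEQ_LEN, D_MODEL = 50, 512  # sizes from the question


class DoublingTransformer(nn.Module):
    """Hypothetical encoder-only model that works on continuous vectors."""

    def __init__(self, d_model=D_MODEL, nhead=8, num_layers=2):
        super().__init__()
        # batch_first=True means tensors are (batch, seq_len, d_model).
        layer = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=nhead,
            dim_feedforward=1024,
            dropout=0.0,
            batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        # Final linear map back to d_model, so outputs are unconstrained reals.
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):
        # No embedding lookup: x already holds d_model-dimensional vectors.
        return self.out(self.encoder(x))


model = DoublingTransformer()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

# A few training steps on random data; the target is simply 2 * input.
# A real run would need many more steps than this.
for step in range(5):
    x = torch.randn(8, SEQ_LEN, D_MODEL)
    loss = loss_fn(model(x), 2 * x)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

model.eval()
with torch.no_grad():
    y = model(torch.randn(1, SEQ_LEN, D_MODEL))
print(y.shape)  # torch.Size([1, 50, 512])
```

The only thing I am confident about here is the shapes: 50 vectors of size 512 go in and 50 vectors of size 512 come out. Whether a regression loss like MSE on raw vectors is the sensible way to train this is part of what I am asking.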

Thanks!