I am a new “convert” from TensorFlow…
I want to use nn.Transformer for a non-NLP task, mainly a seq2seq task…
I need a simple example in which I overfit the model on a single pair (let’s say srcseq=[1,2,3], dstseq=[4,5,6]).
I need both the training & the inference code…
Can someone help a new convert?
Welcome. See the tutorial here
The tutorial only demonstrates the encoder layer, not the full Transformer model…
I need to know how to train the model and how to autoregressively generate a sequence.
Depending on the problem, you need either TransformerEncoder or the full Transformer (encoder + decoder). Training the model is similar to the training function in the tutorial. For non-NLP problems, you need to figure out how to calculate the loss. Unless you show more details about your problem, I cannot give more help. The tutorial covers a word-level language-modeling problem: put more concisely, it predicts the next word following a sequence.
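For what it's worth, here is a minimal sketch of the encoder + decoder case, overfitting the single pair from the question and then decoding greedily. Several things in it are my own assumptions and not from the tutorial: the BOS/EOS token ids (0 and 7), the learned positional embedding, the small model sizes, and the names Seq2Seq, train and greedy_decode. It treats the sequence elements as discrete tokens and uses cross-entropy as the loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Assumed (hypothetical) token ids: 0 = start-of-sequence, 7 = end-of-sequence.
BOS, EOS, VOCAB = 0, 7, 8
src = torch.tensor([[1, 2, 3]])            # shape (batch=1, src_len=3)
tgt = torch.tensor([[BOS, 4, 5, 6, EOS]])  # shape (batch=1, tgt_len=5)


class Seq2Seq(nn.Module):
    def __init__(self, vocab=VOCAB, d_model=32, max_len=16):
        super().__init__()
        self.tok = nn.Embedding(vocab, d_model)
        self.pos = nn.Embedding(max_len, d_model)  # learned positions (an assumption)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=1, num_decoder_layers=1,
            dim_feedforward=64, dropout=0.0, batch_first=True,
        )
        self.out = nn.Linear(d_model, vocab)

    def embed(self, x):
        positions = torch.arange(x.size(1), device=x.device)
        return self.tok(x) + self.pos(positions)

    def forward(self, src, tgt_in):
        # Causal mask: decoder position i may only attend to positions <= i.
        n = tgt_in.size(1)
        causal = torch.triu(
            torch.full((n, n), float("-inf"), device=tgt_in.device), diagonal=1
        )
        h = self.transformer(self.embed(src), self.embed(tgt_in), tgt_mask=causal)
        return self.out(h)  # (batch, tgt_len, vocab) logits


def train(model, steps=300):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    losses = []
    model.train()
    for _ in range(steps):
        # Teacher forcing: feed tgt[:-1], predict tgt[1:].
        logits = model(src, tgt[:, :-1])
        loss = loss_fn(logits.reshape(-1, VOCAB), tgt[:, 1:].reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
        losses.append(loss.item())
    return losses


@torch.no_grad()
def greedy_decode(model, src, max_len=10):
    model.eval()
    ys = torch.tensor([[BOS]])
    for _ in range(max_len):
        # Re-run the decoder on everything generated so far, keep the last step.
        next_tok = model(src, ys)[:, -1].argmax(dim=-1, keepdim=True)
        ys = torch.cat([ys, next_tok], dim=1)
        if next_tok.item() == EOS:
            break
    return ys[0, 1:].tolist()  # drop the leading BOS


model = Seq2Seq()
losses = train(model)
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
print(greedy_decode(model, src))
```

Once the loss is close to zero, the decoded list should match the target followed by the EOS id. If your inputs are continuous values instead of token ids, one option is to swap the nn.Embedding layers for nn.Linear projections and use a regression loss such as nn.MSELoss, but that depends on your problem.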