I have followed this guide on transformers, and found that the input/output sizes of the TransformerEncoder layer are:
Input -> [source sequence length, batch size, embedding dimension]
Output -> [source sequence length, batch size, embedding dimension]
My question: my target sequence has a different length than the source sequence,
target -> [target sequence length, batch size, embedding dimension]
so can I somehow change the code so that the transformer's output matches the target's dimensions? Or, if not, how do I calculate the loss, given that the dimensions don't match?
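For reference, here is a minimal sketch of the shapes I mean (assuming PyTorch's `nn.TransformerEncoder` / `nn.Transformer` with the default `batch_first=False` layout; the sizes and layer counts are just placeholders). The full encoder-decoder's output follows the *target* length, which seems relevant to my question:

```python
import torch
import torch.nn as nn

d_model, nhead = 16, 4

# Encoder alone: output length always equals the input (source) length.
enc_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead)
encoder = nn.TransformerEncoder(enc_layer, num_layers=2)

src = torch.randn(10, 2, d_model)   # [src_len=10, batch=2, d_model=16]
enc_out = encoder(src)              # shape: [10, 2, 16] -- same as src

# Full encoder-decoder: the decoder output follows the target length,
# so source and target lengths are free to differ.
model = nn.Transformer(d_model=d_model, nhead=nhead,
                       num_encoder_layers=2, num_decoder_layers=2)
tgt = torch.randn(7, 2, d_model)    # [tgt_len=7, batch=2, d_model=16]
out = model(src, tgt)               # shape: [7, 2, 16] -- same as tgt

# The loss can then be computed against the target directly. MSE here is
# a hypothetical choice; a token-prediction task would instead project
# `out` to vocabulary logits and use CrossEntropyLoss.
loss = nn.MSELoss()(out, tgt)
```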