I am new to Transformers.
My question is this, and I will explain it with an example.
I know there are two types of masks:
the subsequent (causal) mask and the padding mask. (Of course there is also a memory mask in nn.Transformer, but it was not in the original Transformer, so I omitted it.)
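To make sure I have the two mask types straight, here is how I would build them (the sizes are just for illustration, and the padding pattern is made up):

```python
import torch

tgt_len, src_len = 4, 5

# subsequent (causal) mask: float mask with -inf above the diagonal,
# so position i cannot attend to positions > i
subsequent_mask = torch.triu(
    torch.full((tgt_len, tgt_len), float("-inf")), diagonal=1
)

# padding mask: bool tensor of shape (batch, src_len),
# True at positions that are padding and should be ignored
src_key_padding_mask = torch.tensor(
    [[False, False, False, True, True]]  # last 2 source tokens are padding
)

print(subsequent_mask)
print(src_key_padding_mask)
```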
The memory is the output of the encoder, right?
So is it right that memory_key_padding_mask (in TransformerDecoder) is the same as src_key_padding_mask (in nn.TransformerEncoder)?
Or, to keep it simple: when I want to reproduce the original Transformer (from the paper), is it right that I don't have to pass memory_mask at all, and memory_key_padding_mask is just the src_key_padding_mask reused?
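To make the question concrete, this is how I am currently calling the model. The shapes and sizes are arbitrary, and the line I am unsure about is the one reusing the source padding mask as memory_key_padding_mask:

```python
import torch
import torch.nn as nn

model = nn.Transformer(d_model=32, nhead=4, batch_first=True)

src = torch.rand(2, 10, 32)  # (batch, src_len, d_model)
tgt = torch.rand(2, 7, 32)   # (batch, tgt_len, d_model)

# subsequent (causal) mask for decoder self-attention
tgt_mask = model.generate_square_subsequent_mask(7)

# True marks padded source positions
src_key_padding_mask = torch.zeros(2, 10, dtype=torch.bool)
src_key_padding_mask[:, 8:] = True  # pretend the last 2 src tokens are padding

out = model(
    src, tgt,
    tgt_mask=tgt_mask,
    src_key_padding_mask=src_key_padding_mask,
    # this is my assumption: reuse the src padding mask for the memory,
    # and pass no memory_mask at all
    memory_key_padding_mask=src_key_padding_mask,
)
print(out.shape)  # (batch, tgt_len, d_model)
```

Is this how the original paper's setup maps onto nn.Transformer's arguments?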