Should I explicitly define mask for transformer decoder or it is already there by default?

seyeeet · August 14, 2021, 12:23am

Hi
I like to train a language model and want to use transformer for it. I have heard that masks in the dcoders are very importnat while training, for example some mask like this is used in training models like bert in decoder.

I was wondering if I have to define them or there are already in the model by default?