Problem: When I train the Transformer, I find that the attention values in the encoder and decoder are nearly the same.
Encoder:
You need to give more details and post some of your code here to get more help.
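One detail that might help is a number for how uniform the attention weights actually are. Below is a minimal sketch in plain Python, assuming "nearly the same" means each row of attention weights is close to uniform over the keys. The helper names `softmax` and `normalized_entropy` and the example logits are hypothetical, made up for illustration, and not taken from the original post or from any library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(weights):
    """Entropy of one attention row divided by log(n).

    A value of 1.0 means the row is exactly uniform; values near 0
    mean almost all of the weight sits on a single key.
    """
    n = len(weights)
    if n < 2:
        return 0.0
    h = -sum(w * math.log(w) for w in weights if w > 0.0)
    return h / math.log(n)


# Hypothetical attention logits for one query over four keys.
flat_row = softmax([0.01, 0.02, 0.00, 0.01])  # nearly identical scores
peaked_row = softmax([4.0, 0.0, 0.0, 0.0])    # one key dominates

print(normalized_entropy(flat_row))    # close to 1.0 (almost uniform)
print(normalized_entropy(peaked_row))  # well below 1.0
```

Reporting this value per layer and per head, together with the code that builds the attention masks, would make it easier to tell whether the weights are genuinely flat or only look similar in a plot.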