-
Does this mean that the parameters in the two models src_encoder and tgt_encoder are exactly the same? What are buffers?
-
I still can’t understand what .max(1)[1] means in this context. Can you tell me what it does, or how I should modify the code?
-
Can you explain what the computation graph is?