How to use/train Transformer in PyTorch

This answer explains the working process. I don't know exactly which problem you are facing, but hopefully the responses on the other thread can help you out.
