Multi-GPU model

The code snippets you’ve posted should already create the encoder on GPU0 and the decoder on GPU1.
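You would just have to make sure the tensors are on the right device at each stage: the input to the encoder has to be on GPU0, and the encoder output has to be moved to GPU1 before it is passed to the decoder.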
Have a look at this small example for model sharding.
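In case it helps, here is a minimal sketch of the idea. It is not your actual model: the class name `ShardedAutoencoder`, the layer sizes, and the dummy target are all made up for illustration, and I'm assuming two visible GPUs (`cuda:0` and `cuda:1`), with a CPU fallback so the snippet still runs on a machine without them.

```python
import torch
import torch.nn as nn

# Assumption: two GPUs are visible. Otherwise fall back to CPU so the sketch still runs.
if torch.cuda.device_count() >= 2:
    dev0, dev1 = torch.device("cuda:0"), torch.device("cuda:1")
else:
    dev0 = dev1 = torch.device("cpu")


class ShardedAutoencoder(nn.Module):
    """Hypothetical model with the encoder on dev0 and the decoder on dev1."""

    def __init__(self, in_features=16, hidden=4):
        super().__init__()
        # Each submodule's parameters live on its own device.
        self.encoder = nn.Linear(in_features, hidden).to(dev0)
        self.decoder = nn.Linear(hidden, in_features).to(dev1)

    def forward(self, x):
        x = x.to(dev0)                   # input must match the encoder's device
        h = torch.relu(self.encoder(x))
        h = h.to(dev1)                   # move the activation before the decoder
        return self.decoder(h)


model = ShardedAutoencoder()
x = torch.randn(8, 16)                   # created on CPU, moved inside forward()
out = model(x)                           # output lives on dev1

# The target has to be on the same device as the output.
target = torch.randn(8, 16, device=out.device)
loss = nn.functional.mse_loss(out, target)
loss.backward()                          # autograd handles the cross-device copy
```

Note that the `.to()` call between the encoder and the decoder is part of the autograd graph, so gradients flow back to the encoder on the first device without any extra work.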