How to properly pass a distributed model to external modules?

I have noticed that model = torch.nn.DataParallel(model, device_ids=[0, 2]) must be executed within the module that uses the model. That is, I cannot wrap the model in one module and pass it to another, external module. Why is that, and how can I pass models around without getting RuntimeError: all tensors must be on devices[0]? (Yes, I do place my input on device 0.)

So, I think this issue was caused by Jupyter Notebook.
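
For anyone who hits the same error outside a notebook, here is a minimal sketch of wrapping the model once and handing the wrapped object to a function that would normally live in another module. It assumes the usual DataParallel requirement that the model's parameters and buffers sit on device_ids[0] before wrapping. The names build_parallel_model and run_external are hypothetical, and the CPU fallback is only there so the snippet also runs on a machine without the requested GPUs.

```python
import torch
import torch.nn as nn


def build_parallel_model(model, device_ids):
    """Wrap `model` in DataParallel once so the result can be passed around."""
    if torch.cuda.is_available() and torch.cuda.device_count() > max(device_ids):
        # DataParallel expects parameters and buffers on device_ids[0].
        model = model.to(f"cuda:{device_ids[0]}")
        return nn.DataParallel(model, device_ids=device_ids)
    # Assumption for illustration: fall back to the plain model on CPU
    # when the requested GPUs are not present.
    return model


def run_external(model, batch):
    """Stand-in for a function defined in another module (hypothetical)."""
    # Move the input to wherever the model's parameters live.
    device = next(model.parameters()).device
    with torch.no_grad():
        return model(batch.to(device))


net = nn.Linear(4, 2)
parallel_net = build_parallel_model(net, device_ids=[0, 2])
out = run_external(parallel_net, torch.randn(8, 4))
print(out.shape)
```

The receiving function never wraps the model again; it only reads the device from the model it was given, so the wrapping code and the calling code can live in different modules.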