Thanks for your reply.
Yes, it is saved as a state_dict, and
weights_dict = torch.load(weightspath)
simply loads it as a dictionary.
It seems to me that if the model was trained on GPU #3 on one machine, the saved model cannot be loaded on a different machine that has only 2 GPUs.
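
One thing that might be relevant here: torch.load accepts a map_location argument that remaps the saved tensors to a different device while loading. Below is a minimal sketch of how that could look. The nn.Linear model and the temporary weightspath are hypothetical stand-ins so the snippet runs on its own; they are not the actual model or file from this thread, and I have not verified this against the original checkpoint.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in for the real model; any nn.Module is saved the same way.
model = nn.Linear(4, 2)

# Hypothetical path, used only so this sketch is self-contained.
weightspath = os.path.join(tempfile.mkdtemp(), "weights.pth")
torch.save(model.state_dict(), weightspath)

# map_location="cpu" places every tensor on the CPU while loading,
# regardless of the device index (e.g. cuda:3) it was saved from.
weights_dict = torch.load(weightspath, map_location="cpu")

# Load the weights into a fresh model of the same architecture.
restored = nn.Linear(4, 2)
restored.load_state_dict(weights_dict)

# Afterwards the model can be moved to a GPU that exists on this machine.
if torch.cuda.is_available():
    restored = restored.to("cuda:0")
```

Passing a dict such as map_location={"cuda:3": "cuda:0"} should also work if the goal is to remap one GPU index to another directly instead of going through the CPU.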