I saved a model's state_dict on a machine with 4 GPUs, using the device with id 3. Later, I tried to load the model on a machine with only 2 GPUs, so device id 3 does not exist there. This throws an error:
cuda runtime error (10) : invalid device ordinal at torch/csrc/cuda/Module.cpp:80
Is this a bug? I believe we should be able to save a model on one machine and load it on another. In our lab we often share machines and use whichever one is available at the time!
# Load tensors onto the same devices they were saved from
# (fails if a saved device is unavailable on this machine)
torch.load('tensors.pt')
# Load all tensors onto the CPU
torch.load('tensors.pt', map_location=lambda storage, loc: storage)
# Map tensors saved on GPU 1 to GPU 0
torch.load('tensors.pt', map_location={'cuda:1': 'cuda:0'})
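Putting the pieces together for the original scenario, here is a minimal sketch of the portable workflow: save only the state_dict, then on any machine load with `map_location='cpu'` and move the model to whatever device is actually present. The `nn.Linear` model and the file name `model.pt` are just placeholders for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical small model standing in for the real one.
model = nn.Linear(4, 2)

# Save only the parameters, not the whole model object.
torch.save(model.state_dict(), 'model.pt')

# On any machine: map every saved tensor onto the CPU first,
# no matter which GPU it was originally saved from...
state = torch.load('model.pt', map_location='cpu')

# ...then move the model to whichever device is available here.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
restored = nn.Linear(4, 2)
restored.load_state_dict(state)
restored.to(device)
```

Because the checkpoint is mapped to CPU before being pushed to a device, the same file loads cleanly on a 2-GPU machine, a 4-GPU machine, or a CPU-only box.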