Store model parameters on gpus

Is there a library that helps store a model's parameters in the GPU's global memory (or in host memory) instead of on disk?
I have read about torch.save, but it seems to save the model parameters to disk.

Best
Max

The state_dict you would store to disk via torch.save is already in host or GPU RAM (it has to live somewhere before you can serialize it), so could you explain your use case a bit more, please?
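To illustrate this point: torch.save accepts any file-like object, so a state_dict can be serialized into a host-RAM buffer without ever touching the disk. A minimal sketch, using a small nn.Linear as a stand-in for the actual model:

```python
import io

import torch
import torch.nn as nn

# A tiny stand-in model; substitute your own network here.
model = nn.Linear(4, 2)

# torch.save writes to any file-like object, so the checkpoint can
# live in host RAM (an io.BytesIO buffer) instead of on disk.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)

# Rewind the buffer and load the parameters back from memory.
buffer.seek(0)
state_dict = torch.load(buffer)

# A freshly constructed model of the same architecture can be
# initialized from the in-memory checkpoint.
model2 = nn.Linear(4, 2)
model2.load_state_dict(state_dict)
```

Note that the buffer only lives as long as the Python process; it is not a substitute for persistent storage.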

Hi ptrblck,
Sorry for the unclear explanation; I hope I can make it clear this time.

My aim:
After finishing the training of a model, for example a Resnet512, could I save the updated parameters somewhere in global memory?

  1. If we can save the parameters somewhere in global memory, can we load them to set up a new Resnet512?
  2. If we can save the parameters somewhere in global memory, is there any way to keep them ‘alive’, even after the training has finished and the training process has stopped?

Best
Max

  1. The parameters are already stored in the global memory of your GPU, assuming you are training the model on the GPU. If you want to create a copy, you could use copy.deepcopy to do so.

  2. No, you cannot use the GPU’s global memory for serialization, and the memory will be released after your Python process exits.
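Point 1 can be sketched as follows. This is a minimal example using a small nn.Linear as a stand-in for the actual model; copy.deepcopy clones each parameter tensor on its current device, so a GPU model's copied state_dict also stays in GPU global memory:

```python
import copy

import torch
import torch.nn as nn

# Use the GPU when one is available; the same code works on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in for the trained model.
model = nn.Linear(4, 2).to(device)

# deepcopy clones every parameter tensor on its current device, so
# for a CUDA model the copy also lives in GPU global memory.
state_copy = copy.deepcopy(model.state_dict())

# ... further training would mutate model's parameters here ...

# A new model of the same architecture can be set up directly from
# the in-memory copy, without any round trip through the disk.
fresh = nn.Linear(4, 2).to(device)
fresh.load_state_dict(state_copy)
```

As point 2 says, these tensors only exist for the lifetime of the process; persisting them beyond that requires serializing to disk (or some other external store).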

Thank you for the explanation. That helps a lot. :slight_smile: