Hi!
I have a question: suppose I train a model with multi-GPU DistributedDataParallel (say the ResNet-50 on ImageNet example). Later, for a feature-extraction task, I want to reuse its weights on a single GPU, without DataParallel or DistributedDataParallel: either keeping the weights frozen as-is, or fine-tuning just the last few layers. Is that possible? If so, will a plain torch.load() work, or will there be an issue?
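To make it concrete, here is roughly what I am planning. I am assuming the checkpoint was saved from the DDP-wrapped model, so the state_dict keys carry a "module." prefix that a plain model would not expect; the prefix handling itself is just dict manipulation (placeholder strings stand in for real tensors here):

```python
# Keys saved from a DDP-wrapped model look like "module.conv1.weight",
# while a plain (unwrapped) nn.Module expects "conv1.weight".
ddp_state = {
    "module.conv1.weight": "w1",  # placeholder values; real ones are tensors
    "module.fc.bias": "b1",
}

def strip_module_prefix(state_dict, prefix="module."):
    """Remove the DDP 'module.' prefix so a plain model can load the dict."""
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }

clean_state = strip_module_prefix(ddp_state)
print(sorted(clean_state))  # ['conv1.weight', 'fc.bias']
```

With a real checkpoint I would combine this with something like `state = torch.load(path, map_location="cuda:0")` before calling `model.load_state_dict(...)`, or alternatively save `model.module.state_dict()` during training so no stripping is needed at all. Is that the right approach?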
TIA