Out of memory error when resuming training even though my GPU is empty

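Could you try loading the checkpoint onto the CPU first, using the map_location argument of torch.load? By default, torch.load restores tensors to the device they were saved from, so a checkpoint saved from the GPU takes up GPU memory in addition to your model.
Once the load succeeds, push your model onto the GPU again.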
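
Something along these lines might work as a starting point. It is only a sketch: the tiny nn.Linear model, the SGD optimizer, the "model"/"optimizer" checkpoint keys, and the temporary checkpoint path are all made up for illustration, so substitute your own. It falls back to the CPU when no GPU is available.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-ins for the real model, optimizer, and checkpoint file.
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
ckpt_path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")
torch.save(
    {"model": model.state_dict(), "optimizer": optimizer.state_dict()},
    ckpt_path,
)

# Step 1: load the checkpoint onto the CPU, regardless of where it was saved.
checkpoint = torch.load(ckpt_path, map_location="cpu")

# Step 2: restore the weights while everything is still on the CPU.
resumed = nn.Linear(4, 2)
resumed.load_state_dict(checkpoint["model"])

# Step 3: only now move the model to the GPU (or stay on CPU if there is none).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
resumed.to(device)

# Create the optimizer after the move so it tracks the parameters on `device`;
# load_state_dict casts any saved optimizer state to the parameters' device.
resumed_opt = torch.optim.SGD(resumed.parameters(), lr=0.1)
resumed_opt.load_state_dict(checkpoint["optimizer"])

# Drop the CPU copy of the checkpoint once it is no longer needed.
del checkpoint
```

Deleting the checkpoint dict at the end just frees the host-side copy. The part that matters for the out of memory error is that nothing touches the GPU until the .to(device) call.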
