CUDA out of memory error in `.to()` operation

Hi, I got an out of memory error in my training code.

I was running PyTorch 1.8.0 on an Ubuntu 20.04 machine with CUDA 11.0, using 2 Nvidia RTX 2080 Ti GPUs for training. However, this error occurred before the model was wrapped in DataParallel. The full traceback message is:
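One thing worth checking in a setup like this is whether the GPUs already have memory allocated (e.g. by another process) before the model is moved onto them, since a bare "CUDA error: out of memory" from `.to()` often means the device is simply full. A minimal diagnostic sketch, using standard `torch.cuda` memory queries:

```python
import torch

# Print per-GPU memory usage before calling
# If "allocated" is near "total", another process (or an earlier
# part of this script) is already holding the memory.
if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        total = torch.cuda.get_device_properties(i).total_memory
        allocated = torch.cuda.memory_allocated(i)   # tensors currently allocated
        reserved = torch.cuda.memory_reserved(i)     # cached by the allocator
        print(f"cuda:{i}: {allocated / 2**20:.0f} MiB allocated, "
              f"{reserved / 2**20:.0f} MiB reserved, "
              f"{total / 2**20:.0f} MiB total")
else:
    print("CUDA not available")
```

Running `nvidia-smi` in a shell gives the same picture across all processes, which helps rule out a stale process occupying the cards.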

Traceback (most recent call last):
  File "", line 547, in <module>
  File "", line 143, in main
  File "", line 203, in train
    save_checkpoint(args, net, 1234, 1e6)
  File "", line 542, in save_checkpoint
  File "/data_c/lzq0330/miniconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/modules/", line 673, in to
    return self._apply(convert)
  File "/data_c/lzq0330/miniconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/modules/", line 387, in _apply
  File "/data_c/lzq0330/miniconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/modules/", line 387, in _apply
  File "/data_c/lzq0330/miniconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/modules/", line 409, in _apply
    param_applied = fn(param)
  File "/data_c/lzq0330/miniconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/modules/", line 671, in convert
    return, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA error: out of memory

Does anyone know a possible reason? The context of the code is:

save_checkpoint(args, net, 1234, 1e6)

def save_checkpoint(args, model, epoch, eval_loss):
    device = torch.device("cuda" if args.cuda else "cpu")
    checkpoint_model_dir = args.ckp_dir
    if not os.path.exists(checkpoint_model_dir):
        os.makedirs(checkpoint_model_dir)
    ckpt_model_filename = args.dataset_name + "_" + args.model_title + "_ckpt_epoch_" + str(epoch) + ".pth"
    ckpt_model_path = os.path.join(checkpoint_model_dir, ckpt_model_filename)
    state = {"epoch": epoch, "model": model, "eval_loss": eval_loss}, ckpt_model_path)
    print("Checkpoint saved to {}".format(ckpt_model_path))