Can't do parallel computing

I want to load a ResNet model and change its last layer to match my number of classes. But when I wrap the modified model in DataParallel, it gives this error:

RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: CPU

This is the code:

    if args.resume:
        if os.path.isfile(args.resume):
            print("=> loading checkpoint '{}'".format(args.resume))
            checkpoint = torch.load(args.resume)
            args.start_epoch = checkpoint['epoch']
            best_prec1 = checkpoint['best_prec1']
            if 'optimizer' in checkpoint:
                optimizer.load_state_dict(checkpoint['optimizer'])
            print("=> loaded checkpoint '{}' (epoch {})"
                  .format(args.resume, checkpoint['epoch']))
        else:
            print("=> no checkpoint found at '{}'".format(args.resume))

    cudnn.benchmark = True
    #model = model.cuda()
    model = torch.nn.DataParallel(model, device_ids=list(range(args.ngpu)))

Sometimes it also raises a different error:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument weight in method wrapper_CUDA__cudnn_convolution)

Can anyone help me? Thank you so much.

I don't see any obvious issues in the posted code snippet, so could you create a minimal, executable example that reproduces the issue?

I added torch.cuda.set_device('cuda:0') and this works for me.
Thank you!
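For anyone hitting the same error: the key point is that the model's parameters must already live on device_ids[0] before DataParallel wraps it. Below is a minimal sketch of that ordering, using a tiny stand-in module instead of ResNet and a hypothetical num_classes; it falls back to CPU when no GPU is available (DataParallel simply runs the inner module directly in that case).

```python
import torch
import torch.nn as nn

num_classes = 10  # hypothetical class count, stands in for the real dataset
ngpu = torch.cuda.device_count()

# Tiny stand-in for the modified ResNet with its final layer replaced.
model = nn.Sequential(nn.Flatten(), nn.Linear(32, num_classes))

if torch.cuda.is_available():
    # Make cuda:0 the default device, matching device_ids[0].
    torch.cuda.set_device(0)
    # Move parameters to cuda:0 BEFORE wrapping in DataParallel;
    # leaving them on the CPU triggers the RuntimeError from the question.
    model = model.cuda()
    model = nn.DataParallel(model, device_ids=list(range(ngpu)))

x = torch.randn(4, 32)
if torch.cuda.is_available():
    x = x.cuda()

out = model(x)
print(out.shape)  # torch.Size([4, 10])
```

The same ordering applies when resuming from a checkpoint: load the state dict into the plain model first, then move it to the GPU, then wrap it in DataParallel.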