I am trying to run a PGGAN on a single GPU, but PyTorch does not seem to use the GPU at all: CPU usage is very high, whereas TensorFlow has no problem using my GPU.
I am using CUDA 10 and PyTorch 1.0, so I don't think there is a version compatibility issue.
When I run torch.cuda.is_available() it returns True, so PyTorch is able to find my GPU.
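For reference, here is the minimal sanity check I run outside of the PGGAN code (just a sketch, independent of my training script) to confirm that a tensor explicitly moved with .cuda() actually lands on the GPU:

import torch

# Sanity check, separate from the PGGAN code: confirm CUDA is visible
# and that an explicitly moved tensor ends up on the GPU.
print(torch.cuda.is_available())        # prints True on my machine
print(torch.cuda.get_device_name(0))    # name of GPU 0

x = torch.randn(4, 4).cuda()            # allocate on CPU, then move to GPU 0
print(x.device)                         # should print cuda:0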
Therefore, I am wondering if there is an issue in the code. Here are the two parts where the problem might be, but I am not able to find it:
In the general settings, I set the number of GPUs to 1, even though the ID of my GPU is 0:
parser.add_argument('--n_gpu', type=int, default=1) # for Multi-GPU training.
And also here:
class trainer:
    def __init__(self, config):
        self.config = config
        if torch.cuda.is_available():
            self.use_cuda = True
            torch.set_default_tensor_type('torch.cuda.FloatTensor')
        else:
            self.use_cuda = False
            torch.set_default_tensor_type('torch.FloatTensor')

        self.nz = config.nz
        self.optimizer = config.optimizer
        self.resl = 2  # we start from 2^2 = 4
        self.lr = config.lr
        self.eps_drift = config.eps_drift
        self.smoothing = config.smoothing
        self.max_resl = config.max_resl
        self.trns_tick = config.trns_tick
        self.stab_tick = config.stab_tick
        self.TICK = config.TICK
        self.globalIter = 0
        self.globalTick = 0
        self.kimgs = 0
        self.stack = 0
        self.epoch = 0
        self.fadein = {'gen': None, 'dis': None}
        self.complete = {'gen': 0, 'dis': 0}
        self.phase = 'init'
        self.flag_flush_gen = False
        self.flag_flush_dis = False
        self.flag_add_noise = self.config.flag_add_noise
        self.flag_add_drift = self.config.flag_add_drift

        # network and criterion
        self.G = net.Generator(config)
        self.D = net.Discriminator(config)
        print('Generator structure: ')
        print(self.G.model)
        print('Discriminator structure: ')
        print(self.D.model)
        self.mse = torch.nn.MSELoss()
        if self.use_cuda:
            self.mse = self.mse.cuda()
            torch.cuda.manual_seed(config.random_seed)
            if config.n_gpu == 1:
                self.G = torch.nn.DataParallel(self.G).cuda(device=0)
                self.D = torch.nn.DataParallel(self.D).cuda(device=0)
            else:
                gpus = []
                for i in range(config.n_gpu):
                    gpus.append(i)
                self.G = torch.nn.DataParallel(self.G, device_ids=gpus).cuda()
                self.D = torch.nn.DataParallel(self.D, device_ids=gpus).cuda()
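In case it helps with diagnosing, here is a quick follow-up check I can run right after constructing the trainer (a sketch; trainer and config are the objects from the code above) to see where the networks' parameters actually live:

# Sketch: verify where the networks' parameters end up after construction.
t = trainer(config)
print(next(t.G.parameters()).device)    # I would expect cuda:0 here
print(next(t.D.parameters()).device)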
One last piece of information: I am doing all of this on Windows.
Thank you very much for your help; I have been trying to figure out what's wrong for weeks without finding the right answer.