I’m working on a server with multiple GPUs (cuda:0 through cuda:3). I’ve allocated all my model parameters to cuda:2, and the default CUDA device is set to cuda:2 as well. However, as soon as I iterate over (or copy) the parameters, something starts using up memory on cuda:0.
Am I doing something wrong? I’ve tried the following:
--------Example Code----------
torch.cuda.set_device(2)  # This line doesn't seem to make any difference.
model = MyModel().cuda(2)  # I've checked that all parameters and named_parameters are on cuda:2.

for param in model.parameters():  # a) For some reason this takes up lots of memory on cuda:0.
    pass

for name, param in model.named_parameters():  # b) This eats up memory on cuda:0 as well.
    pass

optimizer = torch.optim.Adam(model.parameters())
for g in optimizer.param_groups:  # c) So does this.
    pass

list(model.parameters())  # d) So does this.

with torch.cuda.device(2):  # e) This still takes up some memory on cuda:0,
    for g in optimizer.param_groups:  # but less -- around 200 MB in my case.
        pass
PyTorch version: 0.4.1.post2
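One workaround I've seen suggested (not specific to my code above) is to hide the other GPUs from the process entirely via the CUDA_VISIBLE_DEVICES environment variable, so that nothing can accidentally initialize a context on cuda:0. A minimal sketch, assuming it is set before torch is imported:

```python
import os

# Restrict this process to physical GPU 2 *before* torch is imported.
# Inside the process, that GPU will then be renumbered as cuda:0, so any
# stray allocation on the "default" device still lands on the intended GPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "2"

# import torch  # must happen only after the variable is set;
# model = MyModel().cuda()  # now targets physical GPU 2

print(os.environ["CUDA_VISIBLE_DEVICES"])
```

The same effect can be had from the shell, e.g. `CUDA_VISIBLE_DEVICES=2 python my_script.py` (script name illustrative only).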