Hey! New to PyTorch, coming from Keras. Really loving it!
I don't know if the title of this post is right, as I wasn't really sure how to phrase my question.
I am training a GAN in PyTorch on the LSUN bedroom dataset. I've noticed that when I train my discriminator on data from the generator, I need to copy the generator's output into a new tensor, otherwise my memory use grows with every iteration. Why is this? I think there's something basic I have yet to understand, so it's best to ask instead of just accepting that that's how you train multiple networks with each other's outputs as inputs.
Here is the part of the code I am curious about:
fake = netG(get_noise(64))
# The line below is the one I am wondering about
fake = torch.Tensor(fake.size()).cuda().copy_(fake.data)
fake = Variable(fake)
prediction = netD(fake)
Without copying the generator's output into a new tensor before using it as input to the discriminator, memory use increases every iteration until an out-of-memory exception is thrown.
I'm running both networks and all variables on CUDA.
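To make the question concrete, here is a CPU-only toy version of the two patterns, using tiny stand-in linear layers instead of my real networks (the names, sizes, and `get_noise` helper are made up for this repro; it uses current PyTorch tensors rather than `Variable`):

```python
import torch
import torch.nn as nn

# Tiny stand-ins for the real networks (shapes are arbitrary)
netG = nn.Linear(8, 4)   # plays the generator
netD = nn.Linear(4, 1)   # plays the discriminator

def get_noise(n):
    return torch.randn(n, 8)

# Pattern 1: feed the generator output straight into the discriminator.
# The discriminator's backward pass flows all the way into netG, so the
# entire generator graph is kept alive for this step.
fake = netG(get_noise(64))
netD(fake).mean().backward()
print(netG.weight.grad is None)   # False: netG received gradients

netG.zero_grad(set_to_none=True)

# Pattern 2: copy the output into a fresh tensor first (what my code does).
# The copy has no autograd history, so backward stops at the discriminator.
fake = netG(get_noise(64))
fake_copy = torch.empty(fake.size()).copy_(fake.detach())
netD(fake_copy).mean().backward()
print(netG.weight.grad is None)   # True: no gradients reached netG
```

So the copy seems to cut the autograd connection back to the generator, which is presumably related to the memory growth I'm seeing, but I'd like to understand why.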
Thanks in advance!