Hi, I've run into an error that I can't figure out. Thanks in advance for any help!
I'm trying to apply a gradient penalty to the discriminator loss, but I'm stuck on lines 142-144. I need to build the interpolated samples, but any operation I apply to fake_img gives me the error “expected device cpu but got device cuda:0”.
I printed the type and size of real_img, fake_img, and alpha: they are all torch.Tensor with size [16, 3, 128, 128], so I can't tell what's wrong.
The operation with real_img works fine; the error only shows up when fake_img is involved.
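For context, this is the interpolation and gradient-penalty term I'm trying to compute (the usual WGAN-GP penalty), reduced to a standalone sketch with every tensor explicitly on one device. The tiny netD here is only a stand-in so the snippet runs on its own; in my script netD is the real discriminator, and real_img / fake_img come from the data loader and generator.

    import torch
    from torch import autograd, nn

    # Standalone sketch of the gradient-penalty term I'm after; shapes match my data,
    # but netD, real_img and fake_img are placeholders so the snippet runs by itself.
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    batch_size = 16

    real_img = torch.randn(batch_size, 3, 128, 128, device=device)
    fake_img = torch.randn(batch_size, 3, 128, 128, device=device)
    netD = nn.Sequential(nn.Flatten(), nn.Linear(3 * 128 * 128, 1)).to(device)

    # Per-sample mixing coefficient, expanded to the image shape
    alpha = torch.rand(batch_size, 1, device=device)
    alpha = alpha.expand(batch_size, real_img.nelement() // batch_size).contiguous().view(real_img.size())

    # Random point on the line between each real and fake sample
    interpolates = alpha * real_img + (1 - alpha) * fake_img
    interpolates.requires_grad_(True)

    # Gradient of D w.r.t. the interpolated samples, then the penalty itself
    disc_interpolates = netD(interpolates)
    gradients = autograd.grad(outputs=disc_interpolates, inputs=interpolates,
                              grad_outputs=torch.ones_like(disc_interpolates),
                              create_graph=True, retain_graph=True, only_inputs=True)[0]
    gradients = gradients.view(gradients.size(0), -1)
    gradient_penalty = ((gradients.norm(2, dim=1) - 1) ** 2).mean() * 10
    print(gradient_penalty.item())

Here is the actual section of train_rtgan.py (lines 114-162), followed by the console output and traceback: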
114 ############################
115 # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
116 ###########################
117 real_img = Variable(target)
118 target_real_out = Variable(torch.ones(batch_size, 1))
119
120 real_img.to(device)
121 target_real_out.to(device)
122 print(type(real_img), real_img.size())
123
124 z = Variable(data)
125 target_fake_out = Variable(torch.zeros(batch_size, 1))
126
127 z.to(device)
128 target_fake_out.to(device)
129
130 fake_img = netG(z)
131 print(type(Variable(fake_img.data)), Variable(fake_img.data).size())
132
133 netD.zero_grad()
134 real_out = netD(real_img)
135 fake_out = netD(Variable(fake_img.data))
136
137 alpha = torch.rand(batch_size, 1)
138 alpha = alpha.expand(batch_size,
                         real_img.nelement()//batch_size).contiguous().view(real_img.size())
139 alpha.to(device)
140 print(type(alpha), alpha.size())
141
142 interpolates = (1 - alpha) * fake_img
143 # interpolates = alpha * real_img + ((1 - alpha) * fake_img)
144 interpolates.to(device)
145
146 interpolates = Variable(interpolates, requires_grad=True)
147 disc_interpolates = netD(interpolates)
148 gradients = autograd.grad(outputs=disc_interpolates, inputs=interpolates,
149 grad_outputs=torch.ones(disc_interpolates.size()).to(device),
150 create_graph=True, retain_graph=True, only_inputs=True)[0]
151 gradients = gradients.view(gradients.size(0), -1)
152 gradient_penalty = ((gradients.norm(2, dim=1) - 1) ** 2).mean() * 10
153
154 d_loss = -(torch.mean(real_out) - torch.mean(fake_out)) + gradient_penalty
155
156
157
158 if torch.cuda.device_count() > 1:
159 d_loss.mean().backward(retain_graph=True)
160 else:
161 d_loss.backward(retain_graph=True)
162 optimizerD.step()
# generator parameters: 1769728
# discriminator parameters: 14499401
Let's use 4 GPUs!
Reading checkpoint...
Not find any generator model! Training from scratch with mse loss!
RTGAN Training Starts!
0%| | 0/632 [00:00<?, ?it/s]<class 'torch.Tensor'> torch.Size([16, 3, 128, 128])
<class 'torch.Tensor'> torch.Size([16, 3, 128, 128])
<class 'torch.Tensor'> torch.Size([16, 3, 128, 128])
0%| | 0/632 [00:09<?, ?it/s]
Traceback (most recent call last):
File "train_rtgan.py", line 142, in <module>
interpolates = (1 - alpha) * fake_img
RuntimeError: expected device cpu but got device cuda:0
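If it helps, this is the kind of check I can add right before line 142 to see which device each tensor actually lives on (just a debugging sketch, nothing else in the loop changed):

    # Print the device of each tensor involved in the failing line
    print("real_img:", real_img.device)
    print("fake_img:", fake_img.device)
    print("alpha:", alpha.device)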