Can anyone kindly take a look at my Jupyter notebook and let me know which part is erroneous? Thanks so much… Here is the model I spent a whole day working on: https://github.com/tlkahn/my-notebooks/blob/master/GAN-pytorch.ipynb
Just by skimming through your code, it seems you are freezing the discriminator:
```
class GAN(nn.Module):
    """GAN model"""
    def __init__(self, generator, discriminator):
        super().__init__()
        self.generator = generator
        self.discriminator = discriminator
        for param in self.discriminator.parameters():
            param.requires_grad = False

    def forward(self, x):
        gen_img = self.generator(x)
        return self.discriminator(gen_img)
```
which means it will never be trained. Is this on purpose, or were you planning to unfreeze the parameters at some point (e.g. by toggling requires_grad during the alternating updates, as in the sketch below)?
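For illustration only, here is a minimal sketch of that pattern. It assumes generator, discriminator, their optimizers, real_images and z already exist, and it uses placeholder real_loss/fake_loss helpers; it is not taken from the notebook:

```
def set_requires_grad(model, flag):
    """Enable/disable gradients for every parameter of a model."""
    for param in model.parameters():
        param.requires_grad = flag

# (1) discriminator update: D must be trainable; the fakes are detached,
#     so the generator is not touched by this step
set_requires_grad(discriminator, True)
d_optimizer.zero_grad()
d_loss = real_loss(discriminator(real_images)) + fake_loss(discriminator(generator(z).detach()))
d_loss.backward()
d_optimizer.step()

# (2) generator update: freeze D so only G's parameters receive updates;
#     gradients still flow through D back to G's output
set_requires_grad(discriminator, False)
g_optimizer.zero_grad()
g_loss = real_loss(discriminator(generator(z)))  # flipped labels
g_loss.backward()
g_optimizer.step()
```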
Here are two GANs in PyTorch that are pretty simple and easy to follow, in case they help you.
Here is how to set up the two models, Gen (generator) and Disc (discriminator), and train them:
```
# Set models for training
Disc.train()
Gen.train()

for epoch in range(num_epochs):
    # Each batch
    for batch_i, (real_images, _) in enumerate(train_loader):
        batch_size = real_images.size(0)

        ## Important rescaling step ##
        # rescale input images from [0, 1) to [-1, 1)
        real_images = real_images * 2 - 1

        # ---------------------------
        #  Discriminator training
        # ---------------------------
        d_optimizer.zero_grad()

        # Train with real images
        D_real = Disc(real_images)
        d_real_loss = real_loss(D_real, smooth=True)  # use label smoothing

        # Next, train with fake images
        # Generate fake images
        z = np.random.uniform(-1, 1, size=(batch_size, z_size))  # random noise
        z = torch.from_numpy(z).float()  # convert to a float tensor
        fake_images = Gen(z)  # forward through the generator (do NOT train the generator here; train one model at a time)

        # Compute fake loss
        D_fake = Disc(fake_images)
        d_fake_loss = fake_loss(D_fake)

        # add up losses and backprop
        d_loss = d_real_loss + d_fake_loss
        d_loss.backward()
        d_optimizer.step()

        # ---------------------------
        #  Generator training
        # ---------------------------
        g_optimizer.zero_grad()

        # Generate fake images and train
        z = np.random.uniform(-1, 1, size=(batch_size, z_size))
        z = torch.from_numpy(z).float()
        fake_images = Gen(z)

        # Compute the discriminator loss on fake images
        # using flipped labels!
        D_fake = Disc(fake_images)
        g_loss = real_loss(D_fake)  # use real_loss to flip the labels

        # perform backprop
        g_loss.backward()
        g_optimizer.step()

        # Print some loss stats
        if batch_i % print_every == 0:
            # print discriminator and generator loss
            print('Epoch [{:5d}/{:5d}] | d_loss: {:6.4f} | g_loss: {:6.4f}'.format(
                epoch + 1, num_epochs, d_loss.item(), g_loss.item()))

    ## AFTER EACH EPOCH ##
    # append discriminator loss and generator loss
    losses.append((d_loss.item(), g_loss.item()))

    # generate and save sample fake images
    Gen.eval()  # eval mode for generating samples
    samples_z = Gen(fixed_z)
    samples.append(samples_z)
    Gen.train()  # back to train mode
```
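The snippet above calls real_loss and fake_loss helpers that are not shown. A minimal sketch of what they typically look like, assuming Disc outputs raw logits (hence nn.BCEWithLogitsLoss) and using 0.9 as the smoothed real label:

```
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()  # expects raw logits from the discriminator

def real_loss(D_out, smooth=False):
    """Loss for samples the discriminator should classify as real."""
    batch_size = D_out.size(0)
    # label smoothing: use 0.9 instead of 1.0 as the "real" target
    labels = torch.ones(batch_size) * (0.9 if smooth else 1.0)
    return bce(D_out.squeeze(), labels)

def fake_loss(D_out):
    """Loss for samples the discriminator should classify as fake."""
    batch_size = D_out.size(0)
    labels = torch.zeros(batch_size)
    return bce(D_out.squeeze(), labels)
```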
Thanks a lot. I pulled those lines out of the GAN model and moved them to a later training phase, and it seems to be working:
```
for param in self.discriminator.parameters():
    param.requires_grad = False
```
Thanks. Very useful references.
Hi everyone, I am getting RuntimeError: element 11 of tensors does not require grad and does not have a grad_fn when running a GAN architecture to model a Gaussian distribution, and I am using a WGAN loss as well.
How can I fix this error so that I can get the required output from my model?
Double post from here.
Hi Ptrblck,
I hope you are well. Sorry, I need to run a 3D GAN. My inputs are grayscale patches in 3D, and I want to create 3D patches as well.
Can I use any 2D GAN and just convert the 2D layers to 3D?
Could you please suggest a PyTorch link for a DCGAN in 3D?
Many thanks
I don’t know of any recent 3D DCGAN implementations, but you could try the approach of Voxel DCGAN or 3DGAN, which are both a bit older by now.
If you are working with static volumetric shapes, you could use the depth dimension as the channel dimension in a standard 2D GAN, although I don’t know how well this would work.
I think your suggestion makes sense, and you could try to replace all nn.*2d layers with their 3D equivalents, as sketched below.
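For illustration only, a minimal sketch of that kind of replacement (the layer choices and channel sizes here are made up, not taken from the thread):

```
import torch
import torch.nn as nn

# 2D block operating on (N, C, H, W) inputs
block_2d = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(inplace=True),
)

# 3D equivalent operating on (N, C, D, H, W) volumes:
# every *2d layer is swapped for its *3d counterpart
block_3d = nn.Sequential(
    nn.Conv3d(1, 16, kernel_size=3, padding=1),
    nn.BatchNorm3d(16),
    nn.ReLU(inplace=True),
)

x2d = torch.randn(2, 1, 64, 64)      # batch of 2D patches
x3d = torch.randn(2, 1, 16, 64, 64)  # batch of 3D patches
print(block_2d(x2d).shape)  # torch.Size([2, 16, 64, 64])
print(block_3d(x3d).shape)  # torch.Size([2, 16, 16, 64, 64])
```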
Many thanks. I will tell you the results.
Hi Ptrblck,
Sorry, I am using a 2D-DCGAN generator and converting it to 3D. The error is: Given input size per channel: (1 x 1 x 1). Calculated output size per channel: (5 x 5 x -157). Output size is too small.
Instead of -157, I expected to see (5x5x33), (13x13x21), (21x21x11). Would you please help me find out why this happens? I start from a 101-channel 1x1x1 input and my target output is (21x21x11).
```
class Generator(nn.Module):
    def __init__(self, nz):
        super(Generator, self).__init__()
        self.nz = nz
        self.main = nn.Sequential(
            nn.ConvTranspose3d(101, 33, kernel_size=(5, 5, 5), stride=(2, 2, 2), padding=(0, 0, 86), bias=False),
            nn.BatchNorm3d(33),
            nn.ReLU(True),
            nn.ConvTranspose3d(33, 21, kernel_size=(5, 5, 5), stride=(2, 2, 2), padding=(0, 0, 24), bias=False),
            nn.BatchNorm3d(21),
            nn.ReLU(True),
            nn.ConvTranspose3d(21, 11, kernel_size=(5, 5, 5), stride=(2, 2, 2), padding=(4, 4, 17), bias=False),
            nn.Tanh())

    def forward(self, input):
        return self.main(input)
# call the generator with nz = 101
netG = Generator(101).to(device)
if (device.type == 'cuda') and (ngpu > 1):
    netG = nn.DataParallel(netG, list(range(ngpu)))
netG.apply(weights_init)

noise = torch.randn(b_size, nz, 1, 1, 1, device=device)
fake = netG(noise)  # call the model instance, not the Generator class
```
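Not part of the thread, but one way to sanity-check such shape errors: for dilation 1, the nn.ConvTranspose3d output size along each dimension is (in - 1) * stride - 2 * padding + kernel_size + output_padding, so a large padding applied to a size-1 input drives the result negative. A small sketch using the kernel and stride of the posted first layer:

```
# nn.ConvTranspose3d output size along one dimension (dilation = 1)
def convtranspose_out(size, kernel=5, stride=2, padding=0, output_padding=0):
    return (size - 1) * stride - 2 * padding + kernel + output_padding

# first posted layer, per-channel input size 1 x 1 x 1
print(convtranspose_out(1, padding=0))   # 5 -> matches the "5 x 5" part of the error
print(convtranspose_out(1, padding=86))  # negative -> "Output size is too small"
```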
Hi Ptrblck,
I want to use zero-mean, unit-variance normalization for the discriminator inputs (not rescaling to [-1, 1]). In that case I should remove the Tanh from the generator. Which activation function do you recommend instead of Tanh in the generator's last layer, or can I run without any activation function at the end?
Hi, if your output spans between 0 and 1, you could use nn.Sigmoid().
Without an activation, your output will be an arbitrary number, not constrained to [-1, 1] or [0, 1].
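A minimal sketch of the three options discussed, shown only for the generator's final layer (the layer shapes here are made up):

```
import torch
import torch.nn as nn

x = torch.randn(4, 64, 11, 11)  # made-up feature map from the previous layer
conv = nn.ConvTranspose2d(64, 1, kernel_size=3, stride=2, padding=1, bias=False)

out_tanh    = torch.tanh(conv(x))     # bounded to (-1, 1): pair with [-1, 1]-scaled inputs
out_sigmoid = torch.sigmoid(conv(x))  # bounded to (0, 1): pair with [0, 1]-scaled inputs
out_linear  = conv(x)                 # unbounded: e.g. for zero-mean / unit-variance targets
```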
Hi Ptrblck,
I implemented my conditional GAN. The batch size is 64, the input patch size is 21x21, and the condition I pass is one of 168 different volume numbers, which can be from 1 to 100, for example: tensor([ 3, 36, 19, 12, 16, 6, 7, 2, 45, 12, 65, 44, 17, 8, 15, 15, 14, 47, 20, 9, 16, 25, 56, 11, 22, 8, 5, 3, 7, 6, 25, 10, 36, 1, 17, 2, 22, 3, 10, 13, 9, 14, 15, 11, 20, 16, 3, 10, 4, 18, 1, 15, 9, 6, 16, 55, 1, 14, 6, 17, 6, 6, 10, 7]). My settings are:
- img_size = 21
- N_Class = 168 (168 different volumes as conditions)
- lr1 = 0.0002
- lr2 = 0.0002
- batch_size = 64
- optimizer: Adam with default settings
- ngf = 64
- criterion = nn.BCELoss()
- real_label = 1
- fake_label = 0
I applied the code below. It runs without any error, but the fake images are not meaningful, they are just noise.
Would you please help me with that? The code is:
```
# custom weights initialization called on netG and netD
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0)

# Generator Code
class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.l1 = nn.Sequential(nn.ConvTranspose2d(100, ngf * 4, 3, 1, 0, bias=False),
                                nn.BatchNorm2d(ngf * 4), nn.ReLU(True))
        self.l2 = nn.Sequential(nn.ConvTranspose2d(168, ngf * 4, 3, 1, 0, bias=False),
                                nn.BatchNorm2d(ngf * 4), nn.ReLU(True))
        self.l3 = nn.Sequential(nn.ConvTranspose2d(ngf * 8, ngf * 4, 3, 1, 0, bias=False),
                                nn.BatchNorm2d(ngf * 4),
                                nn.ReLU(True))
        self.l4 = nn.Sequential(nn.ConvTranspose2d(ngf * 4, ngf * 2, 3, 1, 0, bias=False),
                                nn.BatchNorm2d(ngf * 2),
                                nn.ReLU(True))
        self.l5 = nn.Sequential(nn.ConvTranspose2d(ngf * 2, ngf, 3, 2, 1, bias=False),
                                nn.BatchNorm2d(ngf),
                                nn.ReLU(True))
        self.l6 = nn.Sequential(nn.ConvTranspose2d(ngf, 1, 3, 2, 3, bias=False), nn.Sigmoid())

    def forward(self, input, Volume):
        x = self.l1(input)
        y = self.l2(Volume)
        xx = torch.cat([x, y], 1)
        output = self.l3(xx)
        output = self.l4(output)
        output = self.l5(output)
        output = self.l6(output)
        return output
# Create the generator
netG = Generator().to(device)
# initialize all conv weights to mean=0, std=0.02
netG.apply(weights_init)
# print(netG)

fixed_noise = torch.randn(64, nz, 1, 1, device=device)
fixed_noise_Se = torch.randn(4500, nz, 1, 1, device=device)
# ----------- Discriminator ------------------
class Discriminator(nn.Module):
    def __init__(self, ngpu):
        super(Discriminator, self).__init__()
        self.ngpu = ngpu
        self.l1 = nn.Sequential(nn.Conv2d(1, int(ndf / 2), 4, 2, 1, bias=False),
                                nn.LeakyReLU(0.2, inplace=True))
        self.l2 = nn.Sequential(nn.Conv2d(168, int(ndf / 2), 4, 2, 1, bias=False),
                                nn.LeakyReLU(0.2, inplace=True))
        self.l3 = nn.Sequential(nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
                                nn.BatchNorm2d(ndf * 2),
                                nn.LeakyReLU(0.2, inplace=True))
        self.drop_out3 = nn.Dropout(0.5)
        self.l4 = nn.Sequential(nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
                                nn.BatchNorm2d(ndf * 4),
                                nn.LeakyReLU(0.2, inplace=True))
        self.drop_out4 = nn.Dropout(0.5)
        self.l5 = nn.Sequential(nn.Conv2d(ndf * 4, 1, 4, 2, 1, bias=False),
                                nn.Sigmoid())

    def forward(self, input, Volume):
        x = self.l1(input)
        y = self.l2(Volume)
        out = torch.cat([x, y], 1)
        out = self.l3(out)
        out = self.drop_out3(out)
        out = self.l4(out)
        out = self.drop_out4(out)
        out = self.l5(out)
        return out
# Create the Discriminator
netD = Discriminator(ngpu).to(device)
# Apply the weights_init function to randomly initialize all weights
netD.apply(weights_init)

# Setup Adam optimizers for both G and D
optimizerD = optim.Adam(netD.parameters(), lr=lr1, betas=(beta1, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=lr2, betas=(beta1, 0.999))
# Training Loop
print("Starting Training Loop...")
# For each epoch
for epoch in range(num_epochs):
    # For each batch in the dataloader
    for pos, neg in zip(trainloader, trainloaderNeg):
        images1, labels, Volumes = pos
        images1 = images1.float()
        Volumes = Volumes.long()
        Negpach = neg
        Negpach = Negpach.float()

        ############################
        # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
        ############################
        ## Train with all-real batch
        netD.zero_grad()
        # ---------------- format batch inputs ----------------
        real_cpu = images1.to(device)
        # ---------------- add volumes as condition ----------------
        Real_volume = Volumes.to(device).long().squeeze(1)
        Real_volume = Real_volume.type(torch.LongTensor)
        # ---------------- labels for the discriminator ----------------
        b_size = real_cpu.size(0)
        label = torch.full((b_size,), real_label, device=device)
        label = label.to(device)
        # ---------------- pass the Volumes (condition) to the discriminator ----------------
        real_y = torch.zeros(batch_size, N_Class)
        real_y = real_y.scatter_(1, Real_volume.view(batch_size, 1), 1).view(batch_size, N_Class, 1, 1).contiguous()
        real_y = Variable(real_y.expand(-1, -1, img_size, img_size)).to(device)
        netD = netD.float()
        # ---------------- apply D on real images and conditions ----------------
        output = netD(real_cpu, real_y).view(-1)
        # Calculate loss on all-real batch
        errD_real = criterion(output, label)
        # Calculate gradients for D in backward pass
        errD_real.backward()

        ## Train with all-fake batch
        # Generate batch of latent vectors
        noise = torch.randn(b_size, nz, 1, 1, device=device)
        # Generate fake image batch with G and pass the condition
        netG = netG.float()
        gen_labels = (torch.rand(batch_size, 1) * N_Class).type(torch.LongTensor)
        gen_y = torch.zeros(batch_size, N_Class)
        gen_y = Variable(gen_y.scatter_(1, gen_labels.view(batch_size, 1), 1).view(batch_size, N_Class, 1, 1)).to(device)
        fake = netG(noise, gen_y)
        # multiply with negative patch
        fake44 = torch.mul(fake, Negpach)
        # labels for D with fake input
        label.fill_(fake_label)
        label = label.to(device)
        # pass the condition to D
        gen_y_for_D = gen_y.view(batch_size, N_Class, 1, 1).contiguous().expand(-1, -1, img_size, img_size)
        output = netD(fake44.detach(), gen_y_for_D).view(-1)
        # Calculate D's loss on the all-fake batch
        errD_fake = criterion(output, label)
        # Calculate the gradients for this batch
        errD_fake.backward()
        errD = errD_real + errD_fake
        # Update D
        optimizerD.step()

        ############################
        # (2) Update G network: maximize log(D(G(z)))
        ############################
        netG.zero_grad()
        # labels for updating G
        label.fill_(real_label)  # fake labels are real for generator cost
        # label = torch.mul(label, 0.9)
        label = label.to(device)
        # apply D on G's output with real labels
        output = netD(fake44.detach(), gen_y_for_D).view(-1)
        # Calculate G's loss based on this output
        errG = criterion(output, label)
        # Calculate gradients for G
        errG.backward()
        # Update G
        optimizerG.step()

    if epoch % 1 == 0:
        with torch.no_grad():
            fake = netG(fixed_noise, gen_y).detach().cpu()
        plt.close("all")
        plt.figure()
        plt.figure(figsize=(8, 8))
        plt.axis("off")
        plt.title("Fake Images epoch " + str(epoch))
        plt.imshow(np.transpose(vutils.make_grid(fake.detach().to(device)[:64], padding=2, normalize=True, range=(0.2, 1)).cpu(), (1, 2, 0)))
        plt.savefig(os.path.join(root_dirDurringTraining13 + '/' + 'Epoch=' + str(epoch) + 'Seed=' + str(manualSeed)) + 'fakesor2.jpg')

        fake55 = torch.mul(fake, Negpach)
        plt.close("all")
        plt.figure()
        plt.figure(figsize=(8, 8))
        plt.axis("off")
        plt.title("Fake Images epoch " + str(epoch))
        plt.imshow(np.transpose(vutils.make_grid(fake55.detach().to(device)[:64], padding=2, normalize=True, range=(0, .5)).cpu(), (1, 2, 0)))
        plt.savefig(os.path.join(root_dirDurringTraining13 + '/' + 'Epoch=' + str(epoch)) + 'fakesmul55.jpg')

    torch.save(netG.state_dict(), '%s/netG_epoch_%d.pth' % (root_dirDurringTraining15, epoch))
    torch.save(netD.state_dict(), '%s/netD_epoch_%d.pth' % (root_dirDurringTraining15, epoch))
```
Unfortunately, I cannot be of much use here, as I’m not a GAN expert.
Generally, I would recommend taking a look at similar conditional GAN architectures and at the tricks that were used to make their training converge.
Hi Ptrblck,
I have a question about DCGAN. Would you please tell me what exactly happens in errD = errD_real + errD_fake and what its relation to the optimizer is? The gradients were already computed twice in the previous lines, so how does optimizerD know to use errD for the update?
```
netD.zero_grad()
# Format batch
real_cpu = data[0].to(device)
b_size = real_cpu.size(0)
label = torch.full((b_size,), real_label, dtype=torch.float, device=device)
# Forward pass real batch through D
output = netD(real_cpu).view(-1)
# Calculate loss on all-real batch
errD_real = criterion(output, label)
# Calculate gradients for D in backward pass
errD_real.backward()
D_x = output.mean().item()

## Train with all-fake batch
# Generate batch of latent vectors
noise = torch.randn(b_size, nz, 1, 1, device=device)
# Generate fake image batch with G
fake = netG(noise)
label.fill_(fake_label)
# Classify all fake batch with D
output = netD(fake.detach()).view(-1)
# Calculate D's loss on the all-fake batch
errD_fake = criterion(output, label)
# Calculate the gradients for this batch
errD_fake.backward()
D_G_z1 = output.mean().item()
# Add the gradients from the all-real and all-fake batches
errD = errD_real + errD_fake
# Update D
optimizerD.step()
```
The gradients are calculated by errD_real.backward() and errD_fake.backward(), not by errD, since .backward() is never called on it in the code snippet. The two backward() calls accumulate their gradients in the parameters' .grad attributes, and optimizerD.step() then updates the parameters using those accumulated gradients. I guess errD is just used to print the sum of the real and fake loss for the discriminator.
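A minimal standalone sketch (with a made-up linear "discriminator") illustrating that the two backward() calls accumulate gradients and that the sum is only computed for logging:

```
import torch
import torch.nn as nn

netD = nn.Linear(4, 1)
optimizerD = torch.optim.SGD(netD.parameters(), lr=0.1)
criterion = nn.BCEWithLogitsLoss()

real, fake = torch.randn(8, 4), torch.randn(8, 4)
ones, zeros = torch.ones(8, 1), torch.zeros(8, 1)

optimizerD.zero_grad()
errD_real = criterion(netD(real), ones)
errD_real.backward()                      # writes gradients into .grad
grad_after_real = netD.weight.grad.clone()

errD_fake = criterion(netD(fake), zeros)
errD_fake.backward()                      # adds to the existing .grad

errD = errD_real + errD_fake              # only used for logging/printing
print(torch.allclose(netD.weight.grad, grad_after_real))  # False: gradients accumulated
optimizerD.step()                         # uses the accumulated .grad
```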