I’m trying to jointly train a convolutional network as both an autoencoder (AE) and a GAN, but I’m not sure that I have the training routine set up correctly. I would greatly appreciate any help.
    for epoch in range(n_iter):
        for i, (batch, _) in enumerate(dataloader):
            current_batch_size = batch.size(0)

            #---------------
            #Train as AE
            #---------------
            optim_encoder.zero_grad()
            optim_decoder.zero_grad()

            input = Variable(batch).cuda()
            encoded = _Encoder(input)
            encoded = encoded.unsqueeze(0)
            encoded = encoded.view(input.size(0), -1, 1, 1)
            reconstructed = _Decoder(encoded)

            reconstruction_loss = criterion(reconstructed, input)
            reconstruction_loss.backward()
            optim_encoder.step()
            optim_decoder.step()  # here it's SGD

            #---------------
            #Train as GAN
            #---------------
            #Train Discriminator on real
            _Discriminator.zero_grad()
            real_samples = input.clone()
            inference_real = _Discriminator(real_samples)
            labels = torch.FloatTensor(current_batch_size).fill_(real_label)
            labels = Variable(labels).cuda()
            real_loss = criterion(inference_real, labels)
            real_loss.backward()

            #Generate fake samples
            noise.data.resize_(current_batch_size, z_d, 1, 1)
            noise.data.uniform_(-10, 10)
            fake_samples = _Decoder(noise)

            #Train Discriminator on fake
            labels.data.fill_(fake_label)
            inference_fake = _Discriminator(fake_samples.detach())
            fake_loss = criterion(inference_fake, labels)
            fake_loss.backward()
            discriminator_total_loss = real_loss + fake_loss
            optim_discriminator.step()

            #Update Decoder/Generator with how well it fooled Discriminator
            _Decoder.zero_grad()
            labels.data.fill_(real_label)
            inference_fake_Decoder = _Discriminator(fake_samples)
            fake_samples_loss = criterion(inference_fake_Decoder, labels)
            fake_samples_loss.backward()
            optim_decoderGAN.step()  # here it's Adam
My goal is to have the autoencoder map the real samples to specific locations in the latent space, and then have the Decoder/Generator generate potential candidates between and around the points that are mapped to real samples. Also, I know that the “stabilizing GANs” post says to sample from a normal rather than a uniform distribution, and that my range is quite large. I started doing this because evaluating the trained DeeperDCGANs that I’ve been using seems to show that the usual Z.normal_(0, 1) covers too small a range, and the points are continually overwritten at each iteration. I also think that the feature space of my dataset likely follows a uniform rather than a normal distribution.
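To make the latent-space part concrete, here is a minimal sketch of what I mean by sampling noise from each distribution and generating candidates between two mapped points. The helper names `sample_noise` and `interpolate_latents` are hypothetical, `z_d = 8` is an arbitrary value chosen for illustration, and the two latent codes are random stand-ins for what `_Encoder` would produce from real samples. It runs on CPU and does not touch the actual models.

```python
import torch


def sample_noise(batch_size, z_dim, low=-10.0, high=10.0, kind="uniform"):
    """Draw latent noise shaped (batch_size, z_dim, 1, 1)."""
    if kind == "uniform":
        # torch.rand samples from [0, 1); rescale to the range between low and high
        return torch.rand(batch_size, z_dim, 1, 1) * (high - low) + low
    # Standard normal, i.e. the usual N(0, 1) choice
    return torch.randn(batch_size, z_dim, 1, 1)


def interpolate_latents(z_a, z_b, steps=5):
    """Linearly interpolate between two latent codes of shape (1, z_dim, 1, 1)."""
    weights = torch.linspace(0.0, 1.0, steps).view(-1, 1, 1, 1)
    # Broadcasting yields one interpolated code per weight: (steps, z_dim, 1, 1)
    return (1.0 - weights) * z_a + weights * z_b


torch.manual_seed(0)
z_d = 8  # assumed latent size, for illustration only

z_uniform = sample_noise(4, z_d, kind="uniform")
z_normal = sample_noise(4, z_d, kind="normal")

# Stand-ins for two encoded real samples (in practice: outputs of _Encoder)
z_a, z_b = z_uniform[0:1], z_uniform[1:2]
path = interpolate_latents(z_a, z_b, steps=5)

print("uniform range:", z_uniform.min().item(), z_uniform.max().item())
print("normal range: ", z_normal.min().item(), z_normal.max().item())
print("interpolation path shape:", tuple(path.shape))
```

Each row of `path` could then be passed to `_Decoder` to produce a candidate between the two endpoints. Comparing the printed ranges against the range of the actual encoder outputs would show whether either noise distribution covers the region where real samples land.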