Joint AE/GAN training

tymokvo · May 11, 2017, 8:16pm

Hi

I’m trying to jointly train a convolutional network as both an autoencoder (AE) and a GAN, but I’m not sure that I have the training routine set up correctly. I would greatly appreciate any help.

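import torch
from torch.autograd import Variable

# The models, optimizers, criterion, noise, real_label, fake_label, z_d,
# n_iter and dataloader are defined earlier (not shown).
for epoch in range(n_iter):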
    for i, (batch, _) in enumerate(dataloader):

        current_batch_size = batch.size(0)

        # ---------------
        # Train as AE
        # ---------------

        optim_encoder.zero_grad()
        optim_decoder.zero_grad()

        input = Variable(batch).cuda()

        encoded = _Encoder(input)
        # Reshape the code to (N, C, 1, 1) for the Decoder. The earlier
        # unsqueeze(0) was redundant because view() sets the full shape.
        encoded = encoded.view(input.size(0), -1, 1, 1)
        reconstructed = _Decoder(encoded)

        reconstruction_loss = criterion(reconstructed, input)
        reconstruction_loss.backward()

        optim_encoder.step()
        optim_decoder.step()  # optim_decoder is SGD

        # ---------------
        # Train as GAN
        # ---------------

        # Train Discriminator on real

        _Discriminator.zero_grad()
        real_samples = input.clone()
        inference_real = _Discriminator(real_samples)
        labels = torch.FloatTensor(current_batch_size).fill_(real_label)
        labels = Variable(labels).cuda()
        real_loss = criterion(inference_real, labels)
        real_loss.backward()

        # Generate fake samples

        noise.data.resize_(current_batch_size, z_d, 1, 1)
        noise.data.uniform_(-10, 10)
        fake_samples = _Decoder(noise)

        # Train Discriminator on fake

        labels.data.fill_(fake_label)
        inference_fake = _Discriminator(fake_samples.detach())
        fake_loss = criterion(inference_fake, labels)
        fake_loss.backward()
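        # Not used for backward() here; the two backward() calls above
        # have already accumulated the Discriminator gradients.
        discriminator_total_loss = real_loss + fake_loss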
        optim_discriminator.step()

        # Update Decoder/Generator with how well it fooled Discriminator

        _Decoder.zero_grad()
        labels.data.fill_(real_label)
        inference_fake_Decoder = _Discriminator(fake_samples)
        fake_samples_loss = criterion(inference_fake_Decoder, labels)
        fake_samples_loss.backward()
        optim_decoderGAN.step()  # optim_decoderGAN is Adam
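
As a sanity check on the GAN half of the loop above, here is a minimal sketch of what the `.detach()` call is expected to do. The `decoder` and `discriminator` below are tiny hypothetical stand-ins (plain `nn.Linear` layers), not the networks from this post, `nn.BCELoss` is an assumed choice of criterion, and the sketch uses current PyTorch tensors rather than `Variable`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins, not the networks from the post
decoder = nn.Linear(4, 8)  # plays the role of _Decoder
discriminator = nn.Sequential(nn.Linear(8, 1), nn.Sigmoid())  # plays the role of _Discriminator
criterion = nn.BCELoss()  # assumed GAN criterion

noise = torch.empty(16, 4).uniform_(-10, 10)
fake_samples = decoder(noise)

# Discriminator step: detach() cuts the graph, so no gradient reaches the decoder
fake_labels = torch.zeros(16, 1)
d_loss = criterion(discriminator(fake_samples.detach()), fake_labels)
d_loss.backward()
decoder_grad_after_d_step = decoder.weight.grad  # still None

# Generator step: no detach(), so the gradient flows back into the decoder
real_labels = torch.ones(16, 1)
g_loss = criterion(discriminator(fake_samples), real_labels)
g_loss.backward()
decoder_grad_after_g_step = decoder.weight.grad  # now a tensor
```

Note that the generator step also leaves gradients in the discriminator's parameters. In the loop above these are cleared by `_Discriminator.zero_grad()` at the start of the next iteration, before the discriminator is stepped again.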

My goal is to have the Encoder map the real samples to specific locations in the latent space, and then have the Decoder/Generator generate potential candidates between/around the points that are mapped to real samples. Also, I know that the “stabilizing GANs” post says to use normal rather than uniform distributions, and that my range is quite large. I started using a wide uniform range because evaluating the trained DeeperDCGANs that I’ve been using seems to show that the usual Z.normal_(0, 1) is too small a range, and the points are continually overwritten at each iteration. I also think that the feature space of my dataset likely follows a uniform rather than a normal distribution.
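
To make the two ideas in the paragraph above concrete, here is a minimal pure-Python sketch: it compares the spread of `uniform_(-10, 10)` with that of `normal_(0, 1)`, and it builds latent codes "between" two points by linear interpolation. The helper names `uniform_std` and `lerp`, the latent size of 4, and the random stand-ins for encoded samples are all hypothetical and only serve the illustration.

```python
import math
import random


def uniform_std(low, high):
    """Standard deviation of a continuous uniform distribution on [low, high]."""
    return (high - low) / math.sqrt(12)


def lerp(z_a, z_b, t):
    """Linearly interpolate between two latent vectors (plain lists), t in [0, 1]."""
    return [(1 - t) * a + t * b for a, b in zip(z_a, z_b)]


random.seed(0)
z_d = 4  # hypothetical latent size

# Hypothetical stand-ins for the latent codes of two encoded real samples
z_a = [random.uniform(-10, 10) for _ in range(z_d)]
z_b = [random.uniform(-10, 10) for _ in range(z_d)]

# Five evenly spaced codes from z_a to z_b, endpoints included
path = [lerp(z_a, z_b, k / 4) for k in range(5)]

# uniform(-10, 10) has a standard deviation of about 5.77, versus 1.0 for normal(0, 1)
spread = uniform_std(-10, 10)
```

In the training code above, each interpolated code would be reshaped to `(1, z_d, 1, 1)` and passed through `_Decoder` to see what lies between two real samples.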

Thanks!