Can't optimize single parameter Tensor

I’m debugging my GAN training script because my generator does not learn anything. I’m using a Subset of my dataset with a single sample, so my generator should learn a constant output equal to that sample. But it doesn’t.

from torch.utils.data import DataLoader, Subset

# Train Data
train_dataset = NsynthDatasetFourier(path="nsynth-train/", noise_length=noise_length)
train_dataset = Subset(train_dataset, [0])  # single-sample dummy dataset for testing the script
train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=1)

Then I decided to create a DummyGenerator that only outputs a Parameter tensor stored as an instance attribute. It should definitely be able to learn that tensor to be equal to the only sample in my dataset, but the tensor never changes.

import torch
import torch.nn as nn

class DummyGenerator(nn.Module):
    def __init__(self, *args, **kwargs):
        super().__init__()

        # nn.Parameter already sets requires_grad=True
        self.dummy_tensor = nn.Parameter(torch.rand((1, 2, 1024, 128)))

    def forward(self, x):
        return self.dummy_tensor

Is my approach correct? Neither my original Generator nor my DummyGenerator seems to learn, no matter how large I make the learning rate. My discriminator seems to learn fine.

If anybody can help me, I would be very grateful. My complete code is here.


P.S.: You need to download the NSynth dataset to run the complete code.

The use case is a bit strange, but it should generally work, as this small example overfitting a static target shows:

import torch
import torch.nn as nn

class DummyGenerator(nn.Module):
    def __init__(self, *args, **kwargs):
        super().__init__()

        # nn.Parameter already sets requires_grad=True
        self.dummy_tensor = nn.Parameter(torch.rand((1, 2, 1024, 128)))

    def forward(self):
        return self.dummy_tensor
    
model = DummyGenerator()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

y = torch.randn(1, 2, 1024, 128)

for epoch in range(10000):
    optimizer.zero_grad()
    out = model()
    loss = criterion(out, y)
    loss.backward()
    optimizer.step()
    print('epoch {}, loss {}'.format(epoch, loss.item()))

# ...
# epoch 9998, loss 1.416539715387577e-11
# epoch 9999, loss 1.4069788736859046e-11

Check if your parameter has a valid .grad attribute after the backward call and is indeed being updated.
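For example, a quick sanity check along these lines (a sketch reusing model, optimizer, criterion, and y from the example above):

optimizer.zero_grad()
out = model()
loss = criterion(out, y)
loss.backward()

print(model.dummy_tensor.grad)  # should be a tensor, not None and not all zeros

before = model.dummy_tensor.detach().clone()
optimizer.step()
print(torch.equal(before, model.dummy_tensor.detach()))  # False once the parameter was updated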

Thanks for your response. I tried a case similar to your example and changed my generator loss to minimize the difference from the target, instead of trying to fool the discriminator:

generator_discriminator_out = discriminator(generated_data)
generator_loss = criterion(generated_data, target)
# generator_loss = criterion(generator_discriminator_out, true_labels)
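For context, the surrounding generator update in my loop looks roughly like this (a sketch; generator, noise, generator_optimizer, target, and true_labels stand in for the names in my full script):

generator_optimizer.zero_grad()
generated_data = generator(noise)
generator_discriminator_out = discriminator(generated_data)

# debugging loss: match the single training sample directly
generator_loss = criterion(generated_data, target)
# usual GAN loss: try to fool the discriminator
# generator_loss = criterion(generator_discriminator_out, true_labels)

generator_loss.backward()
generator_optimizer.step()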

The generator successfully accomplishes this task, and the discriminator’s output becomes 0.5 for every input (this is expected: once the generated output matches the real sample, the discriminator can no longer tell them apart).


But when I go back to the usual GAN loss (trying to fool the discriminator), the output of my generator doesn’t match the sample:

generator_discriminator_out = discriminator(generated_data)
# generator_loss = criterion(generated_data, target)
generator_loss = criterion(generator_discriminator_out, true_labels)

This is an indicator that the discriminator is not learning the correct target, right?
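One sanity check I could run (a sketch, reusing the names from the snippets above) is to compare the discriminator’s predictions on the real sample and on the generated output:

with torch.no_grad():
    real_out = discriminator(target)          # should move towards the "real" label if D learns
    fake_out = discriminator(generated_data)  # should move towards the "fake" label
print(real_out.mean().item(), fake_out.mean().item())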

Extra info: When I train with the whole dataset again, the generator gets stuck producing a constant output for every input (and the MSE loss goes to 1, its maximum value here). I don’t know if this is relevant, but I don’t understand why my generator would get stuck in a bad local minimum like that.