I have a pretrained GAN generator that takes latent vectors from either Z space, shaped [1,512], or W space, shaped [1,18,512], and outputs face images of dimension [3,1024,1024].
I also have a classifier that gives the prediction scores for 40 binary facial attributes.
My goal is to iteratively modify an input image so that a single facial attribute is edited. The edit is not applied directly to the image but to the latent vector that is fed to the GAN generator to create the image. For example, I want to obtain the same face but with “Mustache” or “Smiling”. This is how I currently do it:
import torch

# Generate an image and its corresponding latent vector
attribute = "Smiling"
z = torch.randn(1,18,512, requires_grad=True).cuda()
z.retain_grad()
image, w = generator(z, input_is_style=False, return_styles=True, randomize_noise=False)
w.retain_grad()
# Get the prediction score of 'attribute'
predictions = classifier(image) # shape 1,80
predictions.retain_grad()
predictions = predictions.view(-1,40,2) # shape 1,40,2
predictions = torch.softmax(predictions, dim=2) # shape 1,40,2
predictions = predictions.squeeze(0)[:,1].unsqueeze(1) # shape 40,1
prediction = predictions[attributes.index(attribute)] # shape 1
i = 0
lr = 1e-2
target1 = torch.tensor([1.0], requires_grad=True).cuda()
while prediction.item() < 0.8 and i < 100:
    print(f"\rIter: {i} - prediction: {prediction.item():.3f}", end='')
    predictions = classifier(image) # shape 1,80
    predictions.retain_grad()
    predictions = predictions.view(-1,40,2) # shape 1,40,2
    predictions = torch.softmax(predictions, dim=2) # shape 1,40,2
    predictions = predictions.squeeze(0)[:,1].unsqueeze(1) # shape 40,1
    prediction = predictions[attributes.index(attribute)] # shape 1
    # Compute the loss of my prediction with respect to a perfect prediction of 1
    loss = torch.nn.functional.binary_cross_entropy(prediction, target1)
    loss.backward(retain_graph=True)
    # Update the vector in the negative gradient direction
    w -= lr * w.grad
    # Generate the new image from the modified latent vector w
    image, _ = generator(w, randomize_noise=False, input_is_style=True, return_styles=True)
    i += 1
tensor2im(image.squeeze(0)).resize((256,256)).show()
This code runs without errors, but the resulting edits to the image are neither evident nor clear.
Do you think I’m doing something wrong, gradient-wise in particular? Thanks in advance.
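
For comparison, one pattern that might be worth checking against the loop above is to treat the latent as a detached leaf tensor, zero its gradient on every step, and let an optimizer apply the update. The sketch below is only an illustration under stated assumptions: `ToyGenerator` and `ToyClassifier` are hypothetical, tiny stand-ins for the real generator and classifier, `attribute_probability` and `edit_latent` are made-up helper names, and Adam is swapped in for the plain `w -= lr * w.grad` update, so it is not a confirmed fix for the code above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
NUM_ATTRS = 40  # 40 binary attributes, 2 logits each (80 outputs)

class ToyGenerator(nn.Module):
    """Hypothetical stand-in for the GAN generator: W latent -> tiny 'image'."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(18 * 512, 3 * 8 * 8)

    def forward(self, w):
        return torch.tanh(self.fc(w.flatten(1))).view(-1, 3, 8, 8)

class ToyClassifier(nn.Module):
    """Hypothetical stand-in for the attribute classifier: image -> 80 logits."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(3 * 8 * 8, NUM_ATTRS * 2)

    def forward(self, image):
        return self.fc(image.flatten(1))

def attribute_probability(logits, index):
    """Return P(attribute present) for one attribute, shape [batch]."""
    probs = torch.softmax(logits.view(-1, NUM_ATTRS, 2), dim=2)
    return probs[:, index, 1]

def edit_latent(generator, classifier, w_init, index,
                lr=1e-2, threshold=0.8, max_iters=100):
    # Detach and clone so that w is a leaf tensor and the only thing optimised.
    w = w_init.detach().clone().requires_grad_(True)
    optimizer = torch.optim.Adam([w], lr=lr)  # assumption: Adam instead of raw SGD
    history = []
    for _ in range(max_iters):
        # Rebuild the graph from the current w on every iteration.
        prob = attribute_probability(classifier(generator(w)), index)
        history.append(prob.item())
        if prob.item() >= threshold:
            break
        loss = F.binary_cross_entropy(prob, torch.ones_like(prob))
        optimizer.zero_grad()  # avoid accumulating gradients across steps
        loss.backward()
        optimizer.step()
    return w.detach(), history

# Freeze both toy networks so that only the latent receives updates.
generator, classifier = ToyGenerator().eval(), ToyClassifier().eval()
for p in list(generator.parameters()) + list(classifier.parameters()):
    p.requires_grad_(False)

w0 = torch.randn(1, 18, 512)
w_edited, history = edit_latent(generator, classifier, w0, index=0)
print(f"start: {history[0]:.3f}, end: {history[-1]:.3f}, steps: {len(history)}")
```

Because the graph is rebuilt from `w` on each iteration, `retain_graph=True` is not needed here, and freezing the network parameters keeps the backward pass from computing gradients for anything other than the latent.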