I am trying to impose a Lipschitz constraint on an encoder network using a gradient penalty, as in WGAN-GP. Calling backward on the gradient penalty term works, and the model trains, when the encoder consists only of linear (fully connected) layers. When I switch to, or add, convolutional layers, I get this error:
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
Loss and gradient penalty code:
    import torch.nn.functional as F
    from torch.autograd import grad

    # reconstruction losses: summed over elements, averaged over the minibatch
    recon_loss = F.mse_loss(X_sample, X, size_average=False) / mb_size
    X_recon = P2(z_dis)
    recon_loss_P2 = F.mse_loss(X_recon, X, size_average=False) / mb_size
    loss = recon_loss + recon_loss_P2

    # gradient penalization (effectively, second order derivative)
    # grad() returns a tuple with one entry per input, so take element 0
    gradQ = grad(z_con.mean(), X, create_graph=True)
    gradQ0 = gradQ[0]
    gradQ_norm = gradQ0.norm()
    gradient_penalty = (gradQ_norm - 0.0).pow(2)

    loss.backward(retain_graph=True)
    gradient_penalty.backward()  # this line gives the error
Encoder definition:

    import torch.nn as nn

    class Flatten(nn.Module):
        def forward(self, input):
            return input.view(input.size(0), -1)

    Q = nn.Sequential(
        nn.Conv2d(1, 1, 4, stride=2, padding=1),  # it works on removing this line of code
        Flatten(),
        nn.Linear(784, 10)
    )
Why might this be happening? I am also open to imposing the gradient penalty in another way if that works with convolutional layers.
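For reference, here is a stripped-down, self-contained sketch of the penalty I am trying to compute. Several things in it are assumptions rather than my real setup: a recent PyTorch release, a random 1x28x28 input with a made-up batch size of 8, and `nn.Linear(14 * 14, 10)` so that the shapes line up after the convolution. It also marks the input with `requires_grad=True`, which `grad()` needs in order to differentiate with respect to `X`. I would expect this version to run on a current release, but I have not checked which older versions support double backward through `nn.Conv2d`.

```python
import torch
import torch.nn as nn
from torch.autograd import grad


class Flatten(nn.Module):
    def forward(self, input):
        return input.view(input.size(0), -1)


torch.manual_seed(0)

# Assumed shapes: a 1x28x28 input gives a 1x14x14 feature map after this conv.
Q = nn.Sequential(
    nn.Conv2d(1, 1, 4, stride=2, padding=1),
    Flatten(),
    nn.Linear(14 * 14, 10),
)

# The input has to require grad so that d(z_con)/d(X) can be taken.
X = torch.randn(8, 1, 28, 28, requires_grad=True)
z_con = Q(X)

# create_graph=True records the gradient computation itself,
# so the penalty can be backpropagated into the encoder weights.
(gradQ,) = grad(z_con.mean(), X, create_graph=True)
gradient_penalty = (gradQ.norm() - 0.0).pow(2)

gradient_penalty.backward()
```

If `gradQ.requires_grad` comes out `False` in a setup like this, the gradient computation was not recorded in the graph, and that would presumably lead to the same error on `backward()`.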