Deep Contractive Autoencoder

Hi All,

I am trying to implement a Contractive Autoencoder, which requires the gradient of the encoder output with respect to the input inside the loss function.
Since getting the Jacobian is not as easy as reading off the weights when the encoder has multiple layers, which option would you recommend?
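For reference, the contractive penalty (Rifai et al., 2011) is the squared Frobenius norm of the Jacobian of the encoder output with respect to the input, added to the reconstruction loss:

\mathcal{L}(x) = \lVert x - \mathrm{dec}(\mathrm{enc}(x)) \rVert^2 + \lambda \, \lVert J_{\mathrm{enc}}(x) \rVert_F^2, \qquad \lVert J_{\mathrm{enc}}(x) \rVert_F^2 = \sum_{j,i} \left( \frac{\partial h_j(x)}{\partial x_i} \right)^2

For a single sigmoid layer h = sigmoid(Wx + b) this has the closed form \sum_j (h_j (1 - h_j))^2 \sum_i W_{ji}^2, i.e. it can be computed directly from the weights; with a deeper encoder there is no such shortcut, hence the two autograd options below.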

Option 1:
optim_rec = SGD(list(enc.parameters()) + list(dec.parameters()), lr=lr)
optim_con = SGD(enc.parameters(), lr=lr)

# reconstruction loss
y = enc(x)
x_ = dec(y)
loss = smoothL1(x, x_)
optim_rec.zero_grad()
loss.backward()
optim_rec.step()

# contractive loss
x.requires_grad = True
y = enc(x)
optim_con.zero_grad()
# create_graph=True keeps the backward graph, so x.grad is itself differentiable
y.backward(ones(y.size()), create_graph=True)
loss = mean(pow(x.grad, 2))
optim_con.zero_grad()
loss.backward()
optim_con.step()

Option 2:
optim = SGD(list(enc.parameters()) + list(dec.parameters()), lr=lr)

# losses
x.requires_grad = True
y = enc(x)
x_ = dec(y)
optim.zero_grad()
# retain_graph keeps the graph for the second backward; create_graph makes x.grad differentiable
y.backward(ones(y.size()), retain_graph=True, create_graph=True)
loss = [mean(pow(x.grad, 2)), smoothL1(x, x_)]
optim.zero_grad()
sum(loss).backward()
optim.step()

As recommended in the autograd documentation, Option 1 does not pass retain_graph, but it requires two separate forward passes and two optimizers. Both options work for me, but I would like to know whether there is a more elegant way to do this with autograd.
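For comparison, a third variant based on torch.autograd.grad with create_graph=True would keep both terms in one loss without touching x.grad at all. This is only a sketch (enc, dec, smoothL1 and lr are as above; lam is an assumed weight for the contractive term):

optim = SGD(list(enc.parameters()) + list(dec.parameters()), lr=lr)

x.requires_grad = True
y = enc(x)
x_ = dec(y)
# gradient of the summed encoder outputs w.r.t. the input;
# create_graph=True makes this gradient itself differentiable
g = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y), create_graph=True)[0]
loss = smoothL1(x_, x) + lam * mean(pow(g, 2))
optim.zero_grad()
loss.backward()
optim.step()

Note that, like Options 1 and 2, this penalizes the squared norm of the sum of the Jacobian's rows (because grad_outputs is all ones), not the exact Frobenius norm; the exact norm would need one grad call per output unit (or a full Jacobian via torch.func.jacrev in recent PyTorch).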

Thanks. :grinning:


One way is to use a stacked AE, train it layer by layer, and include the Jacobian norm in each layer's loss function.
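Roughly, each stage could look like the sketch below. This is only an illustration (not the actual code), assuming sigmoid units and tied weights; n_in, n_hid, lam, lr, epochs and data_loader are placeholders. For a single sigmoid layer the Frobenius norm of the Jacobian has a closed form in that layer's weights:

import torch
from torch.optim import SGD

W = (0.01 * torch.randn(n_hid, n_in)).requires_grad_()
b = torch.zeros(n_hid, requires_grad=True)
b_rec = torch.zeros(n_in, requires_grad=True)
opt = SGD([W, b, b_rec], lr=lr)

for _ in range(epochs):
    for x in data_loader:                          # x: (batch, n_in)
        h = torch.sigmoid(x @ W.t() + b)           # encode
        x_rec = torch.sigmoid(h @ W + b_rec)       # decode with tied weights
        # ||J||_F^2 = sum_j (h_j * (1 - h_j))^2 * sum_i W_ji^2, per sample
        frob = ((h * (1 - h)) ** 2 @ (W ** 2).sum(dim=1)).mean()
        loss = ((x - x_rec) ** 2).sum(dim=1).mean() + lam * frob
        opt.zero_grad()
        loss.backward()
        opt.step()

# once this stage has converged, the codes sigmoid(x @ W.t() + b) become the
# inputs for training the next layer in the same way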

Are you sure this method works? I don't see any Frobenius norm calculation being done here.
Also, I noticed that with your method the loss goes haywire: the reconstruction works regardless of the second loss term. I'd be very grateful if you could explain this and share the code you used for a multi-layered CAE.