Deep Contractive Autoencoder

Hi All,

I am trying to implement a Contractive Autoencoder, which requires the gradient of the encoder output with respect to the input inside the loss function.
Since computing the Jacobian is not as straightforward as just reading off the weights when the encoder has multiple layers, which of the two options below would you recommend?

# common setup for both options
import torch
import torch.nn.functional as F
from torch.optim import SGD
from itertools import chain

Option 1:
optim_rec = SGD(chain(enc.parameters(), dec.parameters()), lr=1e-3)
optim_con = SGD(enc.parameters(), lr=1e-3)

# reconstruction loss
y = enc(x)
x_ = dec(y)
loss = F.smooth_l1_loss(x_, x)
optim_rec.zero_grad()
loss.backward()
optim_rec.step()

# contractive loss: differentiate the encoder output w.r.t. the input
x.requires_grad_(True)
y = enc(x)
optim_con.zero_grad()
# create_graph=True keeps the graph of x.grad so the penalty itself is differentiable
y.backward(torch.ones_like(y), create_graph=True)
loss = torch.mean(x.grad ** 2)
optim_con.zero_grad()  # drop the parameter grads accumulated by the backward above
loss.backward()
optim_con.step()

Option 2:
optim = SGD(chain(enc.parameters(), dec.parameters()), lr=1e-3)

# both losses in a single forward pass
x.requires_grad_(True)
y = enc(x)
x_ = dec(y)
optim.zero_grad()
# retain the graph for the later backward; create_graph makes x.grad differentiable
y.backward(torch.ones_like(y), retain_graph=True, create_graph=True)
loss = [torch.mean(x.grad ** 2), F.smooth_l1_loss(x_, x)]
optim.zero_grad()  # drop the parameter grads accumulated by the backward above
sum(loss).backward()
optim.step()

As recommended in the autograd documentation, Option 1 refrains from using retain_graph, but it requires two separate forward passes and two optimizers. Both options work for me, but I would like to know whether there is a more elegant way to do this with autograd.
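A possibly cleaner variant I can think of (just a sketch, assuming the same enc, dec and optim as above plus a hypothetical penalty weight lam) would be to use torch.autograd.grad instead of reading x.grad:

# Option 3 (sketch): single pass, no manual handling of x.grad
x.requires_grad_(True)
y = enc(x)
x_ = dec(y)

# vector-Jacobian product of the encoder w.r.t. the input; create_graph=True
# keeps it differentiable so the penalty reaches the encoder weights
g = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                        create_graph=True)[0]

loss = F.smooth_l1_loss(x_, x) + lam * g.pow(2).mean()
optim.zero_grad()
loss.backward()
optim.step()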

Thanks! :grinning:


One way is to use a stacked AE, train it layer by layer, and include the Jacobian norm in each layer's loss function, as in the sketch below.
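A minimal sketch of one such layer, assuming a single sigmoid encoder layer so the squared Frobenius norm of its Jacobian has the closed form sum_j (h_j (1 - h_j))^2 * sum_i W_ji^2 (input_dim, hidden_dim, lam and loader are placeholders):

import torch
import torch.nn.functional as F

# first layer of the stack; repeat the same loop for the next layer
enc1 = torch.nn.Linear(input_dim, hidden_dim)
dec1 = torch.nn.Linear(hidden_dim, input_dim)
opt = torch.optim.SGD(list(enc1.parameters()) + list(dec1.parameters()), lr=1e-2)

for x in loader:                    # x: (batch, input_dim)
    h = torch.sigmoid(enc1(x))      # this layer's code
    x_ = dec1(h)                    # reconstruct this layer's input
    rec = F.mse_loss(x_, x)
    # closed-form squared Frobenius norm of dh/dx for a sigmoid layer
    jac = ((h * (1 - h)) ** 2 @ enc1.weight.pow(2).sum(dim=1)).mean()
    loss = rec + lam * jac
    opt.zero_grad()
    loss.backward()
    opt.step()
# once trained, freeze enc1, map the data through it, and train the next layer the same way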

Are you sure this method works? I don't see any Frobenius norm calculation being done here.
Also, I noticed that with your method the loss goes haywire: the reconstruction works regardless of the second loss term. I'd be very grateful if you could explain this and share the code you used for the multi-layered CAE.