I am trying to implement a model that projects a vector to a fixed dimension and then, after passing it through an LSTM and some other layers, performs the inverse projection with the same Linear layer's weights.
To be more precise, we perform the following operations:
y = W * x (as a Linear layer)
...
(perform some processing on y to get k)
...
output = W' * k (how to do this with a Linear layer?)
As you can see, the first and last operation use the same layer weights. Can anyone let me know what is the correct way to do this in PyTorch?
There are multiple solutions. The simplest is to reuse the layer's weight tensor directly with a matrix multiply. Since the weight of an nn.Linear is a Parameter with requires_grad=True, gradients from both uses flow back to the same weights during training.
import torch
import torch.nn as nn

m = nn.Linear(100, 200)
x = torch.randn(20, 100)
y = m(x)  # y = x @ W.T + b, shape (20, 200)
...
(perform some processing on y to get k, shape (20, 200))
...
output = torch.matmul(k, m.weight)  # k @ W, shape (20, 100)
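Putting it together, here is a minimal self-contained sketch of the tied-weights pattern. The module and its names (TiedAutoencoder, proj) are hypothetical, and a tanh stands in for the LSTM and other processing; the inverse projection uses F.linear(k, W.t()), which computes k @ W:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TiedAutoencoder(nn.Module):
    # Hypothetical example module: one Linear layer shared between the
    # forward projection and its (transposed) inverse.
    def __init__(self, in_dim=100, hidden_dim=200):
        super().__init__()
        self.proj = nn.Linear(in_dim, hidden_dim, bias=False)

    def forward(self, x):
        y = self.proj(x)                # y = x @ W.T, shape (batch, hidden_dim)
        k = torch.tanh(y)               # stand-in for the LSTM / other layers
        # F.linear(k, W.t()) computes k @ W, mapping back to in_dim
        return F.linear(k, self.proj.weight.t())

model = TiedAutoencoder()
x = torch.randn(20, 100)
out = model(x)                          # shape (20, 100)
```

Because both operations read the same Parameter, calling backward() on a loss over out accumulates gradients into model.proj.weight from both the projection and the inverse projection.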