I am trying to implement a model that projects a vector to a fixed dimension and then, after passing it through an LSTM and some other layers, performs the inverse projection with the same Linear layer's weights.
To be more precise, we perform the following operations:
y = W * x (as a Linear layer)
...
(perform some processing on y to get k)
...
output = W' * k (how to do this with a Linear layer?)
As you can see, the first and last operation use the same layer weights. Can anyone let me know what is the correct way to do this in PyTorch?
There are multiple solutions. The simplest is to reuse the layer's weight tensor directly with a matrix multiply. Since the weight of an nn.Linear is a Parameter with requires_grad=True, gradients from both uses flow back to the same weights during training.
import torch
import torch.nn as nn

m = nn.Linear(100, 200)
x = torch.randn(20, 100)
y = m(x)  # y = x @ W.T + b, shape (20, 200)
...
(perform some processing on y to get k, shape (20, 200))
...
output = torch.matmul(k, m.weight)  # k @ W, shape (20, 100)
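Putting it together, here is a minimal self-contained sketch of the tied-weights pattern. The module and its names (TiedAutoencoder, proj) are hypothetical, and a tanh stands in for the LSTM and other processing; the inverse projection uses F.linear(k, W.t()), which computes k @ W:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TiedAutoencoder(nn.Module):
    # Hypothetical example module: one Linear layer shared between the
    # forward projection and its (transposed) inverse.
    def __init__(self, in_dim=100, hidden_dim=200):
        super().__init__()
        self.proj = nn.Linear(in_dim, hidden_dim, bias=False)

    def forward(self, x):
        y = self.proj(x)                # y = x @ W.T, shape (batch, hidden_dim)
        k = torch.tanh(y)               # stand-in for the LSTM / other layers
        # F.linear(k, W.t()) computes k @ W, mapping back to in_dim
        return F.linear(k, self.proj.weight.t())

model = TiedAutoencoder()
x = torch.randn(20, 100)
out = model(x)                          # shape (20, 100)
```

Because both operations read the same Parameter, calling backward() on a loss over out accumulates gradients into model.proj.weight from both the projection and the inverse projection.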