I’m trying to implement a deep autoencoder in PyTorch in which the encoder’s weights are tied to the decoder’s. Following the idea in [Autoencoder with tied weights using sequential() - #3 by TheOraware] on this forum, I came up with this:

```
import torch
import torch.nn as nn
import torch.nn.functional as F

class TiedAutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        # shared hidden weights: used as-is in the encoder, transposed in the decoder
        self.w1 = nn.Parameter(torch.randn(100, 100))
        self.w2 = nn.Parameter(torch.randn(100, 100))
        self.w3 = nn.Parameter(torch.randn(100, 100))
        self.w4 = nn.Parameter(torch.randn(100, 100))
        self.w5 = nn.Parameter(torch.randn(100, 100))
        self.w6 = nn.Parameter(torch.randn(100, 100))
        self.w7 = nn.Parameter(torch.randn(100, 100))
        self.w8 = nn.Parameter(torch.randn(100, 100))

    def forward(self, input):
        ### INPUT (39 -> 100)
        x = torch.tanh(F.linear(input, nn.Parameter(torch.randn(100, 39))))
        ### ENCODER
        x = torch.tanh(F.linear(x, self.w1))
        x = torch.tanh(F.linear(x, self.w2))
        x = torch.tanh(F.linear(x, self.w3))
        x = torch.tanh(F.linear(x, self.w4))
        x = torch.tanh(F.linear(x, self.w5))
        x = torch.tanh(F.linear(x, self.w6))
        x = torch.tanh(F.linear(x, self.w7))
        x = torch.tanh(F.linear(x, self.w8))
        ### FEATURE EXTRACTION LAYER (100 -> 39)
        fe = torch.tanh(F.linear(x, nn.Parameter(torch.randn(39, 100))))
        ### DECODER (39 -> 100, then tied layers in reverse)
        x = torch.tanh(F.linear(fe, nn.Parameter(torch.randn(100, 39))))
        x = torch.tanh(F.linear(x, self.w8.T))
        x = torch.tanh(F.linear(x, self.w7.T))
        x = torch.tanh(F.linear(x, self.w6.T))
        x = torch.tanh(F.linear(x, self.w5.T))
        x = torch.tanh(F.linear(x, self.w4.T))
        x = torch.tanh(F.linear(x, self.w3.T))
        x = torch.tanh(F.linear(x, self.w2.T))
        x = torch.tanh(F.linear(x, self.w1.T))
        ### OUTPUT (100 -> 39)
        out = F.linear(x, nn.Parameter(torch.randn(39, 100)))
        return out
```

I plan to train this autoencoder and then use the bottleneck (labelled FEATURE EXTRACTION LAYER in the code) to extract features for my data. Essentially, I’ll discard the decoder after training.
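For context, the post-training usage I have in mind looks roughly like this (a toy stand-in with hypothetical `enc`/`fe` layers, not my actual model):

```python
import torch
import torch.nn as nn

class BottleneckOnly(nn.Module):
    """Toy stand-in: a small encoder ending in a 39-dim bottleneck."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Linear(39, 100)  # hypothetical encoder layer
        self.fe = nn.Linear(100, 39)   # hypothetical bottleneck layer

    def encode(self, x):
        # forward only up to the feature-extraction layer; no decoder
        return torch.tanh(self.fe(torch.tanh(self.enc(x))))

model = BottleneckOnly()
with torch.no_grad():
    feats = model.encode(torch.randn(8, 39))
print(feats.shape)  # torch.Size([8, 39])
```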

The problem is that this model does not train; the loss stays stuck. I’ve tried adding more training data as well as using different activation functions. I suspect the problem is in how I’ve written the weight sharing, since the same model without tying does train.
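For what it’s worth, tying through `.T` does seem to let gradients flow into a shared weight from both of its uses when tested in isolation (a minimal standalone sketch, not my full model):

```python
import torch
import torch.nn.functional as F

# One weight used twice: directly (encoder side) and transposed (decoder side).
w = torch.randn(4, 4, requires_grad=True)
x = torch.randn(2, 4)
out = F.linear(torch.tanh(F.linear(x, w)), w.T)
out.sum().backward()
print(w.grad.shape)  # torch.Size([4, 4]): both uses contribute to one gradient
```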

Any ideas on what I could be doing wrong will be appreciated. Thank you!
