For debugging purposes, I would try removing the encoder part of autoencoder() and attaching NewModel() to autoencoder() as the new encoder. Then train only the newly attached encoder while the trained decoder stays frozen, as you did here, and see what changes.
So I trained an autoencoder (AE_model) on a dataset from Domain B. I want to train another model (new_model) that takes in a dataset from Domain A and outputs predictions in Domain B.
First, I encode the Domain B dataset using AE_model.encoder.
Then, my idea was to have new_model predict the encoded Domain B dataset, and have AE_model.decoder decode that prediction back to the original space.
class combined(nn.Module):
    def __init__(self):
        super().__init__()
        self.AE_model = autoencoder()
        self.AE_model.load_state_dict(torch.load('some_weights.pth'))
        # Freeze all weights in the AE
        for param in self.AE_model.parameters():
            param.requires_grad = False
        # New model
        self.block = nn.Sequential( )

    def forward(self, x):
        x = self.block(x)
        x = self.AE_model.decoder(x)
        return x
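For anyone who wants to try the same setup end to end, here is a minimal runnable sketch of one training step with a frozen decoder. TinyAE, the layer sizes, and the random tensors are all made up for illustration (they stand in for autoencoder(), the real layers in self.block, and the real Domain A / Domain B data), so treat it as a sketch under those assumptions rather than the actual model.

```python
import torch
import torch.nn as nn


class TinyAE(nn.Module):
    """Hypothetical stand-in for autoencoder(); the sizes are made up."""

    def __init__(self, in_dim=8, latent_dim=3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, latent_dim), nn.Tanh())
        self.decoder = nn.Sequential(nn.Linear(latent_dim, in_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))


class Combined(nn.Module):
    def __init__(self, ae, a_dim=5, latent_dim=3):
        super().__init__()
        self.AE_model = ae
        # Freeze every weight of the pretrained autoencoder.
        for param in self.AE_model.parameters():
            param.requires_grad = False
        # Keep BatchNorm/Dropout layers (if any) in inference mode.
        self.AE_model.eval()
        # New encoder mapping Domain A inputs to the latent space (assumed layers).
        self.block = nn.Sequential(
            nn.Linear(a_dim, 16), nn.Tanh(), nn.Linear(16, latent_dim)
        )

    def forward(self, x):
        return self.AE_model.decoder(self.block(x))


torch.manual_seed(0)
model = Combined(TinyAE())

# Hand only the trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-2)

x_a = torch.randn(32, 5)  # fake Domain A batch
y_b = torch.randn(32, 8)  # fake Domain B targets

decoder_before = [p.clone() for p in model.AE_model.decoder.parameters()]
block_before = [p.clone() for p in model.block.parameters()]

loss = nn.functional.mse_loss(model(x_a), y_b)
optimizer.zero_grad()
loss.backward()
optimizer.step()

decoder_unchanged = all(
    torch.equal(a, b)
    for a, b in zip(decoder_before, model.AE_model.decoder.parameters())
)
block_changed = any(
    not torch.equal(a, b) for a, b in zip(block_before, model.block.parameters())
)
print(decoder_unchanged, block_changed)
```

Note that calling model.train() later also switches AE_model back to training mode, so if the autoencoder has BatchNorm or Dropout layers you would need to call self.AE_model.eval() again afterwards.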
The training and validation losses still do not change at all. This is so weird, because when I compute the loss directly on the encoded prediction, it works fine. Just adding the decoder made it fail.
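One way to narrow down a loss that does not move is to check whether any gradient actually reaches the trainable layers after backward(). The sketch below assumes a toy encoder and a frozen toy decoder with made-up sizes; grad_report is a hypothetical helper written for this post, not a PyTorch function.

```python
import torch
import torch.nn as nn


def grad_report(module):
    """Return {parameter name: gradient norm} for the trainable parameters."""
    report = {}
    for name, p in module.named_parameters():
        if p.requires_grad:
            report[name] = 0.0 if p.grad is None else p.grad.norm().item()
    return report


torch.manual_seed(0)
# Toy stand-ins (assumed shapes) for the new encoder and the frozen decoder.
new_encoder = nn.Sequential(nn.Linear(5, 16), nn.ReLU(), nn.Linear(16, 3))
decoder = nn.Linear(3, 8)
for p in decoder.parameters():
    p.requires_grad = False

prediction = decoder(new_encoder(torch.randn(32, 5)))
loss = nn.functional.mse_loss(prediction, torch.randn(32, 8))
loss.backward()

report = grad_report(new_encoder)
for name, norm in report.items():
    print(f"{name}: {norm:.6f}")
```

If every norm is zero or vanishingly small, the gradient is being lost somewhere between the loss and the new layers (for example in a saturated or dead activation). If the norms look healthy, the learning rate or the optimizer setup is a more likely suspect.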
Update: I may have found the problem and a solution. It turns out that my encoded_prediction has only positive values, while my new_model takes in both positive and negative values. So I guess the problem is very nonlinear and the NN wasn't able to initialize its weights properly for the encoded_prediction, so it was hard to get a useful gradient back through autoencoder.decoder. I fixed it by changing my ReLU activations to Tanh.
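As a small illustration of why swapping ReLU for Tanh can matter, the sketch below compares the gradient each activation passes back for negative and positive pre-activations. The input values are arbitrary, and this shows only one possible mechanism; it is not confirmed to be what happened in the model above.

```python
import torch

# Arbitrary pre-activation values, two negative and two positive.
pre = torch.tensor([-2.0, -0.5, 0.5, 2.0], requires_grad=True)

# ReLU passes no gradient where the pre-activation is negative.
torch.relu(pre).sum().backward()
relu_grad = pre.grad.clone()

pre.grad.zero_()

# Tanh passes a nonzero gradient everywhere (1 - tanh(x)^2).
torch.tanh(pre).sum().backward()
tanh_grad = pre.grad.clone()

print("ReLU grad:", relu_grad)
print("Tanh grad:", tanh_grad)
```

With ReLU, any unit whose pre-activation stays negative for the whole batch receives no update, while Tanh keeps a small gradient flowing in both directions and can also output negative values.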