Training a model split between two classes


(Charlie) #1

I want to check that what I’m doing is correct, because I could not find this question asked before. Suppose I structure my model as below, with the encoder as a separate module used within the autoencoder module:

import torch
import torch.nn as nn


class Encoder(nn.Module):
    def __init__(self):
        super(Encoder, self).__init__()

        self.encoder = nn.Sequential(
            nn.Linear(136, 136),
            nn.ReLU(),
            nn.Linear(136, 1),
        )

    def forward(self, x):
        return self.encoder(x)


class Autoencoder(nn.Module):
    def __init__(self, encoder_model):
        super(Autoencoder, self).__init__()

        self.encoder = encoder_model
        self.decoder = nn.Sequential(
            nn.Linear(1, 136),
            nn.ReLU(),
            nn.Linear(136, 136),
        )

    def forward(self, x):
        encoded = self.encoder(x)
        return self.decoder(encoded)

And then set up a model with:

encoder_model = Encoder().to(device)

unsupervised_model = Autoencoder(encoder_model).to(device)
unsupervised_criterion = nn.MSELoss()
unsupervised_optimizer = torch.optim.Adam(unsupervised_model.parameters(), lr=args.learning_rate)

When I call unsupervised_optimizer.step(), the encoder part of the autoencoder will be trained as well, right? It should be part of the computational graph.


(Vahid Mirjalili) #2

Yes, backpropagation works through both the encoder and the decoder. You can confirm that the optimizer sees both by listing the parameters in unsupervised_model.parameters():

p = list(unsupervised_model.parameters())
for w in p:
    print(w.shape)

which will print the following:

torch.Size([136, 136])
torch.Size([136])
torch.Size([1, 136])
torch.Size([1])
torch.Size([136, 1])
torch.Size([136])
torch.Size([136, 136])
torch.Size([136])

and these are all the parameters for both encoder and decoder.
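The parameter listing shows that the optimizer holds the encoder's weights. A more direct check is to run one training step and see whether the encoder actually receives gradients and changes. The following is only a minimal sketch: the random batch `x`, the fixed learning rate of 1e-3 (in place of `args.learning_rate`), and running on the CPU are assumptions made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(136, 136),
            nn.ReLU(),
            nn.Linear(136, 1),
        )

    def forward(self, x):
        return self.encoder(x)


class Autoencoder(nn.Module):
    def __init__(self, encoder_model):
        super().__init__()
        self.encoder = encoder_model
        self.decoder = nn.Sequential(
            nn.Linear(1, 136),
            nn.ReLU(),
            nn.Linear(136, 136),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


encoder_model = Encoder()
unsupervised_model = Autoencoder(encoder_model)
unsupervised_criterion = nn.MSELoss()
# lr=1e-3 is an assumed value standing in for args.learning_rate
unsupervised_optimizer = torch.optim.Adam(unsupervised_model.parameters(), lr=1e-3)

# The module passed in is stored by reference, not copied
same_object = unsupervised_model.encoder is encoder_model

# Random stand-in batch (assumption: 8 samples with 136 features each)
x = torch.randn(8, 136)

# Snapshot the encoder weights before the update
before = [p.detach().clone() for p in encoder_model.parameters()]

unsupervised_optimizer.zero_grad()
loss = unsupervised_criterion(unsupervised_model(x), x)
loss.backward()

# Every encoder parameter should have a gradient after backward()
encoder_has_grads = all(p.grad is not None for p in encoder_model.parameters())

unsupervised_optimizer.step()

# At least one encoder tensor should differ from its snapshot after step()
encoder_changed = any(
    not torch.equal(b, p.detach())
    for b, p in zip(before, encoder_model.parameters())
)

print(same_object, encoder_has_grads, encoder_changed)
```

Because `encoder_model` and `unsupervised_model.encoder` are the same object, training the autoencoder also updates the standalone `encoder_model` variable.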

That said, it’s a bit unusual to pass the encoder to the autoencoder. You can create the encoder object inside Autoencoder instead. The following code achieves the same thing, but is more readable.

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()

        self.encoder = Encoder()
        self.decoder = nn.Sequential(
            nn.Linear(1, 136),
            nn.ReLU(),
            nn.Linear(136, 136),
        )

    def forward(self, x):
        encoded = self.encoder(x)
        return self.decoder(encoded)

Creating an instance of the model then takes just one call:

unsupervised_model = Autoencoder().to(device)

(Charlie) #3

Thank you very much for the reply. The reason I was doing it this way was so that I could easily access the encoder model once it was trained, for further supervised training. Would it be better to structure my model as you have, and then recover the supervised model at the end with:
supervised_model = unsupervised_model.encoder?


(Vahid Mirjalili) #4

Sure, no problem.

Yes, you can still do that. After you train your autoencoder, you can work with the submodule unsupervised_model.encoder separately.
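As a rough sketch of what that could look like: the supervised stage below takes the encoder submodule out of the autoencoder and gives it its own optimizer. The random `inputs` and `targets`, the choice of `nn.MSELoss` for the supervised loss, and the learning rate are all assumptions for illustration, not something stated in this thread.

```python
import copy

import torch
import torch.nn as nn

torch.manual_seed(0)


class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(136, 136),
            nn.ReLU(),
            nn.Linear(136, 1),
        )

    def forward(self, x):
        return self.encoder(x)


class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = Encoder()
        self.decoder = nn.Sequential(
            nn.Linear(1, 136),
            nn.ReLU(),
            nn.Linear(136, 136),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


unsupervised_model = Autoencoder()
# ... unsupervised training of the autoencoder would happen here ...

# The submodule is returned by reference, so it carries the trained weights
supervised_model = unsupervised_model.encoder
shares_weights = supervised_model is unsupervised_model.encoder

# Optional: an independent copy, if the autoencoder should stay unchanged
encoder_snapshot = copy.deepcopy(unsupervised_model.encoder)

# Assumed supervised setup: regression onto one target value per sample
supervised_criterion = nn.MSELoss()
supervised_optimizer = torch.optim.Adam(supervised_model.parameters(), lr=1e-3)

# Random stand-in data (hypothetical; replace with real labelled data)
inputs = torch.randn(8, 136)
targets = torch.randn(8, 1)

supervised_optimizer.zero_grad()
outputs = supervised_model(inputs)
loss = supervised_criterion(outputs, targets)
loss.backward()
supervised_optimizer.step()

# The decoder took no part in this step, so it has no gradients
decoder_untouched = all(
    p.grad is None for p in unsupervised_model.decoder.parameters()
)
print(shares_weights, decoder_untouched, outputs.shape)
```

Note that `supervised_model` is the same object as `unsupervised_model.encoder`, so supervised training also changes the encoder inside the autoencoder. If that is not wanted, training the `copy.deepcopy` version instead keeps the two separate.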