How to make the input and output the same for an autoencoder?

I want to use the autoencoder’s latent vector as a feature extractor. For that, I want the autoencoder model to produce the same image at the output as at the input.

I want the image going into the encoder and the image coming out of the decoder to be the same. In the current model the result is not fixed; “input_size” just assigns the size of the output space.

import torch
import torch.nn as nn

# Define the autoencoder architecture
class Autoencoder(nn.Module):
    def __init__(self, input_size, latent_dim):
        super(Autoencoder, self).__init__()

        # Encoder
        self.encoder = nn.Sequential(
            nn.Linear(input_size, 64),
            nn.ReLU(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, latent_dim)
        )

        # Decoder
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32),
            nn.ReLU(),
            nn.Linear(32, 64),
            nn.ReLU(),
            nn.Linear(64, input_size)
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x

I don’t fully understand your question, as your code already reuses input_size, which ensures the output activation has the same shape as the input.

Hi, thanks for the reply.

What I want is not to fix the shape, but to fix the data values.

As you mentioned, “input_size” in the decoder only fixes the size of the output, not the values.

I want the output of the model to be exactly the same data as the input.

Thanks!

In my understanding, your task is the following:

I want to use autoencoder’s latent vector as feature extractor

But the approach you mention does not seem right:

For that, I want to put same input/output image for the autoencoder model.

To get a useful latent feature vector from your input image, you need to train the decoder to reconstruct the original image itself.
For this you compare the output x from the decoder against the input image, and this is where the loss function comes into the picture.

A simple approach is to compute the L2 (MSE) loss between the input image and the output of the decoder.

To extract latent features, use the output of the encoder once you finish training the model.
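Putting the two steps together, here is a minimal sketch (the values input_size=784 and latent_dim=16, the dummy random batch, and the optimizer choice are illustrative assumptions, not from the original post):

```python
import torch
import torch.nn as nn

# Same architecture as the Autoencoder defined above
class Autoencoder(nn.Module):
    def __init__(self, input_size, latent_dim):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_size, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32), nn.ReLU(),
            nn.Linear(32, 64), nn.ReLU(),
            nn.Linear(64, input_size),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder(input_size=784, latent_dim=16)
criterion = nn.MSELoss()                          # L2 reconstruction loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

images = torch.rand(8, 784)                       # dummy batch of flattened 28x28 images

# Train the model to reproduce its own input: the target of the
# loss is the input batch itself, not a separate label.
for epoch in range(20):
    optimizer.zero_grad()
    reconstruction = model(images)                # same shape as the input
    loss = criterion(reconstruction, images)      # compare output against the input
    loss.backward()
    optimizer.step()

# After training, use only the encoder to extract latent features
model.eval()
with torch.no_grad():
    features = model.encoder(images)              # shape: (8, 16)
```

Note that the output will never be *exactly* equal to the input; training only drives the reconstruction close to it, which is what makes the latent vector a meaningful compressed feature.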

Understood.
Thanks for correcting me.
