Using MaxUnpool for Transfer Learning


I have a question about using MaxUnpool in PyTorch.

I built a symmetric autoencoder (AE) with 5 encoder layers and 5 decoder layers:
Encoder: [ConvLayer+MaxPool]*5
Decoder: [ConvLayer+MaxUnpool]*5
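
For readers who want something concrete, here is a minimal sketch of what such an architecture might look like. It is not the poster's actual model: the class name `PoolUnpoolAE`, the channel widths, the 3x3 kernels, the 32x32 single-channel input, and the choice to unpool before each decoder convolution (so that channel counts match the saved indices) are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class PoolUnpoolAE(nn.Module):
    """Hypothetical symmetric AE: 5 x [Conv + MaxPool], then 5 x [MaxUnpool + Conv]."""

    def __init__(self, channels=(1, 8, 16, 32, 64, 128)):
        super().__init__()
        pairs = list(zip(channels[:-1], channels[1:]))
        self.enc_convs = nn.ModuleList(
            [nn.Conv2d(c_in, c_out, kernel_size=3, padding=1) for c_in, c_out in pairs]
        )
        # Decoder convolutions mirror the encoder ones in reverse order.
        self.dec_convs = nn.ModuleList(
            [nn.Conv2d(c_out, c_in, kernel_size=3, padding=1) for c_in, c_out in reversed(pairs)]
        )
        self.pool = nn.MaxPool2d(2, return_indices=True)
        self.unpool = nn.MaxUnpool2d(2)

    def encoder(self, x):
        indices, sizes = [], []
        for conv in self.enc_convs:
            x = torch.relu(conv(x))
            sizes.append(x.shape)      # remember the pre-pooling shape
            x, idx = self.pool(x)      # idx stores where each max came from
            indices.append(idx)
        return x, indices, sizes

    def decoder(self, z, indices, sizes):
        stages = zip(self.dec_convs, reversed(indices), reversed(sizes))
        for conv, idx, size in stages:
            z = self.unpool(z, idx, output_size=size)  # put values back at the saved positions
            z = conv(z)
        return z

    def forward(self, x):
        z, indices, sizes = self.encoder(x)
        return self.decoder(z, indices, sizes)


x = torch.randn(2, 1, 32, 32)
reconstruction = PoolUnpoolAE()(x)  # same shape as x
```

Note that the decoder receives the pooling indices from every encoder stage in addition to the latent tensor, so the latent vector is not the only path from input to output.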

At the end of the encoder I achieved a reduction of 3%. Astonishingly, my AE is performing perfectly: the inputs and outputs almost overlap.

MaxUnpool seems like a way of cheating during learning. Keeping the indices sounds too powerful. I believe the network is not learning enough, and I suspect that the MaxUnpool layers alone are enough to reconstruct the data on the way back.

The final goal of my project is to perform transfer learning with the encoder, which should learn and understand the “main features” of my dataset.

Do you think MaxUnpool is too powerful and not a good solution for what I am aiming for?


I’m not sure MaxUnpool is too powerful, as the majority of the activation output should be zero, so your model would still have to learn the other values.
Did you try to use the encoder for your other use case?
If so, did it perform poorly?

Just as a test, you could try transposed convolutions instead and see if that helps your model (especially the encoder, as it seems to be the main use case).
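
As a rough illustration of that suggestion, a decoder built from transposed convolutions could look like the sketch below. The class name `ConvTransposeDecoder`, the channel widths, and the 1x1 latent size are hypothetical choices, not something taken from the thread. The point is that this decoder receives only the latent tensor, with no pooling indices.

```python
import torch
import torch.nn as nn


class ConvTransposeDecoder(nn.Module):
    """Hypothetical decoder: each stage doubles the spatial size with a learned upsampling."""

    def __init__(self, channels=(128, 64, 32, 16, 8, 1)):
        super().__init__()
        layers = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            # kernel_size=2, stride=2 maps H x W to 2H x 2W
            layers.append(nn.ConvTranspose2d(c_in, c_out, kernel_size=2, stride=2))
            layers.append(nn.ReLU())
        layers.pop()  # drop the activation after the final layer
        self.net = nn.Sequential(*layers)

    def forward(self, z):
        return self.net(z)


latent = torch.randn(2, 128, 1, 1)
output = ConvTransposeDecoder()(latent)  # spatial size grows 1 -> 2 -> 4 -> 8 -> 16 -> 32
```

With this kind of decoder, all the information needed for reconstruction has to pass through the latent tensor, which is what you want if the encoder is later reused for transfer learning.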


Thank you so much for your answer. We haven’t tried the encoder on the other models yet, but we think we are going to move forward and check the performance :)


@ptrblck We finally ran a test that confirmed MaxUnpool was too powerful.
We fed the encoder an image to get the indices, and then passed a random vector to the decoder in place of the latent vector.

def forward(self, input):
    # Run the encoder only to obtain the max-pooling indices
    latent_vector, indices = self.encoder(input)
    # Here we overwrite the latent vector with a random one
    latent_vector = torch.randn(self.dim_latent_vector)
    # Decode the random vector using the real indices
    output = self.decoder(latent_vector, indices)
    return output

Even so, the decoder successfully reconstructed the expected input.
The following paper details the phenomenon we encountered.
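
A toy version of this effect can be reproduced with a single pooling stage and no learned layers. This is only a sketch under simplifying assumptions (one 2x2 pooling stage, a random 16x16 "image", strictly positive values), not the experiment described above. It shows that the positions written by MaxUnpool depend only on the indices, not on the values being unpooled.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
pool = nn.MaxPool2d(2, return_indices=True)
unpool = nn.MaxUnpool2d(2)

# Strictly positive "image" so that every pooled maximum is non-zero.
image = torch.rand(1, 1, 16, 16) + 0.1
pooled, idx = pool(image)

# Replace the pooled values with unrelated non-zero noise, but keep the real indices.
random_latent = torch.rand_like(pooled) + 0.1

from_real = unpool(pooled, idx)
from_random = unpool(random_latent, idx)

# Both outputs are non-zero at exactly the same positions.
same_support = torch.equal(from_real != 0, from_random != 0)
nonzero_fraction = (from_random != 0).float().mean().item()
```

Here `same_support` is `True` and a quarter of the positions are non-zero in both outputs. With several stacked stages, the indices from each level pin down more and more of the input's spatial layout before the decoder has learned anything.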

Thanks for the follow-up!
That’s an interesting observation. What did you end up using?

For now we are trying a fully convolutional (CNN) model, but it is learning slowly, so we don’t have any satisfactory results yet :)