How to create artificial indices for MaxUnpool2d

I am currently working on an asymmetric autoencoder (the encoder and decoder have different architectures) and would like to use nn.MaxUnpool2d in the decoder. However, the module requires an ‘indices’ argument, which I cannot obtain from the encoder side. How can I efficiently create artificial ‘indices’? It would be even better if they could be created batchwise.

Thanks in advance.

Take a look at this example:

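import torch
import torch.nn as nn

# return_indices=True makes the pooling layer also return the argmax locations
pool = nn.MaxPool2d(2, return_indices=True)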

x = torch.arange(2*16).view(2, 1, 4, 4).float()
print(x)
# tensor([[[[ 0.,  1.,  2.,  3.],
#           [ 4.,  5.,  6.,  7.],
#           [ 8.,  9., 10., 11.],
#           [12., 13., 14., 15.]]],


#         [[[16., 17., 18., 19.],
#           [20., 21., 22., 23.],
#           [24., 25., 26., 27.],
#           [28., 29., 30., 31.]]]])

out, idx = pool(x)
print(out)
# tensor([[[[ 5.,  7.],
#           [13., 15.]]],


#         [[[21., 23.],
#           [29., 31.]]]])

print(idx)
# tensor([[[[ 5,  7],
#           [13, 15]]],


#         [[[ 5,  7],
#           [13, 15]]]])

unpool = nn.MaxUnpool2d(2)
y = unpool(out, idx)
print(y)
# tensor([[[[ 0.,  0.,  0.,  0.],
#           [ 0.,  5.,  0.,  7.],
#           [ 0.,  0.,  0.,  0.],
#           [ 0., 13.,  0., 15.]]],


#         [[[ 0.,  0.,  0.,  0.],
#           [ 0., 21.,  0., 23.],
#           [ 0.,  0.,  0.,  0.],
#           [ 0., 29.,  0., 31.]]]])

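to understand what the indices returned by an nn.MaxPool2d layer represent and how you could create “artificial indices”. Each index is the flattened position (row * width + column) of the maximum within its own channel plane of the input, which is why both samples in the batch return the same indices.
I also don’t fully understand your use case if you want to create these indices randomly.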
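
Since the indices are just flat positions in the unpooled plane, one way to build them without an encoder is to choose a fixed location inside every window. Below is a minimal sketch of that idea, assuming a square kernel with stride equal to the kernel size; the helper name make_unpool_indices and its offset parameter are made up for illustration and are not part of PyTorch. It works on the whole batch at once.

```python
import torch
import torch.nn as nn

def make_unpool_indices(x, kernel_size=2, offset=(0, 0)):
    """Hypothetical helper: flat indices that place every value of x at a
    fixed (row, col) offset inside its kernel_size x kernel_size window."""
    n, c, h, w = x.shape
    out_w = w * kernel_size  # width of the unpooled plane
    rows = torch.arange(h, device=x.device) * kernel_size + offset[0]
    cols = torch.arange(w, device=x.device) * kernel_size + offset[1]
    idx = rows[:, None] * out_w + cols[None, :]  # shape (h, w), dtype int64
    # Same pattern for every sample and channel, so this is batchwise.
    return idx.expand(n, c, h, w).contiguous()

x = torch.arange(2 * 4, dtype=torch.float32).view(2, 1, 2, 2)
idx = make_unpool_indices(x, kernel_size=2, offset=(1, 1))
y = nn.MaxUnpool2d(2)(x, idx)
print(idx[0, 0])
print(y.shape)
```

With offset=(1, 1) and a 2x2 input this reproduces the index pattern from the example above (5, 7, 13, 15). Whether a fixed pattern is useful for reconstruction is a separate question, since the positions then carry no information about the input.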

Thanks. That helps.

For the use case, I am working with an asymmetric autoencoder. Specifically, the encoder is a pretrained black box (of unknown architecture) that generates a latent vector for each input image. I need to design a decoder that converts the latent vector into a reconstructed image. As far as I know, nn.MaxUnpool2d and nn.ConvTranspose2d can be used in the decoder design.

However, if there are better solutions (papers, tutorials, etc.) that avoid constructing an artificial idx, please let me know.
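
For reference, one index-free option is to upsample with strided nn.ConvTranspose2d layers only. The following is a rough sketch, not a recommended architecture: the class name LatentDecoder, the latent size of 128, the channel widths, and the 3x32x32 output are all assumptions chosen for illustration, since the actual latent and image sizes are not given in this thread.

```python
import torch
import torch.nn as nn

class LatentDecoder(nn.Module):
    """Hypothetical decoder: latent vector -> 3x32x32 image, no pooling indices."""

    def __init__(self, latent_dim=128):
        super().__init__()
        # Project the latent vector to a small 64x4x4 feature map.
        self.fc = nn.Linear(latent_dim, 64 * 4 * 4)
        # kernel_size=4, stride=2, padding=1 doubles the spatial size each time.
        self.net = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),  # 8x8
            nn.ReLU(),
            nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2, padding=1),  # 16x16
            nn.ReLU(),
            nn.ConvTranspose2d(16, 3, kernel_size=4, stride=2, padding=1),   # 32x32
            nn.Sigmoid(),  # assumes image values scaled to [0, 1]
        )

    def forward(self, z):
        h = self.fc(z).view(z.size(0), 64, 4, 4)
        return self.net(h)

decoder = LatentDecoder(latent_dim=128)
z = torch.randn(5, 128)  # stand-in for the black-box encoder's output
recon = decoder(z)
print(recon.shape)
```

The learned transposed convolutions take over the role that unpooling would play, so no idx tensor has to be constructed.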