Custom upscaling of an image through a pretrained decoder network

Hello! So this is a bit of a weird one…

I have a (b, c, 7, 7) tensor. For each “pixel” in this tensor I am going to pass its (c, 1, 1) channel vector through a pretrained decoder to produce a (1, 4, 4) patch. In this way I intend to turn the (batch, channel, 7, 7) tensor into a (batch, 1, 28, 28) image. The problem is that I cannot figure out how to do this upscaling through my custom function (the pretrained decoder).

Perhaps an image will help: [diagram: each (c, 1, 1) pixel vector is decoded into a (1, 4, 4) patch, and the patches tile the (1, 28, 28) output]

To be clear, I already have the trained function/decoder; I can feed the pixels into it one by one and get the resulting 4×4 patches. What I am having trouble with is folding the resulting patches back into the larger image shape (perhaps some sort of nested fold/unfold?). If you could help out or simply point me in the right direction, I would be very grateful.

Thanks in advance!

#%%
import torch

b = 1
c = 6
x_in = 7
y_in = 7

# placeholder input tensor
x = torch.rand((b, c, x_in, y_in), requires_grad=False)  # (1, 6, 7, 7)
x = x.view((b * x_in * y_in, c))                         # (49, 6)
new = decoder(x, outputDict)                             # my pretrained decoder
print(new.shape)                                         # (49, 1, 4, 4)
new = new.view((b, 1, x_in * 4, y_in * 4))
print(new.shape)                                         # (1, 1, 28, 28)

This is where I am at now, though I am not sure whether these reshapes will keep the pixels in their respective places…
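For comparison, the slow per-pixel version I am trying to vectorize would look roughly like this (just a sketch, starting from the original (b, c, 7, 7) tensor before the view above, and assuming the decoder also accepts a (b, c) batch):

# Reference implementation: decode each pixel separately and paste its
# 4x4 patch into the matching spot of the output image.
out = torch.zeros(b, 1, x_in * 4, y_in * 4)
for i in range(x_in):
    for j in range(y_in):
        pixel = x[:, :, i, j]                      # (b, c) channel vector at (i, j)
        patch = decoder(pixel, outputDict)         # (b, 1, 4, 4)
        out[:, :, i * 4:(i + 1) * 4, j * 4:(j + 1) * 4] = patch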

The first view operation might interleave the values, so you should permute the channel dimension to the last dimension first:

x = torch.rand((b, c, x_in, y_in), requires_grad=False)
# permute makes the tensor non-contiguous, so use reshape (or .contiguous().view())
x = x.permute(0, 2, 3, 1).reshape(b * x_in * y_in, c)

I’m not sure how the output is calculated, so it’s hard to tell if the last view is working correctly.
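If it helps, here is a rough sketch of how I would fold the patches back so that each one lands where its source pixel was (untested, and assuming the decoder maps an (n, c) batch to (n, 1, 4, 4) as in your code):

x = torch.rand(b, c, x_in, y_in)                           # the (b, c, 7, 7) feature map
flat = x.permute(0, 2, 3, 1).reshape(b * x_in * y_in, c)   # (49, 6), one row per pixel
patches = decoder(flat, outputDict)                        # (49, 1, 4, 4)

# Split the batch back into the 7x7 grid, move each patch dim next to its
# grid dim, then merge: (b, 1, 7, 4, 7, 4) -> (b, 1, 28, 28)
out = (patches.view(b, x_in, y_in, 1, 4, 4)
              .permute(0, 3, 1, 4, 2, 5)
              .reshape(b, 1, x_in * 4, y_in * 4))

You could sanity-check the placement by comparing the result against the per-pixel loop from the original post.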
