How to run an autoencoder on a single image/sample for inference

sparshgarg23 · January 18, 2022, 3:02pm

I have trained an autoencoder, and the training results seem okay.
But when I run the model on a single image, the generated results are inconsistent with what I saw during training.
Any ideas on how I can run the autoencoder on a single example?
In my case the autoencoder accepts an input of dimension (256x256+3, 1), i.e. a flattened 256x256 image followed by 3 extra values.
My evaluation code is as follows:

import cv2
import numpy as np
import torch

img_dir='blender_files/Model_0/image_0/model_0_0.jpg'
img=cv2.resize(cv2.imread(img_dir,0),(256,256)) # grayscale image resized to 256x256
img=torch.from_numpy(img)
#print(img)
coord=np.array([0,0,0]) # 3x1 vector which I need to predict using the autoencoder
coord=torch.from_numpy(coord)
img=torch.unsqueeze(torch.flatten(img),1) # shape (65536, 1)
#print(img)
coord=torch.unsqueeze(torch.flatten(coord),1) # shape (3, 1)
# Input feature fed to the model, shape torch.Size([65539, 1]); cast to double to match model.double()
X=torch.cat((img,coord),dim=0).double()



# Feed the feature vector to the model
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
features=torch.flatten(X).double().to(device) # 1-D tensor of shape (65539,), same dtype as the model
print(features.shape)
# AutoEncoder and num_hidden_1..3 are defined elsewhere (not shown here)
model=AutoEncoder(num_features=features.shape[0],num_hidden_1=num_hidden_1,num_hidden_2=num_hidden_2,num_hidden_3=num_hidden_3)
model=model.double()
model.load_state_dict(torch.load('blender_files/Model_0/image_0/saved_model_img.pth',map_location=device))
model=model.to(device) # keep the model on the same device as the input
model.eval()
torch.manual_seed(123)
with torch.no_grad():
    decoded_rep=model(features[None,...]) # add a batch dimension: (1, 65539)
print(decoded_rep)

The output I am getting is as follows:

tensor([[3.2402e+16, 3.4111e+16, 3.2839e+16,  ..., 4.4640e+16, 4.4089e+16,
         7.7656e+15]], dtype=torch.float64)

But the decoded output obtained during training is:

[ 67.5205,  67.6745,  67.6265,  ..., 124.9578, 124.8637,   4.7602]

As you can see, the two outputs are not even close to one another.
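
One thing that may be worth checking is whether the single sample goes through exactly the same shape, dtype, and device handling as the training batches. A gap this large (around 1e16 versus roughly 67) could come from such a mismatch, although I have not confirmed that this is the cause. The sketch below is only an illustration under assumptions: `TinyAutoEncoder`, `build_features`, and `run_single` are hypothetical names, the model is a small stand-in rather than my actual `AutoEncoder`, and a random 16x16 array replaces the real 256x256 image.

```python
import torch
import torch.nn as nn


# Hypothetical stand-in model; the real AutoEncoder and its layer sizes differ.
class TinyAutoEncoder(nn.Module):
    def __init__(self, num_features, num_hidden):
        super().__init__()
        self.encoder = nn.Linear(num_features, num_hidden)
        self.decoder = nn.Linear(num_hidden, num_features)

    def forward(self, x):
        return self.decoder(torch.relu(self.encoder(x)))


def build_features(img, coord):
    """Flatten a 2-D image and a 3-vector into one 1-D float64 tensor."""
    img = torch.as_tensor(img, dtype=torch.float64).flatten()
    coord = torch.as_tensor(coord, dtype=torch.float64).flatten()
    return torch.cat((img, coord), dim=0)


def run_single(model, features, device):
    """Run one sample through the model using a batch dimension of 1."""
    model = model.to(device).double().eval()
    with torch.no_grad():
        out = model(features.to(device).unsqueeze(0))  # shape (1, num_features)
    return out.squeeze(0).cpu()


torch.manual_seed(0)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Small fake grayscale image (assumption: stands in for the 256x256 input).
img = torch.randint(0, 256, (16, 16), dtype=torch.uint8)
coord = [0, 0, 0]

features = build_features(img, coord)
model = TinyAutoEncoder(num_features=features.shape[0], num_hidden=8)
decoded = run_single(model, features, device)

print(features.shape, decoded.shape, decoded.dtype)
predicted_coord = decoded[-3:]  # the last 3 entries are the coordinate slots
```

If the training pipeline scaled or normalized the inputs in any way, the same step would also have to be applied inside `build_features` before the comparison with the training output means anything.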
I was looking at vanilla autoencoders, and it seems they are not really a good choice for generation purposes. What other models could I use instead?
Since in my case I am interested in predicting the last 3 values of the feature vector, would an autoregressive model suit my case better?

The following passage from the tutorial linked below also seems to support this idea:
One common application done with autoregressive models is auto-completing an image. As autoregressive models predict pixels one by one, we can set the first N pixels to predefined values and check how the model completes the image

https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/tutorial12/Autoregressive_Image_Modeling.html
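
To make the auto-completion idea concrete, here is a toy sketch of how I understand it: the known values are fixed as a prefix, and the remaining slots are filled one at a time, with each prediction fed back in as context for the next one. `predict_next` is a hypothetical placeholder (the mean of the last few values), not a trained autoregressive model like the one in the tutorial, and `autoregressive_complete` is likewise a name made up for this illustration.

```python
from typing import Callable, List, Sequence


def predict_next(context: Sequence[float], window: int = 3) -> float:
    """Hypothetical stand-in predictor: mean of the last `window` values."""
    recent = context[-window:]
    return sum(recent) / len(recent)


def autoregressive_complete(
    prefix: Sequence[float],
    total_len: int,
    step: Callable[[Sequence[float]], float] = predict_next,
) -> List[float]:
    """Keep `prefix` fixed and generate the remaining values one by one."""
    if not prefix:
        raise ValueError("prefix must contain at least one known value")
    seq = list(prefix)
    while len(seq) < total_len:
        # Each new value depends on everything generated so far.
        seq.append(step(seq))
    return seq


known = [2.0, 4.0, 6.0]  # plays the role of the known pixels
completed = autoregressive_complete(known, total_len=6)
print(completed)
```

In my setting the prefix would be the 65536 flattened pixel values and the 3 generated entries would be the coordinates, with a learned model taking the place of `predict_next`.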