I have trained an autoencoder and the training results seem to be okay.
But when I run the model on a single image, the generated results are inconsistent with what I see during training.
Any ideas on how I can run the autoencoder on a single example?
The autoencoder model in my case accepts an input of dimension (256x256+3, 1): the flattened 256x256 image followed by a 3-element coordinate vector.
My evaluation code is as follows:
    img_dir='blender_files/Model_0/image_0/model_0_0.jpg'
    img=cv2.resize(cv2.imread(img_dir,0),(256,256))  # the image of dimension 256x256
    img=torch.from_numpy(img)
    #print(img)
    coord=np.array([0,0,0])  # 3x1 vector which I need to predict using the autoencoder
    coord=torch.from_numpy(coord)
    img=torch.unsqueeze(torch.flatten(img),1)
    #print(img)
    coord=torch.unsqueeze(torch.flatten(coord),1)
    X=torch.cat((img,coord),dim=0)  # the input feature which I am feeding to the model, whose size is torch.Size()

    # Feed feature vector to model
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    features=X.to(device)
    features=torch.flatten(features)
    print(features.shape)

    model=AutoEncoder(num_features=features.shape,num_hidden_1=num_hidden_1,num_hidden_2=num_hidden_2,num_hidden_3=num_hidden_3)
    model=model.double()
    model.load_state_dict(torch.load('blender_files/Model_0/image_0/saved_model_img.pth'))
    model.eval()
    torch.manual_seed(123)
    with torch.no_grad():
        decoded_rep=model(features[None,...])
        print(decoded_rep)
The output I am getting is as follows:
tensor([[3.2402e+16, 3.4111e+16, 3.2839e+16, ..., 4.4640e+16, 4.4089e+16, 7.7656e+15]], dtype=torch.float64)
But the decoded output obtained during training is:
[ 67.5205, 67.6745, 67.6265, ..., 124.9578, 124.8637, 4.7602]
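As you can see, the two outputs are not even close to one another: the single-image output is on the order of 1e16, while the training output stays in the range of ordinary pixel values.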
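One thing that may be worth ruling out is a mismatch between how the single example is built and how the training batches were built (dtype, value range, length). The sketch below is only an illustration under assumptions: `build_features` and `describe` are hypothetical helper names, the random array stands in for the resized `cv2` image, and the training pipeline is not shown in this post, so the actual scaling used there is unknown.

```python
import numpy as np

IMG_SIDE = 256    # image side length, taken from the post
NUM_COORDS = 3    # trailing values the model is meant to predict


def build_features(img, coord):
    """Flatten a grayscale image and append the coordinate vector.

    Returns a 1-D float64 array of length IMG_SIDE * IMG_SIDE + NUM_COORDS.
    Both parts are cast to float64 so they match a model converted with .double().
    """
    img = np.asarray(img, dtype=np.float64)
    coord = np.asarray(coord, dtype=np.float64)
    if img.shape != (IMG_SIDE, IMG_SIDE):
        raise ValueError(f"expected a {IMG_SIDE}x{IMG_SIDE} image, got {img.shape}")
    if coord.shape != (NUM_COORDS,):
        raise ValueError(f"expected {NUM_COORDS} coordinates, got {coord.shape}")
    return np.concatenate([img.ravel(), coord])


def describe(name, features):
    """Print summary statistics so two inputs can be compared at a glance."""
    print(
        f"{name}: shape={features.shape} dtype={features.dtype} "
        f"min={features.min():.3f} max={features.max():.3f} "
        f"mean={features.mean():.3f}"
    )


# Stand-in for the resized cv2 image (uint8 values in 0..255); an assumption,
# used here only so the sketch runs without the original file.
dummy_img = np.random.default_rng(0).integers(
    0, 256, size=(IMG_SIDE, IMG_SIDE), dtype=np.uint8
)
features = build_features(dummy_img, [0, 0, 0])
describe("single example", features)
```

Running `describe` on one sample taken from the training loader and on the single example should give the same shape and dtype and a similar value range. The array can then be turned into a tensor with `torch.from_numpy(features)`, which keeps the float64 dtype, and its integer length `features.shape[0]` is available if the model constructor expects a number of features rather than a shape object.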
I was also reading about vanilla autoencoders, and it seems they are not really a good choice for generation purposes. What other models could I use instead?
Since in my case I am interested in predicting the last 3 values of the feature vector, would an autoregressive model suit my case better?
The following quote also adds weight to this idea:
One common application done with autoregressive models is auto-completing an image. As autoregressive models predict pixels one by one, we can set the first N pixels to predefined values and check how the model completes the image
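To make the quoted idea concrete for this case: the 256x256 pixel values would play the role of the fixed prefix, and the model would generate the remaining 3 values one at a time, each conditioned on everything before it. The sketch below shows only that control flow. `complete_sequence` and `mean_of_last_two` are hypothetical names, and the toy predictor is a stand-in for a trained autoregressive model, not a real one.

```python
def complete_sequence(prefix, predict_next, total_len):
    """Keep the known prefix fixed and generate the rest one value at a time.

    predict_next receives everything generated so far and returns the next value.
    """
    values = list(prefix)
    while len(values) < total_len:
        values.append(predict_next(values))
    return values


def mean_of_last_two(values):
    """Toy stand-in for a trained model: average of the two most recent values."""
    recent = values[-2:]
    return sum(recent) / len(recent)


# A two-value prefix completed to length 5 with the toy predictor.
completed = complete_sequence([1.0, 3.0], mean_of_last_two, total_len=5)
print(completed)  # [1.0, 3.0, 2.0, 2.5, 2.25]
```

In the setting described above, `prefix` would be the flattened image and `total_len` would be `256 * 256 + 3`, so the loop would run exactly three times.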