Hello there, I can’t seem to get my input RGB image back correctly after converting it to a tensor. It seems that all 9 smaller pictures differ slightly from each other.
Code to reproduce the problem
rgb_img = cv2.resize(rgb_img, dsize = (160,120),interpolation=cv2.INTER_CUBIC)
#(120, 160, 3)
rgb_img = np.rollaxis(rgb_img, 2)
#(3, 120, 160)
rgb_img = torch.from_numpy(rgb_img)
rgb_img = rgb_img.type(torch.IntTensor)
rgb_img = rgb_img.to(device="cpu")
rgb_img= rgb_img.unsqueeze(0)# this is just to reproduce the same batch dimension of dataloader
#torch.Size([1, 3, 120, 160])
rgb_img = rgb_img.reshape(rgb_img.shape[2], rgb_img.shape[3], rgb_img.shape[1]).cpu().numpy()
Your method is almost OK. You forgot to think about the permutation of the axes.
I am giving two methods to save your image.
import cv2
import numpy as np
import torch
import torchvision.utils as vutils
# reading image
rgb_img = cv2.imread('1.jpg') # read any image
rgb_img = cv2.resize(rgb_img, dsize=(160, 120), interpolation=cv2.INTER_CUBIC)
# saving original image
# using torchvision to save image <- first method
rgb_img_tensor = torch.from_numpy(rgb_img).float() / 255. # torchvision uses [0,1] range
rgb_img_tensor = rgb_img_tensor.unsqueeze(0).permute(0, 3, 1, 2)
rgb_img_tensor = rgb_img_tensor[:, [2, 1, 0], :, :] # cv2 reads image in BGR, reversing it to RGB
# using your method to save image <- second method
rgb_img_orig = np.rollaxis(rgb_img, 2)
rgb_img_orig = torch.from_numpy(rgb_img_orig)
rgb_img_orig = rgb_img_orig.type(torch.IntTensor)
rgb_img_orig = rgb_img_orig.unsqueeze(0)
shape = rgb_img_orig.shape
rgb_img_orig = rgb_img_orig.permute(0, 2, 3, 1).reshape(shape[2], shape[3], shape[1]).cpu().numpy() # <- permute axes here
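To see why the permutation matters, here is a minimal sketch using plain NumPy on a tiny made-up array (the 2x3 "image" and the variable names are only for illustration, not part of the original code). `reshape` keeps the flat memory order and only changes how it is sliced, so going from channels-first straight to (H, W, C) scrambles the pixels. Moving the channel axis to the end first restores the original layout.

```python
import numpy as np

# Hypothetical tiny "image": height 2, width 3, 3 channels (HWC layout).
hwc = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# Channels-first copy with a batch dimension, like a dataloader batch: (1, 3, 2, 3).
nchw = np.ascontiguousarray(np.rollaxis(hwc, 2))[None]
n, c, h, w = nchw.shape

# Reshape alone: the flat order is unchanged, so values land in the wrong pixels.
wrong = nchw.reshape(h, w, c)

# Move channels to the last axis first, then drop the batch dimension.
right = nchw.transpose(0, 2, 3, 1).reshape(h, w, c)

print(np.array_equal(wrong, hwc))  # False
print(np.array_equal(right, hwc))  # True
```

The same reasoning applies to `torch.Tensor.permute` followed by `reshape` in the code above.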
Worked like a charm =) Thanks!
At some point I resorted to transforming it back to a PIL Image, which was not the prettiest way. Here it is anyway x.x
import torch
import torchvision
from PIL import Image
import torchvision.transforms.functional as TF
rgb_img = TF.to_tensor(rgb_img) # rescale it 0<..<1
rgb_img = rgb_img.type(torch.uint8).squeeze(0)
#rgb_img = rgb_img.type(torch.FloatTensor)
img = torchvision.transforms.ToPILImage()(rgb_img)
im = Image.fromarray(img)
On another note, is it a bad idea to train a one-class detection model on raw RGB values, or should I follow the good practice of scaling and normalizing my tensors?
Glad it worked!
I am not sure what you mean by scaling and normalization. If you are asking about scaling the values to the [0, 1] range by dividing the RGB values by 255 (the maximum pixel value), then yes, you should use scaling. It will help the model converge faster.
I generally don’t use normalization very often, as it tends to hurt color consistency in image generation tasks (I work on image inpainting). But I have seen some strong arguments for normalization in classification tasks. Most of the normalization values come from ImageNet and work well for most images.
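For reference, a minimal sketch of both steps in plain NumPy. It assumes an RGB (not BGR) uint8 image in HWC layout, and uses the per-channel mean and std commonly quoted for ImageNet. The function names are made up for this example, and whether normalization helps your model is something to check experimentally.

```python
import numpy as np

# Commonly used ImageNet per-channel statistics (RGB order, for inputs in [0, 1]).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def scale_and_normalize(rgb_uint8):
    """HWC uint8 RGB image -> CHW float32 array, scaled to [0, 1] then normalized."""
    scaled = rgb_uint8.astype(np.float32) / 255.0        # scaling to [0, 1]
    normalized = (scaled - IMAGENET_MEAN) / IMAGENET_STD  # per-channel normalization
    return np.transpose(normalized, (2, 0, 1))            # HWC -> CHW


def denormalize(chw):
    """Inverse of scale_and_normalize: CHW float32 -> HWC uint8."""
    hwc = np.transpose(chw, (1, 2, 0))
    scaled = hwc * IMAGENET_STD + IMAGENET_MEAN
    return np.clip(np.rint(scaled * 255.0), 0, 255).astype(np.uint8)


# Stand-in for a resized 120x160 RGB image.
dummy = np.zeros((120, 160, 3), dtype=np.uint8)
x = scale_and_normalize(dummy)
print(x.shape)  # (3, 120, 160)
```

Keeping the inverse function around makes it easy to view model inputs or outputs as ordinary images again.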
I would suggest going through some well-known classification architectures to get some pointers on input processing.
Maybe, you can start from
Thanks, I will make sure to train my training abilities x) I will check out the ImageNet norm values and try them out, see if I get better results for my nnet.