I would like to subtract the mean pixel from my loaded image:
import torch
from PIL import Image
import torchvision.transforms as transforms

image = Image.open(image_name)
loader = transforms.Compose([transforms.Resize(image_size), transforms.ToTensor()])  # resize and convert to tensor
image = loader(image)
I assume I can use transforms.Normalize(mean, std), like this?

transforms.Normalize(mean=[103.939, 116.779, 123.68])
I am trying to replicate the image processing steps from this code: https://github.com/jcjohnson/neural-style/blob/master/neural_style.lua#L416-L437
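As far as I can tell, the linked Lua code scales the image to the 0-255 range, swaps RGB to BGR, and subtracts the Caffe mean pixel. In PyTorch terms that would be roughly the following (my own sketch of those steps, not code from the repo):

def caffe_preprocess(img):
    # img: FloatTensor of shape [3, h, w] with values in [0, 1]
    mean_pixel = torch.FloatTensor([103.939, 116.779, 123.68]).view(3, 1, 1)
    img = img[[2, 1, 0]] * 255.0  # RGB -> BGR, scale to 0-255
    return img - mean_pixel       # subtract the mean pixel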
transforms.Normalize is applied on torch.Tensors.
Usually you would have something like this:
transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize(250),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])
Since ToTensor() scales the Tensor to [0, 1], you should use “normalized” mean and stddev values.
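For example, dividing the 0-255 mean pixel values from the neural-style code by 255 gives values in the range Normalize expects after ToTensor(). A minimal sketch (std=[1, 1, 1] is an assumption here that leaves the scale untouched, so only the mean is subtracted):

caffe_mean = [103.939, 116.779, 123.68]        # mean pixel in the 0-255 range
scaled_mean = [m / 255.0 for m in caffe_mean]  # -> roughly [0.4076, 0.4580, 0.4850]
normalize = transforms.Normalize(mean=scaled_mean, std=[1.0, 1.0, 1.0])  # std of 1: subtract the mean only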
@ptrblck Is there a way to make ToTensor() scale to 0-255 instead of 0-1? Or would I manually have to convert to a [0, 255] Tensor?
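One way to do the manual conversion would be a Lambda transform that scales the result of ToTensor() back up, something like this sketch:

loader = transforms.Compose([
    transforms.Resize(image_size),
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x * 255),  # undo the 0-1 scaling
])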
Edit: I think this might work
def preprocess(img):
    mean_pixel = torch.DoubleTensor([103.939, 116.779, 123.68])
    img = torch.FloatTensor(img)
    # fill an image-sized tensor with the mean pixel (copy_ broadcasts the 3 values)
    mean_pixel_image = torch.Tensor()
    mean_pixel_image.resize_as_(img).copy_(mean_pixel)
    mean_pixel_image = mean_pixel_image.float()
    # subtract the mean pixel from every pixel
    img = img - mean_pixel_image
    return img
image = spi.imread(params.image, mode="RGB").astype(float)
image = imresize(image, params.image_size, interp='bilinear')
image = preprocess(image)
But that results in this error when I try to run my net forward with the image:
net.updateOutput(image)
  File "/usr/local/lib/python2.7/dist-packages/torch/legacy/nn/Sequential.py", line 36, in updateOutput
    currentOutput = module.updateOutput(currentOutput)
  File "/usr/local/lib/python2.7/dist-packages/torch/legacy/nn/SpatialConvolution.py", line 84, in updateOutput
    self._viewWeight()
  File "/usr/local/lib/python2.7/dist-packages/torch/legacy/nn/SpatialConvolution.py", line 75, in _viewWeight
    self.gradWeight = self.gradWeight.view(self.nOutputPlane, self.nInputPlane * self.kH * self.kW)
RuntimeError: invalid argument 2: size '[64 x 27]' is invalid for input with 0 elements at /home/ubuntu/pytorch/aten/src/TH/THStorage.c:41
Your code looks OK. You don’t need to resize the mean_pixel, since the Tensor will be broadcast.
image_tensor = torch.from_numpy(image.astype(np.float32))
image_tensor = image_tensor - mean_pixel.float().view(1, 1, -1)  # broadcast the mean over height and width
image_tensor = image_tensor.permute(2, 0, 1)  # move channels to dimension 0
I assume spi.imread is scipy’s imread function, so that image will have the shape [h, w, c].
You have to permute your Tensor so that the channel dimension is dimension 0 before feeding it to your model.
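Putting the pieces together, a minimal sketch of the full preprocessing (the unsqueeze for a batch dimension is an assumption, depending on what your model expects):

import numpy as np
import torch

def preprocess(image):
    # image: numpy array of shape [h, w, c] with values in the 0-255 range
    mean_pixel = torch.FloatTensor([103.939, 116.779, 123.68])
    image_tensor = torch.from_numpy(image.astype(np.float32))
    image_tensor = image_tensor - mean_pixel.view(1, 1, -1)  # broadcast over h and w
    image_tensor = image_tensor.permute(2, 0, 1)             # [c, h, w]
    return image_tensor.unsqueeze(0)                         # [1, c, h, w]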