How do I change a Torchvision model’s input range from 0-1 to 0-255?
I see here:
The models expect a list of Tensor[C, H, W], in the range 0-1.
I’d like to be able to feed an image with a range of 0-255 into one of the default VGG models from Torchvision.
I want the image to remain in the 0-255 range when it’s put through the network/model. How do I change the model to support this?
The conversion is done by the ToTensor transform of torchvision.transforms.
So, if you remove it from your pre-processing pipeline, your image will be in 0-255 range.
But, if you omit it, you need to convert to Tensor yourself.
I should have worded the question better. I want to be able to use this code, as PIL/Pillow loads an image with a 0-1 range:
from PIL import Image
from torchvision import transforms
image = Image.open(image_name).convert('RGB')
Loader = transforms.Compose([transforms.Resize(image_size), transforms.ToTensor()])
Normalize = transforms.Compose([transforms.Normalize(mean=[123.68, 116.779, 103.939], std=[1,1,1])])
# ToTensor divides by 255, so multiply by 255 to get back to the 0-255 range
tensor = Normalize(Loader(image) * 255).unsqueeze(0)
I want the image to remain in the 0-255 range when it’s put through the network.
You should remove transforms.ToTensor(), which scales the image to 0-1 (see the docs).
So, there’s no issue with using a model trained on images of range 0-1, on images that have a range of 0-255? Because in testing, it looks like something is off.
Of course there is. The image pixel values should be in the same range during training and testing.
Can I change the model after training so that it accepts different pixel values?
Why doesn’t this work?
import torch
from collections import OrderedDict
model = torch.load('vgg-16.pth')
new_state_dict = OrderedDict()
for (k, v) in model.items():
    t = torch.ones(v.size()).float() * 255
    v = v.mul(t.cuda())
    new_state_dict[k] = v
If the model is trained with 0-1 input, it should be tested with 0-1 input (and with the same mean/std normalization), and the same for other input ranges. It will not work otherwise.
You cannot simply change the range of the input and multiply each parameter by the same factor to get the same results.
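To make this concrete, here is a minimal sketch using a tiny stand-in network (hypothetical, not VGG). It multiplies every parameter by 255 as in the question and compares the output with the original. It also shows one possible workaround that was not proposed in this thread: a thin wrapper, here called RescaleInput (an illustrative name), that divides the input by 255 before the unchanged model sees it.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_net():
    # Tiny stand-in for a pretrained network (hypothetical; not VGG).
    return nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))

model = make_net().eval()

x01 = torch.rand(1, 4)   # input in the range the model expects
x255 = x01 * 255.0       # the same input rescaled to 0-255

# Approach from the question: multiply every parameter by 255.
naive = make_net().eval()
naive.load_state_dict({k: v * 255.0 for k, v in model.state_dict().items()})

class RescaleInput(nn.Module):
    """Illustrative wrapper: accepts 0-255 input, feeds 0-1 to the model."""
    def __init__(self, model, scale=255.0):
        super().__init__()
        self.model = model
        self.scale = scale

    def forward(self, x):
        return self.model(x / self.scale)

with torch.no_grad():
    reference = model(x01)
    naive_out = naive(x255)
    wrapped_out = RescaleInput(model)(x255)

naive_matches = torch.allclose(reference, naive_out, atol=1e-5)
wrapped_matches = torch.allclose(reference, wrapped_out, atol=1e-5)
print("scaled parameters match:", naive_matches)
print("wrapped model matches:", wrapped_matches)
```

With the wrapper, the network itself still receives 0-1 values, so the rule about matching the training range is respected. Any mean/std normalization used during training would have to be applied inside the wrapper in the same way.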
If you want to use image tensors in the range [0, 255], then you would need to retrain the model using this data.
I would have converted the model output to [0, 255] instead. However, you’ll have to make sure that your targets / labels are also in the same range. There is no advantage to using an input in [0, 255] over [0, 1]: the model’s weights adapt to whatever input range is used during training, and the difference is only a scaling factor. Common practice, which also tends to give the best performance, is to normalize the input to [0, 1] or [-1, 1].
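For reference, the two normalizations mentioned above are plain rescalings of the 0-255 pixel values. A small sketch, with a made-up three-value tensor standing in for an image:

```python
import torch

# Hypothetical pixel values covering the bottom, middle and top of the range.
pixels = torch.tensor([0.0, 127.5, 255.0])

unit_range = pixels / 255.0            # maps [0, 255] to [0, 1]:  0.0, 0.5, 1.0
signed_range = pixels / 127.5 - 1.0    # maps [0, 255] to [-1, 1]: -1.0, 0.0, 1.0

# Going back to 0-255, e.g. for converting a model output as suggested above.
restored = unit_range * 255.0

print(unit_range, signed_range, restored)
```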