When I did that with an image input to Faster R-CNN, the result was None, but when I removed the multiplication, it seems to be working fine. Is that a ResNet thing?
I was referring to the same lines of code as you.
Obviously the model takes inputs between 0 and 1.
That’s in contrast to the VGG16 model, which takes inputs between 0 and 255.
vgg16 takes in a normalized input, as do all classification models:
All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224. The images have to be loaded into a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]. You can use the following transform to normalize:
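For reference, `transforms.Normalize` just applies `(x - mean) / std` per channel. A minimal numpy sketch of the same arithmetic, using a random array as a stand-in for a real image:

```python
import numpy as np

# stand-in for an image tensor after ToTensor: values in [0, 1), shape (3, H, W)
rng = np.random.default_rng(0)
img = rng.random((3, 224, 224), dtype=np.float32)

# per-channel ImageNet statistics quoted in the torchvision docs
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(3, 1, 1)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(3, 1, 1)

# transforms.Normalize computes (x - mean) / std channel-wise
normalized = (img - mean) / std
```

Note that after this step the values are no longer confined to [0, 1]; pixels darker than the channel mean become negative.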
OK, I’m at a loss. I tried FCN8s; without multiplying by 255 I get all 0s, with 255 I get a good solution.
import numpy as np
import torch
import cv2
from PIL import Image as PILImage
from matplotlib import pyplot as plt
from torchvision import transforms

fcn8s = fcn_models.FCN8s(n_class=len(pascal_object_categories))
fcn8s.load_state_dict(torch.load("fcn8s_from_caffe.pth"))
fcn8s.eval()  # disable dropout for inference
if device == torch.device("cuda"):
    fcn8s = fcn8s.to(device)
print(fcn8s)

# evaluate the pretrained FCN8s model on one image
def deploy_fcn_model(im_path):
    im = PILImage.open(im_path)
    img = np.array(im)
    # these mean values are for RGB
    t_ = transforms.Compose([
        transforms.ToPILImage(),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.457, 0.407],
                             std=[1, 1, 1])])
    # multiply by 255 for the network input
    img = 255 * t_(img)
    img.unsqueeze_(0)  # add the batch dimension
    if device == torch.device("cuda"):
        img = img.to(device)
    # get the output from the model
    output = fcn8s(img)
    # take the per-pixel argmax, drop the batch dimension,
    # move off the GPU and convert to numpy
    out = output.argmax(1).squeeze_(0).detach().cpu().numpy()
    plt.imshow(out)
    plt.show()
    # load the image with OpenCV for the overlay
    bgr_img = cv2.imread(im_path)
    # convert FCN8s pixelwise predictions to a color array
    color_array = np.zeros([out.shape[0], out.shape[1], 3], dtype=np.uint8)
    for cls_id in np.unique(out):
        print(cls_id)
        if cls_id == 8:
            color_array[out == cls_id] = [255, 0, 0]
        elif cls_id == 12:
            color_array[out == cls_id] = [0, 255, 0]
    # overlay the prediction mask on the image
    added_image = cv2.addWeighted(bgr_img, 0.5, color_array, 0.6, 0)
    # plot
    plt.imshow(added_image)
    plt.show()

deploy_fcn_model("dogcat1.jpg")
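If that FCN8s checkpoint was ported from Caffe, the multiplication by 255 would make sense: `Normalize` with `std=[1, 1, 1]` followed by `* 255` is algebraically the same as taking raw 0-255 pixel values and subtracting per-channel means on the 0-255 scale, which is the usual Caffe-style preprocessing. A quick numpy sanity check (that these particular means match the checkpoint's training preprocessing is an assumption on my part):

```python
import numpy as np

mean = np.array([0.485, 0.457, 0.407])            # the means used in the transform above
x01 = np.random.default_rng(1).random((3, 4, 4))  # fake image scaled to [0, 1]

# what Normalize(mean, std=[1,1,1]) followed by *255 computes
lhs = 255 * (x01 - mean.reshape(3, 1, 1))

# equivalent: raw 0-255 pixels minus means on the 0-255 scale
rhs = 255 * x01 - (255 * mean).reshape(3, 1, 1)
```

Here `255 * mean` comes out to roughly [123.7, 116.5, 103.8], i.e. ImageNet channel means in raw pixel units, so the network is effectively fed mean-subtracted 0-255 inputs.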
No, it’s common to normalize the inputs for a lot of machine learning models, as this might accelerate and stabilize the training.
Some methods, e.g. random forest classifiers, are not sensitive to the input range, while others, e.g. neural networks, are.
You would have to check the dataset creation (or just get a single sample) and check the range of the inputs the model was trained on.
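One way to do that check is to grab a single sample and look at its min/max. A small helper with rough heuristics (the thresholds and the function itself are just illustrative, not part of any library):

```python
import numpy as np

def report_range(sample):
    """Print the min/max of a sample to help infer the expected input scale."""
    arr = np.asarray(sample, dtype=np.float32)
    lo, hi = float(arr.min()), float(arr.max())
    if hi <= 1.0 and lo >= 0.0:
        scale = "[0, 1] (ToTensor-style)"
    elif hi <= 255.0 and lo >= 0.0:
        scale = "[0, 255] (raw pixel values)"
    else:
        scale = "normalized (mean-subtracted)"
    print(f"min={lo:.3f} max={hi:.3f} -> looks like {scale}")
    return scale
```

For example, calling it on `dataset[0][0]` (assuming the dataset yields `(image, target)` pairs) would tell you whether the model was trained on [0, 1], [0, 255], or already-normalized inputs.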