Inference with alexnet and cv::imread image

I am trying to run an image-classification task from a C++ application using a pretrained AlexNet. I have successfully classified a dog image by loading the net in Python:

import torch, torchvision
from torchvision import transforms
from PIL import Image

alexnet = torchvision.models.alexnet(pretrained=True)
alexnet.eval()
img = Image.open("dog.jpg")
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225])])
img_t = transform(img)
batch_t = torch.unsqueeze(img_t, 0)
out = alexnet(batch_t)
_, index = torch.max(out, 1)

Index is 208, Labrador_retriever, which looks good.
Then I save the net to load it from a C++ application:

example = torch.rand(1, 3, 224, 224)
traced_script_module_alex = torch.jit.trace(alexnet, example)
traced_script_module_alex.save("")
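A cheap safeguard before saving is to check that the traced module reproduces the eager model's output on fresh input. A minimal sketch (the tiny stand-in network is mine; in the real script this would be the pretrained alexnet):

```python
import torch
from torch import nn

# Tiny stand-in network so the sketch runs quickly;
# substitute the pretrained alexnet from the post.
model = nn.Sequential(nn.Conv2d(3, 4, 3), nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(), nn.Linear(4, 10))
model.eval()

example = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example)

# The traced graph should agree with the eager model on input it was not traced on.
check = torch.rand(1, 3, 224, 224)
print(torch.allclose(model(check), traced(check), atol=1e-5))
```

If this prints False, the wrong module (or a module in training mode) is about to be saved.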

When I load it in C++, I get the wrong result:

cv::Mat img = cv::imread("dog.jpg");
cv::resize(img, img, cv::Size(224, 224), 0, 0, cv::INTER_CUBIC);

// Convert the image to a tensor (OpenCV gives NHWC, byte data).
torch::Tensor img_tensor = torch::from_blob(img.data, { 1, img.rows, img.cols, 3 }, torch::kByte);
img_tensor = img_tensor.permute({ 0, 3, 1, 2 }); // NHWC -> NCHW
img_tensor = img_tensor.to(torch::kFloat);

std::vector<torch::jit::IValue> input;
input.push_back(img_tensor);
torch::jit::script::Module module = torch::jit::load("");
at::Tensor output = module.forward(input).toTensor();
std::cout << output.argmax(1) << '\n';

The argmax is 463, bucket.
I think the normalization is wrong: how can I normalize the image to get the same [0, 1] range the transform produces in the Python version? Dividing by 255 gives the same result…

ToTensor returns a tensor with values in the range [0, 1], while transforms.Normalize will subtract the mean and divide by the standard deviation.
Note that OpenCV reads the image channels as BGR, while PIL uses RGB, so you might need to convert them as well.
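Concretely, replicating ToTensor + Normalize on an OpenCV-style image means: BGR → RGB, scale to [0, 1], then per-channel (x - mean) / std, then HWC → CHW. A minimal NumPy sketch of that pipeline (the 2×2 toy image is mine):

```python
import numpy as np

# Fake 2x2 BGR uint8 image standing in for cv::imread's output.
bgr = np.array([[[10, 20, 30], [40, 50, 60]],
                [[70, 80, 90], [100, 110, 120]]], dtype=np.uint8)

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

rgb = bgr[..., ::-1]                # BGR -> RGB (what PIL would have produced)
x = rgb.astype(np.float32) / 255.0  # ToTensor's [0, 1] scaling
x = (x - mean) / std                # Normalize: per-channel (x - mean) / std
x = x.transpose(2, 0, 1)            # HWC -> CHW, matching img_t in the Python script
```

The same three steps translate to LibTorch as flipping the channel order, dividing by 255, and subtracting/dividing per-channel mean/std tensors before the permute.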

Shame on me :hot_face:: I had saved traced_script_module (a previously trained resnet18 model in the same notebook) instead of traced_script_module_alex. Now it works…!