Same preprocessing gives different results in Python and C++

Hello, communities!

I have a preprocessing script:

import cv2
import numpy as np
import torch


class BaseTransform(object):
    """Defines the transformations that should be applied to test PIL image
        for input into the network

    dimension -> tensorize -> color adj

    Arguments:
        resize (int): input dimension to SSD
        rgb_means ((int,int,int)): average RGB of the dataset
            (104,117,123)
        swap ((int,int,int)): final order of channels
    Returns:
        transform (transform) : callable transform to be applied to test/val
        data
    """
    def __init__(self, resize, rgb_means, swap=(2, 0, 1)):
        self.means = rgb_means
        self.resize = resize
        self.swap = swap

    # assume input is cv2 img for now
    def __call__(self, img):

        interp_methods = [cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA,
                          cv2.INTER_NEAREST, cv2.INTER_LANCZOS4]
        interp_method = interp_methods[0]
        img = cv2.resize(np.array(img), (self.resize, self.resize),
                         interpolation=interp_method).astype(np.float32)
        img -= self.means
        img = img.transpose(self.swap)
        return torch.from_numpy(img)

This preprocesses an image loaded with cv2.imread() and returns a tensor that is fed into the model.

And I did the same thing in C++:

void Detector::detect() {
  cv::Mat in_img;
  preprocess(in_img, frame);

  at::Tensor tensor_image = torch::from_blob(in_img.data, {1, in_img.rows, in_img.cols, 3}, torch::kByte);
  tensor_image = tensor_image.permute({0, 3, 1, 2}).to(torch::kFloat32).to(device_type);
  cout << "tensor_image: " << tensor_image << endl;
}

// method
void Detector::preprocess(cv::Mat &image, const cv::Mat &input_image) {
  cv::resize(input_image, image, cv::Size(image_size_, image_size_));
  cv::subtract(image, cv::Scalar(103.94, 116.78, 123.68), image);
}

As you can see, the operations are exactly the same, but I get really different results! Same model, same image input.

Could anybody help out? I am stuck here, because I cannot see what's wrong with my C++ preprocessing…

When you subtract the mean in your C++ approach, is the image converted to a float32 image, or is it still a uint8 image and thus clipped to [0, 255]?
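
A minimal sketch of that change, assuming the rest of the Detector class stays as posted: convert the cv::Mat to CV_32FC3 before the subtraction so the negative values survive instead of being saturated, and then read the buffer as float32 instead of bytes:

// Sketch (assumes image_size_ and device_type as in the original post):
// convert to float BEFORE subtracting the mean, so cv::subtract does not
// saturate the result into [0, 255].
void Detector::preprocess(cv::Mat &image, const cv::Mat &input_image) {
  cv::resize(input_image, image, cv::Size(image_size_, image_size_));
  image.convertTo(image, CV_32FC3);  // uint8 -> float32 first
  cv::subtract(image, cv::Scalar(103.94, 116.78, 123.68), image);
}

// The buffer now holds floats, so from_blob must read kFloat32, not kByte:
at::Tensor tensor_image = torch::from_blob(in_img.data, {1, in_img.rows, in_img.cols, 3}, torch::kFloat32);
tensor_image = tensor_image.permute({0, 3, 1, 2}).to(device_type);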

As far as I know, OpenCV loads the image in BGR, so you might want to permute the mean vector (since your comment says it’s in RGB).
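
Whichever ordering the means were computed in, a minimal sketch of lining the C++ subtraction up with the Python transform (which subtracts (104, 117, 123) directly from the BGR image returned by cv2.imread()), assuming the network was trained with that Python code:

// Sketch: subtract the same per-channel values, against the same BGR channel
// layout that cv::imread() produces, as the Python BaseTransform does.
cv::subtract(image, cv::Scalar(104.0, 117.0, 123.0), image);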

Thanks! It worked!