C++ output differs from Python output

sq1988 · March 28, 2019, 5:25am

PyTorch 1.0, Ubuntu 16.04. With the same model and the same transformation, I get totally different results from the Python and C++ code.
Can anyone help me out? Thanks a lot.

#========python output======#
tensor(9073.5586, device='cuda:0', grad_fn=)
#========c++ output======#
count:2.185325

I put the code on Baidu Cloud:
link: https://pan.baidu.com/s/1iQVndobbaSHERpIpJ0gSiw
code: sbdd

arlecchino · March 30, 2019, 7:52pm

My knowledge of Chinese is too limited to get to the sources.

mhubii · April 1, 2019, 7:45am

Hi @sq1988,

I can’t download your files. Can you please explain your problem in more detail here?

mhubii · April 1, 2019, 8:31am

Just as a hint, you can format code by wrapping it in triple backticks.

#include <something.h>

int main() {

    return 0;
}

Can you please also add the Python code?

sq1988 · April 1, 2019, 10:06am

#include <torch/script.h>
#include <torch/torch.h>
#include <opencv2/opencv.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/core.hpp>
#include <cassert>
#include <cstdio>
#include <iostream>
#include <memory>
#include <string>
#include <vector>
 
int main(int argc, const char *argv[]) {
    torch::DeviceType device_type;
    device_type = torch::kCUDA;  //torch::kCUDA  and torch::kCPU
    torch::Device device(device_type, 0);
    
    std::shared_ptr<torch::jit::script::Module> module = torch::jit::load("model.pt");
     
    assert(module != nullptr);
    std::cout << "load model ok\n";
    module->to(device);
 
    auto image = cv::imread("284193,c89b000bf74ce6e.jpg");
    cv::Mat image_transfomed;
    cv::resize(image, image_transfomed, cv::Size(1024, 720), 0, 0, cv::INTER_AREA);  // interpolation is the 6th argument; fx and fy stay 0
    cv::cvtColor(image_transfomed, image_transfomed, cv::COLOR_BGR2RGB);
 
    torch::Tensor tensor_image = torch::from_blob(image_transfomed.data, {image_transfomed.rows, image_transfomed.cols,3},torch::kByte).to(device);//hwc
    //ToTensor
    tensor_image = tensor_image.permute({ 2,0,1 });//chw
    tensor_image = tensor_image.toType(torch::kFloat);
    tensor_image = tensor_image.div(255.0);
    //normalize
    tensor_image[0] = tensor_image[0].sub_(0.485).div_(0.229);
    tensor_image[1] = tensor_image[1].sub_(0.456).div_(0.224);
    tensor_image[2] = tensor_image[2].sub_(0.406).div_(0.225);
    //add batch dimension
    tensor_image = tensor_image.unsqueeze(0);
    std::vector<torch::jit::IValue> inputs;
    inputs.emplace_back(tensor_image);
 
    auto t = (double) cv::getTickCount();
    at::Tensor output = module->forward(inputs).toTensor().cpu();
    at::IntList sizes = output.sizes();//shape
    //get result
    auto count = output[0][0].sum();
    t = (double) cv::getTickCount() - t;
    // extract the scalar first; a Tensor object cannot be passed to printf's %f
    printf("count:%f,execution time = %gs\n", count.item<float>(), t / cv::getTickFrequency());
    inputs.pop_back();
    return 0;
}

import PIL.Image as Image
import torch
from torchvision import transforms

torch.backends.cudnn.benchmark = True
 
transform=transforms.Compose([
                       transforms.Resize((720, 1024)),
                       transforms.ToTensor(),
                       transforms.Normalize(mean=[0.485, 0.456, 0.406],std=[0.229, 0.224, 0.225])
                   ])
 
org = Image.open("284193,c89b000bf74ce6e.jpg")
img = transform(org).cuda()
traced_script_module = torch.jit.load('model.pt')
output = traced_script_module(img.unsqueeze(0))
print(output.sum())

sq1988 · April 1, 2019, 10:10am

Any suggestion will be appreciated.

mhubii · April 1, 2019, 10:42am

What are the sizes of your output? Are they the same for Python and C++? Maybe you are not summing over all elements.
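
To make the point about summing concrete, here is a minimal plain-Python sketch. Nested lists stand in for a tensor, and `shape_of` and `total` are hypothetical helpers written for this illustration, not PyTorch functions. The C++ code above sums `output[0][0]`, while the Python script sums the whole `output`; the two can only agree when the batch and channel dimensions are both 1.

```python
def shape_of(nested):
    """Return the shape of a regular nested list, e.g. (1, 2, 2, 3)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


def total(nested):
    """Sum every scalar in a nested list, like output.sum() on a tensor."""
    if isinstance(nested, list):
        return sum(total(item) for item in nested)
    return nested


# Hypothetical output with 2 channels: shape (batch=1, channels=2, H=2, W=3).
output = [[[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
           [[10.0, 20.0, 30.0], [40.0, 50.0, 60.0]]]]

full_sum = total(output)         # analogous to output.sum() in the Python script
slice_sum = total(output[0][0])  # analogous to output[0][0].sum() in the C++ code

print(shape_of(output), full_sum, slice_sum)
```

If the shape printed on both sides is 1,1,H,W, the two sums cover the same elements and the difference has to come from somewhere else.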

sq1988 · April 2, 2019, 1:41am

1,1,720,1024 for both C++ and Python.
Can you give me your email address so I can send you the model?

mhubii · April 2, 2019, 7:03am

You could make your code publicly available on GitHub or something similar, and I will have a look at it.

sq1988 · April 2, 2019, 7:30am

I uploaded the code and model to Dropbox.

sq1988 · April 4, 2019, 2:11am

I uploaded the code and model to Dropbox.

mhubii · April 4, 2019, 9:58pm

Yes, sorry. I will try to check what the difference is, but I have no CUDA available at the moment. Do you have the same problem when you run the code on a CPU? Could you save the script module with device type CPU?

zhenglei0102 · March 27, 2020, 8:43am

Did you use batch norm in your model? If so, you should call “module->eval()” before calling “module->forward()”.
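
As a rough illustration of why the mode matters, here is a plain-Python sketch of what a batch norm layer computes for one channel. It is not PyTorch's implementation: `batch_norm` is a hypothetical helper, the learned scale and shift are assumed to be 1 and 0, and the running statistics are made-up values. Whether the model in this thread contains batch norm at all is not stated above.

```python
import math


def batch_norm(values, running_mean, running_var, training, eps=1e-5):
    """Normalize one channel; scale and shift are assumed to be 1 and 0."""
    if training:
        # Training mode: statistics come from the current input.
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
    else:
        # Eval mode: statistics are the stored running estimates.
        mean, var = running_mean, running_var
    return [(v - mean) / math.sqrt(var + eps) for v in values]


# Hypothetical channel values and running statistics, for illustration only.
values = [1.0, 2.0, 3.0, 4.0]
train_out = batch_norm(values, running_mean=0.0, running_var=1.0, training=True)
eval_out = batch_norm(values, running_mean=0.0, running_var=1.0, training=False)

print(sum(train_out))  # close to 0: the input was centered on its own mean
print(sum(eval_out))   # close to 10: the input was left almost unchanged
```

The same input gives very different sums in the two modes, so a module left in training mode on one side and set to eval mode on the other could plausibly produce a gap like the one reported.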