C++ output differs from Python output


(sunqin) #1

I am using PyTorch 1.0 on Ubuntu 16.04. With the same model and the same transformation, I get totally different results from the Python and C++ code.
Can anyone help me out? Thanks a lot.

#========python output======#
tensor(9073.5586, device='cuda:0', grad_fn=)
#========c++ output======#
count:2.185325

I put the code on Baidu Cloud:
link: https://pan.baidu.com/s/1iQVndobbaSHERpIpJ0gSiw
code: sbdd


#2

My Chinese is too limited to get at the sources :wink:


(Martin Huber) #3

Hi @sq1988,

I can’t download your files. Can you please explain your problem in more detail here?


(Martin Huber) #5

Just as a hint, you can format code by wrapping it in triple backticks.

#include <something.h>

int main() {

    return 0;
}

Could you please also add the Python code?


(sunqin) #6

#include <torch/script.h>
#include <torch/torch.h>
#include <opencv2/opencv.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/core.hpp>
#include <cassert>   // assert
#include <cstdio>    // printf
#include <iostream>
#include <memory>
#include <string>
#include <vector>
 
int main(int argc, const char *argv[]) {
    torch::DeviceType device_type;
    device_type = torch::kCUDA;  //torch::kCUDA  and torch::kCPU
    torch::Device device(device_type, 0);
    
    std::shared_ptr<torch::jit::script::Module> module = torch::jit::load("model.pt");
     
    assert(module != nullptr);
    std::cout << "load model ok\n";
    module->to(device);
 
    auto image = cv::imread("284193,c89b000bf74ce6e.jpg");
    cv::Mat image_transfomed;
    // dsize is given, so fx and fy stay 0; the interpolation flag is the sixth argument
    cv::resize(image, image_transfomed, cv::Size(1024, 720), 0, 0, cv::INTER_AREA);
    cv::cvtColor(image_transfomed, image_transfomed, cv::COLOR_BGR2RGB);
 
    torch::Tensor tensor_image = torch::from_blob(image_transfomed.data, {image_transfomed.rows, image_transfomed.cols,3},torch::kByte).to(device);//hwc
    //ToTensor
    tensor_image = tensor_image.permute({ 2,0,1 });//chw
    tensor_image = tensor_image.toType(torch::kFloat);
    tensor_image = tensor_image.div(255.0);
    //normalize
    tensor_image[0] = tensor_image[0].sub_(0.485).div_(0.229);
    tensor_image[1] = tensor_image[1].sub_(0.456).div_(0.224);
    tensor_image[2] = tensor_image[2].sub_(0.406).div_(0.225);
    //add batch dimension
    tensor_image = tensor_image.unsqueeze(0);
    std::vector<torch::jit::IValue> inputs;
    inputs.emplace_back(tensor_image);
 
    auto t = (double) cv::getTickCount();
    at::Tensor output = module->forward(inputs).toTensor().cpu();
    at::IntList sizes = output.sizes();//shape
    //get result
    auto count = output[0][0].sum();
    t = (double) cv::getTickCount() - t;
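    // count is a 0-dim tensor; passing it to printf's %f is undefined behavior, so extract the scalar first
    printf("count:%f,execution time = %gs\n", count.item<float>(), t / cv::getTickFrequency());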
    inputs.pop_back();
    return 0;
}

import PIL.Image as Image
import torch
from torchvision import transforms

torch.backends.cudnn.benchmark = True
 
transform = transforms.Compose([
    transforms.Resize((720, 1024)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
 
org = Image.open("284193,c89b000bf74ce6e.jpg")
img = transform(org).cuda()
traced_script_module = torch.jit.load('model.pt')
output = traced_script_module(img.unsqueeze(0))
print(output.sum())
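
One way to narrow down a mismatch like this is to check the preprocessing by hand on a tiny image before looking at the model. The following is a minimal pure-Python sketch (not part of the posted code) of what ToTensor followed by Normalize computes, using the mean and std values from the posts above. The helper name `to_chw_normalized` and the 1x2 test image are made up for illustration; the same input could be fed through both the C++ and the Python pipeline and compared against these values.

```python
# Reference for ToTensor + Normalize on a tiny HWC uint8 image (illustrative only).
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def to_chw_normalized(hwc_pixels):
    """Convert rows of [r, g, b] ints (0..255) into 3 normalized float planes (CHW)."""
    height = len(hwc_pixels)
    width = len(hwc_pixels[0])
    chw = []
    for c in range(3):
        plane = [
            [(hwc_pixels[y][x][c] / 255.0 - MEAN[c]) / STD[c] for x in range(width)]
            for y in range(height)
        ]
        chw.append(plane)
    return chw


# Hypothetical 1 x 2 image in RGB order.
tiny = [[[255, 0, 128], [0, 255, 128]]]
reference = to_chw_normalized(tiny)
for channel_index, plane in enumerate(reference):
    print(channel_index, plane)
```

If the tensors produced by the two pipelines already differ from such a reference, the problem is in the image loading or resizing rather than in the model.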

(sunqin) #7

Any suggestion will be appreciated.


(Martin Huber) #8

What are the sizes of your output? Are they the same for Python and C++? In the C++ code you sum only output[0][0], so maybe you are not summing over all elements.
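
To illustrate the question: when the first two dimensions are both 1, indexing [0][0] keeps every element, so the partial sum and the full sum agree; with more than one channel they would not. A small pure-Python sketch with nested lists standing in for tensors (the helper names `total_sum` and `shape_of` and the sample values are made up for illustration):

```python
def total_sum(nested):
    """Sum every number in an arbitrarily nested list."""
    if isinstance(nested, list):
        return sum(total_sum(item) for item in nested)
    return nested


def shape_of(nested):
    """Return the dimensions of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Shape (1, 1, 2, 2): indexing [0][0] still covers all elements.
single_channel = [[[[1.0, 2.0], [3.0, 4.0]]]]
# Shape (1, 2, 2, 2): indexing [0][0] drops the second channel.
two_channels = [[[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]]

for name, out in (("single", single_channel), ("two", two_channels)):
    print(name, shape_of(out), total_sum(out[0][0]), total_sum(out))
```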


(sunqin) #10

The output size is 1,1,720,1024 for both C++ and Python.
Can you give me your email address so I can send you the model?


(Martin Huber) #11

You could make your code publicly available on GitHub or something similar, and I will have a look at it.


(sunqin) #12

I uploaded the code and model to Dropbox.


(Martin Huber) #14

Yeah, sorry, I will try to check what the difference is, but I have no CUDA available at the moment. Do you have the same problem when you run the code on the CPU? Could you save the script module with device type CPU?
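
A minimal sketch of the CPU variant asked about here, assuming a reasonably recent PyTorch. `TinyNet` is a made-up stand-in for the real model, and an in-memory buffer replaces the model file so the snippet is self-contained; with a real model one would pass a file path instead. The `map_location` argument of `torch.jit.load` can also move a module that was saved on CUDA to the CPU at load time.

```python
import io

import torch


class TinyNet(torch.nn.Module):
    """Hypothetical stand-in for the real model: 3 input channels, 1 output map."""

    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 1, kernel_size=1)

    def forward(self, x):
        return self.conv(x)


model = TinyNet().cpu().eval()
example = torch.rand(1, 3, 8, 8)

# Trace on the CPU so the saved script module carries CPU tensors.
traced = torch.jit.trace(model, example)

buffer = io.BytesIO()  # stands in for a file on disk
torch.jit.save(traced, buffer)
buffer.seek(0)

# map_location forces CPU even if the module had been saved from a CUDA device.
reloaded = torch.jit.load(buffer, map_location="cpu")
with torch.no_grad():
    output = reloaded(example)

print(output.device, tuple(output.shape), output.sum().item())
```

Running both the Python and the C++ program against such a CPU module removes the GPU from the comparison, which helps separate device-specific effects from preprocessing differences.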