Hi, I trained and traced a model in Python. Now I want to predict outputs with C++. I read the tutorial code and was fine with it; the problem was the batch size of one. To fix this, I created a custom dataset and made a data loader. How do I push the data from the data loader to the network? I tried it as follows, but run into errors.
int main() {
    std::string modelPath = "C:\\..\\myModel.pt";
    std::shared_ptr<torch::jit::script::Module> module = torch::jit::load(modelPath);
    module->eval();
    int batch_size = 10;
    auto dataset = CustomDataset("C:\\..\\labels.csv").map(torch::data::transforms::Stack<>());
    auto data_loader = torch::data::make_data_loader(std::move(dataset), torch::data::DataLoaderOptions().batch_size(batch_size).workers(1));
    // Input values
    std::vector<torch::jit::IValue> input;
    for (torch::data::Example<>& batch : *data_loader) {
        input.push_back(batch.data);
        torch::Tensor outputs = module->forward(input).toTensor();
        input.clear();
        std::cout << outputs;
        // Execute model
        //at::Tensor output = module->forward(inputs).toTensor();
        //std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';
    }
}
Hi Martin, I am a German developer and get the error "Ausnahme ausgelöst bei 0x00007FFEDFEC127E (vcruntime140d.dll) in example-app.exe: 0xC0000005: Zugriffsverletzung beim Lesen an Position 0x000001C42A0C0000" (an access violation while reading memory). This message shows up and at the same time Visual Studio opens the header file "Functions.h" and stops execution at line 2558.
I searched on the internet and it may be some kind of null pointer exception, but I could be wrong (not that much experience). Another confusing thing is that the error even occurs when I comment out the code inside the for loop (not the for loop itself). I used the VS debugger to get some information about the problem, but there were literally more than 100 values related to the tensor variable, and the stack trace shows more than 15 methods that have been called. I can say that the images are transferred to the tensors correctly; I verified this with the debugger by setting a breakpoint on every line.
Believe it or not, sometimes it works and sometimes it doesn't (but mostly not). The image values have been converted wrongly by me and they cause some weird outputs, but this is just a side fact and may not be related to the problem.
Thanks for your answer. Let's solve this together.
Sounds like there is something off with the custom dataset. Please post the dataset and the CSV file from which you read the locations. Maybe try replacing the relative paths with absolute paths.
Hi Martin, thank you for your reply. It was great and helped a lot. It seems like there is no wrong behaviour at this time. Could you please explain why this helped? The actual output of my batches is in the range of 0 to 255, even though I normalized the data above. It worked fine with the at::kFloat option, but I only processed one picture in a previous version, without a data loader or anything else.
Ah cool, did it also solve the initial problem? The thing here is that torch::from_blob directly reads out the memory. Your image probably was of type byte, which equals 8 bits per pixel. By passing at::kFloat, torch::from_blob moved 32 bits per pixel instead of 8 bits per pixel while reading the memory. This most likely led to the "Zugriffsverletzung beim Lesen an Position" (access violation on read), as it tried to access memory which was not allocated.
Hi Martin, I still have problems solving this overall problem. First, I do not normalize the data anymore, because I did not do that in my training routine in Python. When I print out the Mat object, I get the values I expected - between 0 and 255. After this I printed Mat::data, and there were only cryptic characters. Then I load the data into the tensor object as the code shows. In the loop I print batch.data and it shows the number 221 for every pixel value, which is very confusing for me. Do you have a clue why it behaves like it does?