Hi, I trained and traced a model in Python. Now I want to predict outputs with C++. I read the tutorial code and was fine with it; the problem was the batch size of one. To fix this, I created a custom dataset and made a data loader. How do I push the data from the data loader to the network? I tried it as follows, but run into errors.
int main() {
    std::string modelPath = "C:\\..\\myModel.pt";
    std::shared_ptr<torch::jit::script::Module> module = torch::jit::load(modelPath);
    module->eval();
    int batch_size = 10;
    auto dataset = CustomDataset("C:\\..\\labels.csv").map(torch::data::transforms::Stack<>());
    auto data_loader = torch::data::make_data_loader(std::move(dataset), torch::data::DataLoaderOptions().batch_size(batch_size).workers(1));
    // Input values
    std::vector<torch::jit::IValue> input;
    for (torch::data::Example<>& batch : *data_loader) {
        input.push_back(batch.data);
        torch::Tensor outputs = module->forward(input).toTensor();
        input.clear();
        std::cout << outputs;
        // Execute model
        //at::Tensor output = module->forward(inputs).toTensor();
        //std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';
    }
}
Hi Martin, I am a German developer and get the error "Ausnahme ausgelöst bei 0x00007FFEDFEC127E (vcruntime140d.dll) in example-app.exe: 0xC0000005: Zugriffsverletzung beim Lesen an Position 0x000001C42A0C0000" (an access violation while reading memory). This message shows up and at the same time Visual Studio opens the header file "Functions.h" and stops execution at line 2558.
I searched on the internet and it may be some kind of null pointer exception, but I could be wrong (not that much experience). Another confusing thing is that the error even occurs when I comment out the code inside the for loop (not the for loop itself). I used the VS debugger to get some information about the problem, but there were literally more than 100 values related to the tensor variable, and the stack trace shows more than 15 methods that have been called. I can say that the images are transferred to the tensors correctly; I verified this with the debugger by setting a breakpoint on every line.
Believe it or not, sometimes it works and sometimes it doesn't (but mostly not). The image values have been converted wrongly by me and they cause some weird outputs, but this is just a side fact and may not be related to the problem.
Thanks for your answer. Let's solve this together.
Sounds like there is something off with the custom dataset. Please post the dataset and the CSV file from which you read the locations. Maybe try replacing the relative paths with absolute paths.
Hi Martin, thank you for your reply. It was great and helped a lot. It seems like there is no wrong behaviour at this time. Could you please explain why this helped? The actual output of my batches is in the range of 0 to 255, even though I normalized the data above. It worked fine with the at::kFloat option, but I only processed one picture in a previous version, without a data loader or anything else.
Ah cool, did it also solve the initial problem? The thing here is that torch::from_blob directly reads out the memory. Your image probably was of type byte, which equals 8 bits per pixel. By passing at::kFloat, torch::from_blob moved 32 bits per pixel instead of 8 bits per pixel while reading the memory. This most likely led to the "Zugriffsverletzung beim Lesen an Position" (access violation on read), as it tried to access memory which was not allocated.
Hi Martin, I still have problems solving this overall problem. First, I do not normalize the data anymore, because I did not do that in my training routine in Python. When I print out the Mat object, I get the values I expected - between 0 and 255. After this I printed Mat::data, and there were only cryptic characters. Then I load the data into the tensor object as the code shows. In the loop I print batch.data and it shows the number 221 for every pixel value, which is very confusing for me. Do you have a clue why it behaves like it does?