Tensor of undefined size

(Afshin Oroojlooy) #1

In C++ we can declare a vector and then add an undefined number of elements to it with the push_back function, e.g.:

std::vector<float> v;
for (int i=0; i< n; i++)
  v.push_back(i);

in which n can be any number. In other words, at declaration time I do not need to instantiate the std::vector with a known number of elements.

In torch, when I create a tensor, I could not find any torch::tensor constructor that does not require the number of elements, and I also could not find a function like push_back. Is there such functionality in torch?

#2

You can always concatenate a tensor with another for as long as you wish; I’ve seen this used a lot as an equivalent of push_back in many PyTorch implementations.

(Afshin Oroojlooy) #3

Thanks, it is a good idea.

(Afshin Oroojlooy) #4

I implemented this idea, and it turned out to be very slow.
When I know the size of the tensor, create it with torch::zeros(), and fill it in a loop, it is much faster than using torch::cat.
Is there another way to handle this?

#5

It depends on your application. But say that you have the data you want to pass to your tensor in an array, say int x[4] = {1, 2, 3, 4}; then you can create a tensor by passing x as the argument.

#include <torch/torch.h>
#include <iostream>

int main(){
	int x[4] = {1, 2, 3, 4};
	torch::Tensor tensor = torch::tensor(x);
	std::cout<< tensor << std::endl;
	return 0;
}

It works for std::vector too:

#include <torch/torch.h>
#include <iostream>
#include <vector>

int main(){
	std::vector<int> x = {1, 2, 3, 4};
	torch::Tensor tensor = torch::tensor(x);
	std::cout<< tensor << std::endl;
	return 0;
}

And finally, it works for pointers too:

#include <torch/torch.h>
#include <iostream>
#include <cstring>  // std::memcpy
#include <cstdlib>  // malloc, free

int main(){
	int *x = (int *) malloc(4 * sizeof(int));
	for(int i = 0; i < 4; i++){
		x[i] = i;
	}
	torch::Tensor tensor = torch::zeros({4}).to(torch::kInt);
	std::memcpy(tensor.data_ptr<int>(), x, 4 * sizeof(int));
	std::cout << tensor << std::endl;
	free(x);
	return 0;
}
(Afshin Oroojlooy) #6

Thanks for the quick response.
You are right: when the input is an array or a 1D vector, there is an overloaded version of torch::tensor that easily creates a tensor from it. The problem is that there is no version of torch::tensor that takes a nested 2D, 3D, or 4D vector as input. If you try something like:

std::vector<int> input_1d = {1, 2, 3};  // some previously filled 1D vector
std::vector<std::vector<int>> input_2d;
for (int i = 0; i < 5; i++)
    input_2d.push_back(input_1d);

torch::Tensor state_from_vector_2d = torch::tensor(input_2d);

you get a No matching function for call to 'tensor' error.
Any suggestions?

#7

I don’t know any super optimized way of doing this. But this could be useful:

#include <torch/torch.h>
#include <iostream>

int main(){
	std::vector<std::vector<int>> x = {{1, 2}, {3, 4}};
	torch::Tensor tensor = torch::zeros({2, 2}).to(torch::kInt);
	int n = x.size();
	for(int i = 0; i < n; i++){
		std::memcpy(tensor.data<int>() + i * n, &x[i][0], 2 * sizeof(int));
	}
	std::cout<< tensor << std::endl;
	return 0;
}

This way you avoid a nested for loop, since std::memcpy is very fast.

(Afshin Oroojlooy) #8

Yes, this approach works, but you have to know the number of your observations (rows), since you have defined it with torch::Tensor tensor = torch::zeros({2, 2}).to(torch::kInt);.
The initial question was looking for a solution when we do not know the number of rows.

#9

If you don’t have any information at compile time about what your tensor will look like, I don’t know how you can highly optimize this operation. If it becomes a bottleneck, I guess it could be useful to refactor some parts of your code, so that you could potentially have some prior information about the inputs.