Terminate called after throwing an instance of 'c10::IndexError' what(): index 1701270418 is out of bounds for dimension 0 with size 500

Hi,

I am trying to train a model in which I update only some indices of a PyTorch tensor with the evaluations of a network. For this, I am using the following call:

output_tensor.index_put_({torch::from_blob(indices.data(), {indices.size()}, torch::kInt32)}, output_tensor.index({torch::from_blob(indices.data(), {indices.size()}, torch::kInt32)}) + networks->forward(points_tensor));

Here output_tensor is a tensor of shape Nx1, and indices is a std::vector<int> holding the indices of output_tensor to which I want to add the network prediction. I keep getting this error:

terminate called after throwing an instance of 'c10::IndexError' what(): index 1701270418 is out of bounds for dimension 0 with size 500

The very large out-of-bounds index changes to a different large integer every time I run the code. Can anyone help me debug this?

Thank you.

Hi @ptrblck. Do you know what might be causing this error?

Thank you.

The underlying data is most likely freed and the tensor thus contains garbage. .clone() the tensor before indices is deleted to make sure the values are valid.
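To illustrate the lifetime point: torch::from_blob wraps existing memory without taking ownership, so a tensor built from indices.data() is only valid while indices is alive. The sketch below shows the same idea with the standard library only. It assumes a hypothetical helper name, copy_indices, which is not part of libtorch; it copies the values into a buffer that owns its storage and widens them to 64-bit integers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a libtorch function): copy n ints from a raw,
// non-owning pointer into a vector that owns its storage. The result stays
// valid after the source buffer is destroyed, which is the property that
// cloning a from_blob tensor is meant to provide.
std::vector<std::int64_t> copy_indices(const int* data, std::size_t n) {
    // The range constructor converts each int to std::int64_t while copying.
    return std::vector<std::int64_t>(data, data + n);
}
```

The copy has to be made while the source vector still exists; copying after it has been destroyed would read freed memory, just as a late clone would.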

Thank you so much for your reply @ptrblck. I tried cloning the tensor but the error persists. This is what I changed the code to:

output_tensor.index_put_({torch::from_blob(indices.data(), {indices.size()}, torch::kInt32).clone()}, output_tensor.index({torch::from_blob(indices.data(), {indices.size()}, torch::kInt32).clone()}) + networks->forward(points_tensor));

It is worth mentioning that the forward pass of this code runs without any problem. Only the backward pass fails. Could it be that the in-place index_put_ operation is not backward compatible? In that case, should I do something like this instead?

output_tensor.index(torch::from_blob(indices.data(), {indices.size()}, torch::kInt32)) = output_tensor.index(torch::from_blob(indices.data(), {indices.size()}, torch::kInt32)) + network->forward(points_tensor);

“Backward compatible” refers to software versioning, but I assume you are asking about the support of the backward pass? If so, index_put_ won’t break the backward pass and the error still sounds as if a tensor/data is going out of scope. You could write a quick check in Python, which should work also in the backward pass.
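A comparable quick check can also be done on the C++ side before the indexing call. The sketch below is stdlib-only and assumes a hypothetical helper name, first_out_of_range, which is not a libtorch function. It scans the host vector and reports the position of the first value outside the valid range for a dimension of the given size. If the vector passes this check right before the call and the error still appears, that would point towards the tensor reading stale memory rather than towards the indices themselves.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of libtorch): return the position of the
// first index outside [-dim_size, dim_size), or -1 if every index is valid.
// Negative values are accepted because they count from the end of the
// dimension, as in PyTorch indexing.
std::int64_t first_out_of_range(const std::vector<int>& indices,
                                std::int64_t dim_size) {
    for (std::size_t i = 0; i < indices.size(); ++i) {
        const std::int64_t v = indices[i];
        if (v < -dim_size || v >= dim_size) {
            return static_cast<std::int64_t>(i);
        }
    }
    return -1;
}
```

For the case in this thread, the call would be first_out_of_range(indices, 500), made immediately before index_put_.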

Ah yes, sorry, I meant as in the backward pass. I will try writing a quick check in Python. Thank you.