I am following the index_put_ operation through the code, as it would be applied in this case:
using namespace torch::indexing;
auto tOne = torch::ones({10, 100, 100, 100});
auto tZero = torch::zeros({10, 100, 100, 100});
tZero.index_put_({"...", Slice(0, 50), Slice(0, 50), Slice(0, 50)}, tOne.index({"...", Slice(0, 50), Slice(0, 50), Slice(0, 50)}));
I find it implemented in aten/src/ATen/TensorIndexing.cpp:
Tensor & Tensor::index_put_(ArrayRef<at::indexing::TensorIndex> indices, Tensor const & rhs) {
  TORCH_CHECK(indices.size() > 0, "Passing an empty index list to Tensor::index_put_() is not valid syntax");
  OptionalDeviceGuard device_guard(device_of(*this));
  at::indexing::set_item(*this, indices, rhs);
  return *this;
}
This uses set_item, defined in aten/src/ATen/TensorIndexing.h.
If I see it correctly, it applies the different index expressions by creating new views of the existing tensor, i.e. by fiddling with sizes, strides, and the storage offset, but without actually touching the storage underneath:
Tensor sliced = impl::applySlicing(self, indices, tensorIndices, disable_slice_optimization, self_device, self_sizes);
And then a copy_to is performed:
if (tensorIndices.empty()) {
  copy_to(sliced, value);
  return;
}
I tried to follow that further but could not follow along. I am wondering how this eventually works: we have a view of a tensor, i.e. one that is not contiguous in memory, and we then perform a copy operation into it. Does it boil down to an element-wise copy, or are there further optimizations involved? Where is it implemented?
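To make my question concrete, this is what I would naively expect such a copy into a non-contiguous view to do. This is only a sketch of my assumption, not the actual ATen code (strided_copy is a name I made up); my understanding is that ATen instead builds a TensorIterator over both tensors and coalesces contiguous dimensions so it can use memcpy or vectorized inner loops where possible.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Naive sketch: copy a contiguous source into a non-contiguous destination
// view by walking the destination's strides element by element.
void strided_copy(float* dst_storage, int64_t dst_offset,
                  const std::vector<int64_t>& sizes,
                  const std::vector<int64_t>& dst_strides,
                  const float* src) { // src: contiguous, same logical shape
    int64_t src_idx = 0;
    // Recursive traversal for clarity; a real kernel would flatten and
    // coalesce dimensions first.
    std::function<void(size_t, int64_t)> walk = [&](size_t dim, int64_t off) {
        if (dim == sizes.size()) {
            dst_storage[off] = src[src_idx++];
            return;
        }
        for (int64_t i = 0; i < sizes[dim]; ++i)
            walk(dim + 1, off + i * dst_strides[dim]);
    };
    walk(0, dst_offset);
}
```

Is this element-wise traversal essentially what happens in the end, just with the dimension-coalescing and vectorization layered on top?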