Two questions about THPVariable_to and tensor_ctor

I ran into two questions while reading the source code.
The first is about the tensor_ctor function. As far as I understand, tensor_ctor calls internal_new_from_data to create a new tensor, as in the following source code. Both typeIdWithDefault and deviceOptional are called with device_idx=2, so why is the index set to 2 here? If I want to create a tensor and transfer it to device 0, where does that happen in the code? (I have pasted the parser signature I am reading from after the snippet.)

auto new_tensor = internal_new_from_data(
        typeIdWithDefault(r, 2, dispatch_key), 
        r.scalartypeWithDefault(1, scalar_type),
        r.deviceOptional(2),
        data,
        /*copy_variables=*/true,
        /*copy_numpy=*/true,
        /*type_inference=*/type_inference,
        pin_memory);
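
For reference, tensor_ctor parses the Python arguments against a PythonArgParser signature that, as far as I can tell from torch/csrc/utils/tensor_new.cpp, looks roughly like this (it may differ between versions, so treat it as a sketch):

static PythonArgParser parser({
    "tensor(PyObject* data, *, ScalarType dtype=None, Device? device=None, bool pin_memory=False, bool requires_grad=False)",
});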

The second question is about the to function. When I run tensor.to(0), THPVariable_to runs at the C++ level, and the call chain is THPVariable_to->dispatch_to. But I cannot find which function is called after dispatch_to.

static Tensor dispatch_to(const Tensor & self, Device device, bool non_blocking, bool copy, c10::optional<c10::MemoryFormat> optional_memory_format) {
  pybind11::gil_scoped_release no_gil;
  return self.to(self.options().device(device).memory_format(optional_memory_format), non_blocking, copy);
} 

Because to ultimately performs a cudaMemcpyHostToDevice, I tried to trace the to function backwards from the copy kernel, but I could only find the copy_ function (THPVariable_copy_). So which function comes after dispatch_to?

I have just started learning the source code. Thanks for your reply.

Could you post the code reference where the device is hard-coded in the PyTorch code base?

The copy kernel can be found here.

Thanks for your reply. I will give more details about the questions.
For the first question: when I run torch.tensor, the call chain is THPVariable_tensor->tensor_ctor->internal_new_from_data.

Tensor internal_new_from_data(
    c10::TensorOptions options,
    at::ScalarType scalar_type,
    c10::optional<Device> device_opt,
    PyObject* data,
    bool copy_variables,
    bool copy_numpy,
    bool type_inference,
    bool pin_memory = false) {

The first and third inputs are options and device_opt, which come from typeIdWithDefault(r, 2, dispatch_key) and r.deviceOptional(2). In both calls the index passed is 2.

c10::TensorOptions typeIdWithDefault(
    PythonArgs& r,
    int64_t device_idx,
    c10::DispatchKey dispatch_key) {
  auto options = dispatchKeyToTensorOptions(dispatch_key);
  if (!r.isNone(static_cast<int>(device_idx))) {
    // TODO: This line doesn't seem to be exercised at all in tests
    options = options.device(r.device(static_cast<int>(device_idx)).type());
  }
  return options;
}

inline c10::optional<at::Device> PythonArgs::deviceOptional(int i) {
  if (!args[i])
    return c10::nullopt;
  return device(i);
}

As the source code above shows, device_idx=2 is passed through to device(2), which ultimately determines the device used by tensor.to(device) inside internal_new_from_data.

auto device = device_opt.has_value() ? *device_opt : options.device();
...
tensor = tensor.to(
        device, inferred_scalar_type, /*non_blocking=*/false, /*copy=*/false);

So why is device_idx set to 2 here?
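
To check my understanding of the indexing, here is a small toy model of the parsed-argument array (this is not PyTorch code, only my mental model of how args[i] is looked up by signature slot):

#include <array>
#include <iostream>
#include <optional>
#include <string>

// Toy stand-in for PythonArgs: parsed arguments live in an array
// indexed by their position in the signature
//   tensor(data, *, dtype=None, device=None, ...)
struct ToyArgs {
  std::array<std::optional<std::string>, 3> args;

  // Mirrors PythonArgs::deviceOptional: nullopt if the slot was None.
  std::optional<std::string> deviceOptional(int i) const {
    if (!args[i])
      return std::nullopt;
    return args[i];
  }
};

int main() {
  // Models torch.tensor([1, 2], device='cuda:0'):
  // slot 0 = data, slot 1 = dtype (None), slot 2 = device.
  ToyArgs r{{{std::string("[1, 2]"), std::nullopt, std::string("cuda:0")}}};
  auto dev = r.deviceOptional(2);
  std::cout << (dev ? *dev : std::string("nullopt")) << std::endl;  // cuda:0
}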

About the copy kernel, I found the call chain THPVariable_copy_->dispatch_copy_->copy_->copy_impl->copy_stub->copy_kernel_cuda, but that chain starts from THPVariable_copy_. What about THPVariable_to? I think it also copies data host-to-device or device-to-host. So far I can only trace THPVariable_to->dispatch_to; which function comes after dispatch_to?

I found it: THPVariable_to->dispatch_to->Tensor to (aten/src/ATen/native/TensorConversions.cpp)->to_impl->_to_copy->copy_. So in the end THPVariable_to calls the copy_ function, similar to THPVariable_copy_.
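
A condensed sketch of what I think happens after dispatch_to (paraphrased and heavily simplified from aten/src/ATen/native/TensorConversions.cpp; the real code also handles dtype, layout, memory format, and autograd, and sketch_to is my own name):

#include <ATen/ATen.h>

// Hedged sketch, not the real implementation:
// to() -> to_impl() -> _to_copy() -> copy_()
at::Tensor sketch_to(const at::Tensor& self, at::Device device,
                     bool non_blocking, bool copy) {
  // to_impl fast path: already on the right device and no copy
  // was forced, so just return self.
  if (self.device() == device && !copy) {
    return self;
  }
  // _to_copy: allocate an empty tensor with the target options,
  // then copy_ into it. copy_ is where the H2D/D2H memcpy is
  // eventually issued (copy_ -> copy_impl -> copy_stub ->
  // copy_kernel_cuda, matching the chain above).
  auto out = at::empty(self.sizes(), self.options().device(device));
  out.copy_(self, non_blocking);
  return out;
}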

I also found that tensor_ctor uses the default dispatch key, and the default backend is DispatchKey::CPU. Does this mean that torch.tensor first stores the data in CPU memory when creating a new tensor? (See my sketch after the snippet below.)

torch::utils::tensor_ctor(
      torch::tensors::get_default_dispatch_key(),
      torch::tensors::get_default_scalar_type(),
      r)
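
To illustrate what I mean, here is a minimal sketch of my reading of internal_new_from_data (sketch_new_from_data is my own name, not PyTorch's; I am assuming the data is first materialized on the CPU, which is exactly what I would like to confirm):

#include <ATen/ATen.h>
#include <cstring>
#include <vector>

at::Tensor sketch_new_from_data(const std::vector<float>& data,
                                at::Device device) {
  // Step 1: allocate on the CPU and copy the host data in (the
  // real code walks the Python object via recursive_store).
  auto cpu = at::empty({static_cast<int64_t>(data.size())},
                       at::TensorOptions().dtype(at::kFloat).device(at::kCPU));
  std::memcpy(cpu.data_ptr<float>(), data.data(),
              data.size() * sizeof(float));
  // Step 2: only afterwards move to the requested device, as in
  // the tensor.to(device, ...) call quoted earlier.
  return cpu.to(device, at::kFloat, /*non_blocking=*/false, /*copy=*/false);
}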