What is the underlying implementation of torch.Tensor.to?

I’m new to PyTorch and C++. I can easily use the torch.Tensor.to function to transfer a model between CPU and GPU, but I want to understand the underlying implementation of torch.Tensor.to, especially how memory is allocated and copied.
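For example, this is how I use it today, and my rough guess (which may well be wrong) is that the transfer boils down to allocating a buffer on the target device and then copying the data into it:

```python
import torch

src = torch.randn(1024)                           # CPU tensor

if torch.cuda.is_available():
    # The call I am asking about:
    moved = src.to("cuda")

    # My guess at what it conceptually does (not the real C++ code path):
    dst = torch.empty_like(src, device="cuda")    # allocate on the GPU
    dst.copy_(src)                                # host-to-device copy

    print(moved.device, dst.device)               # cuda:0 cuda:0
```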

I can find def to() in the source file torch/nn/modules/module.py, but I don’t know how it interacts with the underlying C++ and CUDA source code. Where is that part of the source code?
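From reading module.py, it looks like Module.to builds a conversion function and hands it to Module._apply, which calls Tensor.to on every parameter and buffer, and Tensor.to itself is bound to C++ and goes through the ATen dispatcher. Here is my rough sketch of that much (a guess, not the real implementation), but I can’t tell where the dispatch lands in the C++/CUDA sources:

```python
import torch
import torch.nn as nn

# Rough sketch of what nn.Module.to seems to do (not the actual code):
# walk all parameters/buffers via Module._apply and convert each one.
def move_module(module: nn.Module, device: torch.device) -> nn.Module:
    def convert(t: torch.Tensor) -> torch.Tensor:
        # Tensor.to is implemented in C++; for a device change it is
        # dispatched to the aten::to / aten::_to_copy operators.
        return t.to(device)
    return module._apply(convert)

if torch.cuda.is_available():
    m = nn.Linear(4, 2)
    move_module(m, torch.device("cuda"))
    print(m.weight.device)                        # cuda:0

    # The same operator is reachable from Python through the dispatcher:
    y = torch.ops.aten._to_copy(torch.randn(3), device=torch.device("cuda"))
    print(y.device)                               # cuda:0
```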

Can anyone help explain this? Thanks.

In case you are looking for the CUDA memory management, take a look at these docs, and if you are interested in the internal implementation of the CUDACachingAllocator, check the code here.
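You can also observe the caching behavior from Python. A small experiment (assuming a CUDA device is available): memory freed by a tensor stays reserved in the allocator’s cache rather than being returned to the driver right away.

```python
import torch

if torch.cuda.is_available():
    x = torch.empty(1024 * 1024, device="cuda")   # ~4 MB on the GPU
    print(torch.cuda.memory_allocated(), torch.cuda.memory_reserved())

    del x                                         # tensor is freed...
    print(torch.cuda.memory_allocated(), torch.cuda.memory_reserved())
    # allocated drops, reserved stays: the block is cached for reuse

    torch.cuda.empty_cache()                      # release cached blocks
    print(torch.cuda.memory_reserved())
```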

Thanks. I’ll follow your suggestion and dig into it.