About PyTorch's memory management

    #include <torch/torch.h>

    // current_device and res_shape are defined elsewhere in my code.
    auto options = torch::TensorOptions()
                           .dtype(torch::kFloat)
                           .device(torch::kCUDA, current_device);
    auto res = torch::empty(res_shape, options);

When I create a CUDA tensor in C++, does this automatically use PyTorch's CUDACachingAllocator?

Yes. Both the Python and C++ APIs route CUDA tensor allocations through the CUDACachingAllocator: it hands out blocks from a pool of previously freed memory and only calls cudaMalloc when no cached block can satisfy the request, so repeated allocations avoid expensive driver calls.
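
You can observe the caching behavior from C++ through the allocator's stats API. Below is a minimal sketch, assuming a recent libtorch where the stats live in c10/cuda/CUDACachingAllocator.h and index 0 of each stat array is the aggregate across pools; field names may differ slightly across versions.

    #include <torch/torch.h>
    #include <c10/cuda/CUDACachingAllocator.h>
    #include <iostream>

    int main() {
        auto options = torch::TensorOptions()
                               .dtype(torch::kFloat)
                               .device(torch::kCUDA, 0);

        {
            // Served by the caching allocator; cudaMalloc happens only if
            // no cached block of a suitable size exists.
            auto t = torch::empty({1024, 1024}, options);
        }   // t's storage is returned to the allocator's cache here,
            // not to the CUDA driver.

        // "allocated" drops back toward zero, but "reserved" stays nonzero:
        // the freed block is being cached for reuse.
        auto stats = c10::cuda::CUDACachingAllocator::getDeviceStats(0);
        std::cout << "allocated: " << stats.allocated_bytes[0].current << " B\n"
                  << "reserved:  " << stats.reserved_bytes[0].current << " B\n";

        // Hand cached blocks back to the driver, like torch.cuda.empty_cache().
        c10::cuda::CUDACachingAllocator::emptyCache();
        return 0;
    }

This is also why nvidia-smi can show the process holding GPU memory after all tensors are freed: the caching allocator keeps those blocks reserved until emptyCache() is called or the process exits.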