How does placement new work for the CUDA allocator?

Recently, while studying the source code of c10/core/TensorImpl.h, I got stuck on the following code. Suppose I use a CUDA allocator that calls cudaMalloc to allocate memory, and meta.placementNew() is non-null for our typed data. How can placementNew() work on the pointer returned by cudaMalloc, given that it points to device memory? Does this code only work for the host allocator and host memory? Or, for CUDA devices, does it only work for fundamental types, so the placement-new branch is never taken?

if (meta.placementNew()) {
  // For types that need placement new, we will call it, as well as
  // making sure that when the data is freed, it calls the right
  // destruction procedure.
  auto size = numel_;
  auto dtor = data_type_.placementDelete();
  auto data_ptr = allocator->allocate(numel_ * storage_.itemsize());
  storage_.set_data_ptr(PlacementDeleteContext::makeDataPtr(
      std::move(data_ptr), dtor, size, storage_.device()));
  data_type_.placementNew()(storage_.data(), numel_);
} else {
  // For fundamental type, new and delete is easier.
  storage_.set_data_ptr(
        allocator->allocate(numel_ * storage_.itemsize()));
}

cc. @ezyang do you know the right answer to this?

You can’t placement-new for CUDA memory. That feature only works on CPU.

Thanks. According to your explanation, the allocator used here can only be a CPU allocator, or an allocator whose memory is host-accessible, such as cudaHostAlloc (pinned host memory) or cudaMallocManaged (unified memory). Is my understanding right?