I would like to build networks in C++ using ATen tensors and operations on the GPU, but it seems impossible to free the GPU memory of tensors automatically. Does ATen support a garbage collector or something like it? My platform is Windows 10, CUDA 8.0, cuDNN 7, PyTorch 0.4.0.
I found that ATen automatically releases a tensor's memory when its reference count reaches 0, as long as the tensor resides in CPU memory. In the code below, the CPU memory of the tensor is freed at (3*).
/* example 1 */
// cpu memory check (1*)
{
auto tensor = at::CPU(at::kFloat).ones({1000*1000*400});
// cpu memory check (2*)
}
// cpu memory check (3*)
However, in example 2 the CUDA tensor is not released at (3**); it is released only when the program exits. I checked GPU memory at every (**) point with the MSI Afterburner monitoring program.
/* example 2 */
// cuda memory check (1**)
{
auto tensor = at::CUDA(at::kFloat).ones({1000*1000*400});
// cuda memory check (2**)
}
// cuda memory check (3**)
I found that a CUDA tensor from ATen can be freed as follows, although I do not know whether this is the intended or recommended way to release memory. And it is hardly practical to call cudaFree on every intermediate tensor returned by an ATen function such as at::conv2d:
auto tensor = at::CUDA(at::kFloat).ones({1000*1000*400});
cudaFree(tensor.storage()->data()); // ok. gpu memory is freed
...
out = at::conv2d(out, ...); // how can I free the previous intermediate tensor here without binding each result to a new variable (auto out2 = at::conv2d(out, ...);)?
out = at::conv2d(out, ...);
out = at::conv2d(out, ...);