Hello,
I am currently implementing my own CUDA extension for PyTorch. Now I am wondering whether I should avoid manually allocating memory on the GPU (e.g. via cudaMalloc) and instead use ATen for this, so that PyTorch's memory manager can handle the allocation efficiently?
I would also appreciate it if you could point me to where I can read up on PyTorch internals, such as the memory management!
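For context, here is a minimal sketch of what I mean by letting ATen allocate instead of calling cudaMalloc myself (the function name `my_op` is just an example, not real code from my extension):

```cpp
#include <torch/extension.h>

// Hypothetical extension op: the output buffer is allocated through ATen,
// so the memory comes from PyTorch's caching allocator rather than a raw
// cudaMalloc call.
torch::Tensor my_op(torch::Tensor input) {
  TORCH_CHECK(input.is_cuda(), "input must be a CUDA tensor");
  // Allocates on the same device with the same dtype/shape as `input`:
  auto output = torch::empty_like(input);
  // ... launch the kernel here, passing input.data_ptr<float>()
  //     and output.data_ptr<float>() ...
  return output;
}
```

Is this the recommended way, or is there a reason to manage the device memory manually?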
Best regards,
Tim