I have a CUDA variable that is part of a differentiable computational graph. I want to read out its value into numpy (say for plotting).
If I do
var.numpy() I get
RuntimeError: Can’t call numpy() on Variable that requires grad. Use var.detach().numpy() instead.
Ok, so I do
var.detach().numpy() and get
TypeError: can’t convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first
Ok, so I go
var.detach().cpu().numpy() and it works.
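For reference, a minimal sketch of the whole sequence (the device check is just so the snippet also runs on a machine without a GPU; on CPU the first error would not occur, but the working call is the same):

```python
import torch

# A tensor that is part of the autograd graph, on the GPU if one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
var = torch.ones(3, device=device, requires_grad=True) * 2

# var.numpy()           # RuntimeError: requires grad -> must detach first
# var.detach().numpy()  # TypeError on CUDA: must copy to host memory first

# detach from the graph, copy to host, then convert to a numpy array
arr = var.detach().cpu().numpy()
print(arr)  # [2. 2. 2.]
```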
My question is: is there any good reason why this isn’t just done inside the
numpy() method itself? It’s cumbersome, and all these
var.detach().cpu().numpy() calls litter the code.
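One way to keep the litter down in the meantime is a tiny helper (to_numpy is a hypothetical name, not a PyTorch API):

```python
import numpy as np
import torch

def to_numpy(t: torch.Tensor) -> np.ndarray:
    """Detach from the graph, move to host memory, and convert to numpy."""
    return t.detach().cpu().numpy()

x = torch.randn(4, requires_grad=True)
print(to_numpy(x).shape)  # (4,)
```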