When trying to implement a custom autograd.Function, I’m calling torch.from_numpy(x) inside the backward() definition of the custom function. However, this doesn’t work. Replacing torch.from_numpy(x) with torch.zeros(…) works, and so does copying the values one by one from the numpy array x into a fresh torch.zeros(…) tensor. What am I doing wrong? The example for custom parameterized modules from numpy and scipy in the tutorials doesn’t work for me either.
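The copy-one-by-one workaround I mean looks roughly like this (just a sketch; x_grad is an illustrative name for the numpy gradient array computed inside backward()):

import numpy as np
import torch

x_grad = np.ones(5, dtype=np.float32)  # illustrative numpy gradient array
out = torch.zeros(x_grad.shape[0])     # fresh torch tensor of the same length
for i in range(x_grad.shape[0]):
    out[i] = float(x_grad[i])          # copy each numpy value into the torch tensor
# 'out' is then returned from backward() instead of torch.from_numpy(x_grad)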
Could you provide details about how it doesn’t work (an error message), and could you provide a minimal code example that causes it?
There is no error message; it just says “Segmentation fault (core dumped)”.
My own code tried to call torch.from_numpy(…) on a numpy array (in the definition of backward() of my custom function), instead of the torch.FloatTensor(…) in the documentation example for custom parameterized modules from numpy/scipy.
I also tried running the example in http://pytorch.org/tutorials/advanced/numpy_extensions_tutorial.html#parametrized-example and that gave the same segfault.
What version of pytorch are you using? (You can check with torch.__version__.)
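For example, from a Python prompt:

import torch
import numpy

print(torch.__version__)   # pytorch version
print(numpy.__version__)   # the numpy version is worth reporting as well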
I ran the example on pytorch 0.2 and master and couldn’t reproduce the segfault.
PyTorch version: 0.2.0_2
Numpy version: 1.13.1
Tried that example again, still giving segfault.
Could you try upgrading your pytorch and seeing if that helps? I’m running 0.2.0_4.
Upgraded to 0.2.0_4, same segfault.
Could you provide a minimal example that reproduces the segfault, please, so that we can look into it in more detail locally?
Adapted from the example at http://pytorch.org/tutorials/advanced/numpy_extensions_tutorial.html:
import torch

print(torch.__version__)


class DummyFunction(torch.autograd.Function):
    def forward(self, x):
        self.save_for_backward(x)
        y = x.numpy() + 1
        return torch.from_numpy(y)

    def backward(self, y_grad):
        x = self.saved_tensors
        x_grad = y_grad.clone()
        x_grad = x_grad.numpy() + 1
        return torch.from_numpy(x_grad)
        # return torch.FloatTensor(x_grad)


input = torch.autograd.Variable(torch.rand(5), requires_grad=True)
output = DummyFunction()(input)
print(output)
output.backward(torch.rand(5))
print(input.grad)
This code outputs:
0.2.0_4
Variable containing:
 1.4786
 1.8203
 1.4139
 1.2448
 1.8730
[torch.FloatTensor of size 5]

Segmentation fault (core dumped)
However, if I replace torch.from_numpy(…) with

ret = torch.zeros(x_grad.shape)
retnp = ret.numpy()
retnp += x_grad

and return ret, it works.
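For completeness, backward() with that workaround looks roughly like this (a sketch assembled from the snippets above; the rest of DummyFunction is unchanged):

    def backward(self, y_grad):
        x_grad = y_grad.clone()
        x_grad = x_grad.numpy() + 1      # same gradient computation as before
        ret = torch.zeros(x_grad.shape)  # allocate the result with torch
        retnp = ret.numpy()              # numpy view sharing memory with ret
        retnp += x_grad                  # write the values through the view
        return ret                       # no torch.from_numpy() involved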
Your original code works with my install from master.
How did you install pytorch?
I installed it from conda for Python 2.7, no CUDA (Linux).
I asked a colleague to reproduce it (Mac, no CUDA, pip-installed 0.2.0_3); the output was:
Segmentation Fault: 11
(edit: my colleague ran the example at http://pytorch.org/tutorials/advanced/numpy_extensions_tutorial.html#parametrized-example)
I also set up a fresh Amazon instance, installed only python, pip, and ipython, and ran:
pip install http://download.pytorch.org/whl/cu75/torch-0.2.0.post3-cp27-cp27mu-manylinux1_x86_64.whl
(Python 2.7, no CUDA, pip-installed 0.2.0_3; I didn’t install torchvision.)
The dummy example I pasted above also gives a segfault on the fresh machine.
Okay, I was able to reproduce the problem. Building from source (master) or from the v0.3.0 branch makes the segfault go away on the machine I reproduced it on. @purrfegt, could you try installing from source on a machine and check whether the segfault is still there?
I’m running into problems with cmake; no time to resolve them now.
Do you think the segfault will be resolved in the prebuilt v0.3.0 binaries? Maybe this needs a test, if there isn’t one already.
I tested on v0.3.0 and the segfault isn’t there, so it’ll probably be fine.
We’ll be testing pre-built v0.3.0 binaries on the tutorials so the segfault will be caught if it still exists.
OK, thanks. Any estimate of when v0.3 is arriving? I think I can live with my workaround for now; I’m not using it in anything crucial.