Autograd error for sparse tensor, PyTorch 1.6

Hello, I found that autograd does not work properly for the multiplication of a sparse and a dense tensor.
A sample script is listed below.

import torch

# Dense values with requires_grad; they become the values of the COO tensor.
aa = torch.rand([4, 3], requires_grad=True)

# COO indices for four entries in a 3x3 sparse layout.
ind = torch.tensor(
    [
        [0, 0, 1, 2],
        [1, 2, 0, 0]
    ]
).long()

# Densify the sparse tensor and reduce over dim 1; yy has shape [3, 3].
yy = torch.sparse_coo_tensor(ind, aa).to_dense().sum(1)
bb = torch.rand(3, 4)

yy = torch.einsum('ij,jk->ik', yy, bb)  # Without this line, no error occurs

dx = torch.autograd.grad(yy.sum(), aa)[0]
print(dx)

Running this code, I get the error message below:

Traceback (most recent call last):
File "coo_tensor_deriv_test.py", line 16, in <module>                                                                                                                  
    dx = torch.autograd.grad(yy.sum(), aa)[0]
File "/home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/autograd/__init__.py", line 192, in grad
    inputs, allow_unused)
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead. Exception raised from view at /pytorch/aten/src/ATen/native/TensorShape.cpp:1568 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7fca078141e2 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: at::native::view(at::Tensor const&, c10::ArrayRef<long>) + 0x25c (0x7fca3f7079dc in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)                                                                           
frame #2: <unknown function> + 0x10f9579 (0x7fca3f962579 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #3: <unknown function> + 0x1114253 (0x7fca3f97d253 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #4: at::Tensor::view(c10::ArrayRef<long>) const + 0xdf (0x7fca3fbb137f in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #5: <unknown function> + 0xf5a67d (0x7fca3f7c367d in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #6: at::native::sparse_mask_cpu(at::Tensor const&, at::Tensor const&) + 0x73 (0x7fca3f7c44d3 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #7: <unknown function> + 0x1284419 (0x7fca3faed419 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #8: <unknown function> + 0xa56530 (0x7fca3f2bf530 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #9: at::Tensor c10::Dispatcher::call<at::Tensor, at::Tensor const&, at::Tensor const&>(c10::TypedOperatorHandle<at::Tensor (at::Tensor const&, at::Tensor const&)> const&, at::Tensor const&, at::Tensor const&) const + 0xbc (0x7fca3faa781c in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #10: at::Tensor::sparse_mask(at::Tensor const&) const + 0x4b (0x7fca3fb8e36b in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #11: <unknown function> + 0x2e9c6be (0x7fca417056be in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #12: <unknown function> + 0xa56530 (0x7fca3f2bf530 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #13: at::Tensor c10::Dispatcher::call<at::Tensor, at::Tensor const&, at::Tensor const&>(c10::TypedOperatorHandle<at::Tensor (at::Tensor const&, at::Tensor const&)> const&, at::Tensor const&, at::Tensor const&) const + 0xbc (0x7fca3faa781c in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #14: at::Tensor::sparse_mask(at::Tensor const&) const + 0x4b (0x7fca3fb8e36b in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #15: at::native::to_dense_backward(at::Tensor const&, at::Tensor const&) + 0x90 (0x7fca3f6ca6f0 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #16: <unknown function> + 0x129df90 (0x7fca3fb06f90 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #17: <unknown function> + 0x2de5b29 (0x7fca4164eb29 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #18: <unknown function> + 0xa56530 (0x7fca3f2bf530 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #19: at::Tensor c10::Dispatcher::call<at::Tensor, at::Tensor const&, at::Tensor const&>(c10::TypedOperatorHandle<at::Tensor (at::Tensor const&, at::Tensor const&)> const&, at::Tensor const&, at::Tensor const&) const + 0xbc (0x7fca3faa781c in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #20: at::to_dense_backward(at::Tensor const&, at::Tensor const&) + 0x4b (0x7fca3f9f8e2b in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #21: torch::autograd::generated::ToDenseBackward::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&) + 0x165 (0x7fca415a3f25 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #22: <unknown function> + 0x3375bb7 (0x7fca41bdebb7 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #23: torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&, std::shared_ptr<torch::autograd::ReadyQueue> const&) + 0x1400 (0x7fca41bda400 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #24: torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&) + 0x451 (0x7fca41bdafa1 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #25: torch::autograd::Engine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>) + 0x37c (0x7fca41bd86bc in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #26: torch::autograd::python::PythonEngine::execute_with_graph_task(std::shared_ptr<torch::autograd::GraphTask> const&, std::shared_ptr<torch::autograd::Node>) + 0x3c (0x7fca4f37376c in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #27: torch::autograd::Engine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&) + 0x803 (0x7fca41bd79f3 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
frame #28: torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&) + 0x4e (0x7fca4f37356e in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #29: THPEngine_run_backward(THPEngine*, _object*, _object*) + 0xa54 (0x7fca4f374254 in /home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #43: __libc_start_main + 0xf5 (0x7fca58637c05 in /lib64/libc.so.6)
frame #44: python() [0x400bda]

I cannot figure out where this error comes from. What is causing it?

Hi,

I am afraid the error comes from the fact that einsum does not support autograd for sparse inputs.
You can open a feature request on GitHub if you want this, but it is most likely going to be quite a bit of work.
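
If you only need the densified result, one possible workaround (an untested sketch, not an official fix) is to avoid the sparse tensor entirely and scatter the values into a dense tensor with index_put, which does support autograd:

import torch

aa = torch.rand([4, 3], requires_grad=True)
ind = torch.tensor(
    [
        [0, 0, 1, 2],
        [1, 2, 0, 0]
    ]
).long()

# Scatter the values into a dense [3, 3, 3] tensor directly, so the backward
# graph goes through index_put instead of ToDenseBackward. accumulate=True
# sums duplicate indices, matching what to_dense() does for COO entries.
dense = torch.zeros(3, 3, 3).index_put((ind[0], ind[1]), aa, accumulate=True)

yy = dense.sum(1)
bb = torch.rand(3, 4)
yy = torch.einsum('ij,jk->ik', yy, bb)

dx = torch.autograd.grad(yy.sum(), aa)[0]
print(dx)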

Thank you, albanD.

Actually, I use this kind of tensor (sparse_tensor.to_dense().sum()) with torch.nn.Linear.
I’m not sure whether torch.nn.Linear uses einsum, but the error with torch.nn.Linear might be the same problem (lack of autograd support for sparse tensors); a sketch of what I mean is below.
I’ll open a feature request soon.
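
A minimal sketch of that nn.Linear variant (an illustration of the usage described above, not the actual code; shapes are hypothetical, with aa and ind as in the original script):

import torch

aa = torch.rand([4, 3], requires_grad=True)
ind = torch.tensor([[0, 0, 1, 2], [1, 2, 0, 0]])

yy = torch.sparse_coo_tensor(ind, aa).to_dense().sum(1)  # shape [3, 3]

# nn.Linear in place of the einsum; per the report above, the backward
# pass should hit the same ToDenseBackward failure on PyTorch 1.6.
lin = torch.nn.Linear(3, 4)
out = lin(yy)

dx = torch.autograd.grad(out.sum(), aa)[0]
print(dx)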

Oh, actually the stack trace points to ToDenseBackward indeed. So the .to_dense() might be the issue here.
But as mentioned above, the backward for sparse Tensors is not well supported, I’m afraid.
Could you try to_dense().clone().sum(1), as in the sketch below?
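
That is, roughly this one-line change to the original script:

# Suggested tweak: clone right after to_dense() so the sum sees a fresh
# contiguous tensor instead of the direct output of ToDense.
yy = torch.sparse_coo_tensor(ind, aa).to_dense().clone().sum(1)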

Unfortunately, to_dense().clone().sum() does not work. 🙁
But thank you for your kind suggestion!

P.S. I have opened a feature request for this problem.
