Hi all, I’d like to implement CP decomposition to downsize tensors. As far as I know, PyTorch does not support this kind of factorization for tensors with more than 2 dimensions (I know there are some factorization functions for the 2D case).

CP decomposition factorizes an `I*J*K` tensor (call it X) into three factor matrices of shapes `I*R`, `J*R`, and `K*R` (U, V, W respectively). Naively, I use stochastic gradient descent to fit U, V, and W. To reconstruct X, you sum R rank-one tensors `Y_1, Y_2, ..., Y_R`, where

```
Y_r = outer(outer(U[:,r], V[:,r]), W[:,r])
```
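In entry form this means `Y_r[i, j, k] = U[i, r] * V[j, r] * W[k, r]`. A quick sanity check of that identity with small made-up shapes:

```python
import torch

# Hypothetical small sizes just for the check
I, J, K, rank, r = 2, 3, 4, 5, 0
U, V, W = torch.rand(I, rank), torch.rand(J, rank), torch.rand(K, rank)

# Rank-one component: (I, J) outer product broadcast against the (K,) vector
Y_r = torch.ger(U[:, r], V[:, r]).unsqueeze(2) * W[:, r]

# Entry-wise definition: Y_r[i, j, k] == U[i, r] * V[j, r] * W[k, r]
expected = U[:, r].view(I, 1, 1) * V[:, r].view(1, J, 1) * W[:, r].view(1, 1, K)
print(torch.allclose(Y_r, expected))  # True
```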

I implemented this naively as below,

```
def outer(t1, t2):
    # t1: (h, w) matrix, t2: (c,) vector -> (h, w, c) outer product
    h, w = t1.size()
    c = t2.size(0)
    return t2.repeat(h, w, 1) * t1.unsqueeze(2).repeat(1, 1, c)

out = torch.zeros(I, J, K)  # must start from zeros; torch.Tensor(I, J, K) is uninitialized
for r in range(R):
    out += outer(torch.ger(U[:, r], V[:, r]), W[:, r])
```

This works when X is small, but when X is large an OutOfMemoryError is thrown, because each `Y_r` is a full `I*J*K` tensor.
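For a sense of scale (the sizes here are made up), each intermediate `Y_r` costs `I*J*K*4` bytes in float32, so even moderately large dimensions blow up quickly:

```python
# Hypothetical sizes, just to illustrate the memory cost of one Y_r
I, J, K = 1000, 1000, 1000
bytes_per_float32 = 4

gib = I * J * K * bytes_per_float32 / 2**30
print(f"one Y_r: {gib:.2f} GiB")  # roughly 3.7 GiB per rank-one intermediate
```

And the loop materializes one of these per iteration (plus whatever autograd keeps for the backward pass).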

I think there are two ways to avoid this error. The first is to compute `ΣY_r` directly from `U, V, W` instead of composing the intermediate tensors from vectors. The other is to create and accumulate `outer(torch.ger(U[:, r], V[:, r]), W[:, r])` more efficiently. But so far I don’t have a concrete idea for either.
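For the first idea, something like `torch.einsum` might compute the whole sum in one expression, without materializing each `Y_r` separately or looping over r in Python (a sketch with made-up sizes, not tested at scale):

```python
import torch

# Hypothetical small sizes for illustration
I, J, K, R = 8, 9, 10, 4
U, V, W = torch.rand(I, R), torch.rand(J, R), torch.rand(K, R)

# Sum of R rank-one tensors in one shot: out[i,j,k] = sum_r U[i,r] * V[j,r] * W[k,r]
out = torch.einsum('ir,jr,kr->ijk', U, V, W)

# Agrees with the naive loop over r
naive = torch.zeros(I, J, K)
for r in range(R):
    naive += torch.ger(U[:, r], V[:, r]).unsqueeze(2) * W[:, r]
print(torch.allclose(out, naive))  # True
```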

Does anyone have a good solution to this? Thanks in advance.