In-place Cholesky Decomposition


I am using PyTorch’s Cholesky decomposition to compute the Cholesky factor of a very large matrix (16 GB).

I noticed that peak memory consumption doubles when calling torch.linalg.cholesky, apparently because a new tensor is allocated for the returned factor.

This causes an out-of-memory error on my 24 GB GPU. Is there an option in PyTorch to compute the Cholesky factor in place? Note that I do not need the backward pass, so it is fine to overwrite the input matrix completely. In principle, an in-place factorization should be possible and would avoid the second allocation.
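To make the idea concrete, here is a minimal sketch of what such an in-place routine could look like, written as a blocked (right-looking) Cholesky factorization on views of the input. The function name `blocked_cholesky_` and the `block_size` parameter are hypothetical and not part of PyTorch. The sketch assumes a single 2-D symmetric positive-definite matrix, and I have not measured its actual peak memory on a GPU.

```python
import torch


@torch.no_grad()
def blocked_cholesky_(A: torch.Tensor, block_size: int = 1024) -> torch.Tensor:
    """Overwrite the SPD matrix A with its lower Cholesky factor (hypothetical helper)."""
    n = A.shape[-1]
    for k in range(0, n, block_size):
        e = min(k + block_size, n)

        # Factor the small diagonal block; only a block_size x block_size temporary.
        A11 = A[k:e, k:e]
        L11 = torch.linalg.cholesky(A11)
        A11.copy_(L11)

        if e < n:
            # Panel below the diagonal block: solve X @ L11^T = A21 for X.
            A21 = A[e:, k:e]
            L21 = torch.linalg.solve_triangular(L11.mT, A21, upper=True, left=False)
            A21.copy_(L21)

            # Trailing update A22 -= L21 @ L21^T, written directly into A.
            A[e:, e:].addmm_(A21, A21.mT, alpha=-1)

    # The strict upper triangle still holds stale values; zero it in place.
    return A.tril_()
```

The explicit temporaries here are at most about n x block_size elements (the panel `L21`), rather than a second n x n matrix. The underlying kernels may still allocate workspace of their own, so the real saving would need to be checked, for example with torch.cuda.max_memory_allocated. Separately, torch.linalg.cholesky accepts an `out=` keyword, but I do not know whether passing the input tensor as `out` avoids the extra allocation.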