I want to write a custom nn.Linear layer, but with the weights stored as a sparse matrix.
How expensive is it to create the sparse matrix "on the fly" at every iteration, just so that the gradient can flow back to the tensor of its values?
And how can the code below be optimized?
If I don't create a new sparse matrix at each iteration of the loop, an exception is thrown after the first iteration:

"RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward."

I can't get rid of this error no matter what I try.
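Here is a minimal reproduction of that failure, with placeholder data I made up for illustration (my real indices/values are different). The sparse tensor is built once outside the loop, so its piece of the autograd graph is freed by the first backward(), and the second backward() raises the RuntimeError quoted above:

```python
import torch
import torch.optim as optim

torch.manual_seed(0)

# Placeholder data, made up for illustration; the real tensors differ.
indices = [[0, 1], [0, 2]]                      # COO coordinates: rows, cols
values = torch.tensor([10.0, 20.0], requires_grad=True)
target = torch.randn(2, 1)
inputs = torch.randn(3, 1)
optimizer = optim.Adam([values], lr=0.1)

# Sparse tensor built ONCE: the graph from `values` to `tensor`
# is freed after the first backward().
tensor = torch.sparse_coo_tensor(indices, values, (2, 3))

failed = False
for _ in range(2):
    y = torch.sparse.mm(tensor, inputs)
    loss = ((target - y) ** 2).sum()
    optimizer.zero_grad()
    try:
        loss.backward()
    except RuntimeError:
        # "Trying to backward through the graph a second time ..."
        failed = True
        break
    optimizer.step()
```

The first iteration succeeds; the second one hits the error, because `tensor` is a non-leaf whose grad_fn points into the already-freed graph.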
import torch
import torch.optim as optim

if __name__ == '__main__':
    indices = [, ]
    values = torch.FloatTensor([10.0]).requires_grad_()
    target = torch.FloatTensor([ , ])
    inputs = torch.FloatTensor([ , , ])
    optimizer = optim.Adam([values], lr=0.1)
    for _ in range(200):
        tensor = torch.sparse_coo_tensor(indices, values, (2, 3))
        y = torch.sparse.mm(tensor, inputs)
        loss = ((target - y) ** 2).sum()
        print(y.tolist(), values)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
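For completeness: with placeholder data filled in (made up for illustration; my real tensors differ), rebuilding the sparse tensor inside the loop does train, and the loss decreases:

```python
import torch
import torch.optim as optim

torch.manual_seed(0)

# Placeholder data, made up for illustration; the real tensors differ.
indices = [[0, 1], [0, 2]]
values = torch.tensor([10.0, 20.0], requires_grad=True)
inputs = torch.randn(3, 1)
# A target that is reachable by some setting of `values`.
target = torch.sparse.mm(
    torch.sparse_coo_tensor(indices, torch.tensor([1.0, -2.0]), (2, 3)),
    inputs,
)

optimizer = optim.Adam([values], lr=0.1)
losses = []
for _ in range(200):
    # Rebuilding the sparse tensor each iteration gives backward()
    # a fresh graph from `values`, so no retain_graph is needed.
    tensor = torch.sparse_coo_tensor(indices, values, (2, 3))
    y = torch.sparse.mm(tensor, inputs)
    loss = ((target - y) ** 2).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

So the per-iteration rebuild is the only variant I have found that runs, and my question is whether that construction cost can be avoided.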