I want to write a custom nn.Linear layer, but with the weights stored as a sparse matrix.
How expensive is it to create the sparse matrix "on the fly" at every iteration, just so that the gradient can flow back to the tensor of its values?
And how can the code below be optimized?
If I don't create a new sparse matrix at each iteration of the loop, an exception is thrown after the first iteration:

"RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward."

I can't get rid of this error no matter what I try.
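Here is a minimal reproduction of that failure, with placeholder data I made up for illustration (my real indices/values are different). The sparse tensor is built once outside the loop, so its piece of the autograd graph is freed by the first backward(), and the second backward() raises the RuntimeError quoted above:

```python
import torch
import torch.optim as optim

torch.manual_seed(0)

# Placeholder data, made up for illustration; the real tensors differ.
indices = [[0, 1], [0, 2]]                      # COO coordinates: rows, cols
values = torch.tensor([10.0, 20.0], requires_grad=True)
target = torch.randn(2, 1)
inputs = torch.randn(3, 1)
optimizer = optim.Adam([values], lr=0.1)

# Sparse tensor built ONCE: the graph from `values` to `tensor`
# is freed after the first backward().
tensor = torch.sparse_coo_tensor(indices, values, (2, 3))

failed = False
for _ in range(2):
    y = torch.sparse.mm(tensor, inputs)
    loss = ((target - y) ** 2).sum()
    optimizer.zero_grad()
    try:
        loss.backward()
    except RuntimeError:
        # "Trying to backward through the graph a second time ..."
        failed = True
        break
    optimizer.step()
```

The first iteration succeeds; the second one hits the error, because `tensor` is a non-leaf whose grad_fn points into the already-freed graph.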
import torch
import torch.optim as optim

if __name__ == '__main__':
    indices = [, ]
    values = torch.FloatTensor([10.0]).requires_grad_()
    target = torch.FloatTensor([ , ])
    inputs = torch.FloatTensor([ , , ])
    optimizer = optim.Adam([values], lr=0.1)
    for _ in range(200):
        tensor = torch.sparse_coo_tensor(indices, values, (2, 3))
        y = torch.sparse.mm(tensor, inputs)
        loss = ((target - y) ** 2).sum()
        print(y.tolist(), values)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
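For completeness: with placeholder data filled in (made up for illustration; my real tensors differ), rebuilding the sparse tensor inside the loop does train, and the loss decreases:

```python
import torch
import torch.optim as optim

torch.manual_seed(0)

# Placeholder data, made up for illustration; the real tensors differ.
indices = [[0, 1], [0, 2]]
values = torch.tensor([10.0, 20.0], requires_grad=True)
inputs = torch.randn(3, 1)
# A target that is reachable by some setting of `values`.
target = torch.sparse.mm(
    torch.sparse_coo_tensor(indices, torch.tensor([1.0, -2.0]), (2, 3)),
    inputs,
)

optimizer = optim.Adam([values], lr=0.1)
losses = []
for _ in range(200):
    # Rebuilding the sparse tensor each iteration gives backward()
    # a fresh graph from `values`, so no retain_graph is needed.
    tensor = torch.sparse_coo_tensor(indices, values, (2, 3))
    y = torch.sparse.mm(tensor, inputs)
    loss = ((target - y) ** 2).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

So the per-iteration rebuild is the only variant I have found that runs, and my question is whether that construction cost can be avoided.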