Hello again

Sorry for asking so many questions this week.

When I construct a simple network:

```
model_NN = nn.Sequential(nn.Linear(4096, 100),
                         nn.Linear(100, 2))
optim = torch.optim.SparseAdam(model_NN.parameters(), lr=lr)
myloss = nn.CrossEntropyLoss()
epochs = 1000
torch.manual_seed(48)
my_x = train_x_tensor.view(train_x_tensor.size(0), -1)
for epoch in range(epochs):
    prediction = model_NN(my_x)
    loss = myloss(train_y_tensor, prediction)
    optim.zero_grad()
    loss.backward()
    optim.step()
    print("Loss in epoch {}/{} is {}".format(epoch, epochs, loss.item()))
```

the program raises:

```
AssertionError: nn criterions don't compute the gradient w.r.t. targets - please mark these tensors as not requiring gradients
```
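
For reference, this assertion is consistent with the two arguments to the loss being swapped: `nn.CrossEntropyLoss` takes the model output (logits) first and the target class indices second, and the target must not require gradients. Below is a minimal sketch of the expected call order. The tensors `x` and `y` are hypothetical stand-ins for `my_x` and `train_y_tensor`, and their sizes are assumptions made only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(48)

# Hypothetical stand-ins for the real data: 8 samples, 4096 features, 2 classes.
x = torch.randn(8, 4096)
y = torch.randint(0, 2, (8,))  # integer class indices; does not require grad

model = nn.Sequential(nn.Linear(4096, 100), nn.Linear(100, 2))
criterion = nn.CrossEntropyLoss()

logits = model(x)            # shape (8, 2); part of the autograd graph
loss = criterion(logits, y)  # input (logits) first, target second
loss.backward()              # gradients flow to the model parameters only
```

With this order, the tensor that requires gradients is the input, and the target is a plain integer tensor.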

I found that more than 50% of the entries in my input tensor are zero (i.e. it is a sparse tensor). How can I handle this with the `nn` classes?
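
One possible direction, sketched under assumptions: a dense tensor that happens to contain many zeros can be fed to `nn.Linear` like any other input, and the resulting gradients are dense. `torch.optim.SparseAdam` is intended for parameters that produce sparse gradients (for example `nn.Embedding(..., sparse=True)`), so a dense optimizer such as `torch.optim.Adam` may be the better fit here. The data, the 60% zero ratio, the learning rate, and the epoch count below are all hypothetical values chosen for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(48)

# Hypothetical data: 16 samples, 4096 features, about 60% of entries set to zero.
x = torch.randn(16, 4096)
x[torch.rand_like(x) < 0.6] = 0.0
y = torch.randint(0, 2, (16,))  # integer class indices

model = nn.Sequential(nn.Linear(4096, 100), nn.Linear(100, 2))
# Dense optimizer; the learning rate is an assumed value.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

losses = []
for epoch in range(5):
    optimizer.zero_grad()
    loss = criterion(model(x), y)  # logits first, targets second
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

The zeros in `x` need no special handling in this setup; they simply contribute nothing to the corresponding weight gradients.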