Any difference between optim.Adagrad and other optimizers?

I used to use optimizers like Adadelta/SGD, but when I switched the optimizer to Adagrad, I got the following error. So, is there any difference between using Adagrad and using other optimizers?

Traceback (most recent call last):
  File "trainer.py", line 187, in <module>
    t.train()
  File "trainer.py", line 102, in train
    train_loss, train_acc = self.train_step(self.data.train)
  File "trainer.py", line 154, in train_step
    self.optimizer.step()
  File "/usr/local/lib/python2.7/dist-packages/torch/optim/adagrad.py", line 80, in step
    state['sum'].addcmul_(1, grad, grad)
TypeError: addcmul_ received an invalid combination of arguments - got (int, torch.cuda.FloatTensor, torch.cuda.FloatTensor), but expected one of:
 * (torch.FloatTensor tensor1, torch.FloatTensor tensor2)
 * (torch.SparseFloatTensor tensor1, torch.SparseFloatTensor tensor2)
 * (float value, torch.FloatTensor tensor1, torch.FloatTensor tensor2)
      didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, torch.cuda.FloatTensor)
 * (float value, torch.SparseFloatTensor tensor1, torch.SparseFloatTensor tensor2)
      didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, torch.cuda.FloatTensor)

I use the optimizer in the following way:

self.optimizer = optim.Adagrad(filter(lambda p: p.requires_grad, self.model.parameters()),
                               lr=1e-2, weight_decay=0.1)
self.optimizer.zero_grad()
self.optimizer.step()

I can't see what's wrong, since it works fine on my machine. Maybe try updating PyTorch to the latest version?
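
If it helps, you can check which version you are currently running with:

import torch
print(torch.__version__)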

Hi @ShawnGuo, you might try defining your criterion after moving your model's parameters to the GPU.
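
Something along these lines is what I mean (just a minimal sketch with a placeholder model and dummy data, not your actual setup). I am also creating the optimizer after the .cuda() call, because, if I remember correctly, Adagrad allocates its state['sum'] buffers from the parameters at construction time, unlike SGD/Adadelta which create their state lazily inside step(), so the creation order matters here:

import torch
from torch.autograd import Variable
import torch.nn as nn
import torch.optim as optim

# placeholder model, just to illustrate the ordering
model = nn.Linear(10, 2)
model.cuda()  # move parameters to the GPU first

# criterion and optimizer are defined only after .cuda(), so any state
# built from the parameters (e.g. Adagrad's 'sum' buffers) is on the GPU
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adagrad(filter(lambda p: p.requires_grad, model.parameters()),
                          lr=1e-2, weight_decay=0.1)

# dummy batch (placeholders)
inputs = Variable(torch.randn(4, 10).cuda())
targets = Variable(torch.LongTensor([0, 1, 0, 1]).cuda())

optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step()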