I have been using optimizers such as Adadelta and SGD without any problems, but when I switched the optimizer to Adagrad I got the following error. Is there some difference in how Adagrad must be used compared to the other optimizers?
Traceback (most recent call last):
  File "trainer.py", line 187, in <module>
    t.train()
  File "trainer.py", line 102, in train
    train_loss, train_acc = self.train_step(self.data.train)
  File "trainer.py", line 154, in train_step
    self.optimizer.step()
  File "/usr/local/lib/python2.7/dist-packages/torch/optim/adagrad.py", line 80, in step
    state['sum'].addcmul_(1, grad, grad)
TypeError: addcmul_ received an invalid combination of arguments - got (int, torch.cuda.FloatTensor, torch.cuda.FloatTensor), but expected one of:
 * (torch.FloatTensor tensor1, torch.FloatTensor tensor2)
 * (torch.SparseFloatTensor tensor1, torch.SparseFloatTensor tensor2)
 * (float value, torch.FloatTensor tensor1, torch.FloatTensor tensor2)
      didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, torch.cuda.FloatTensor)
 * (float value, torch.SparseFloatTensor tensor1, torch.SparseFloatTensor tensor2)
      didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, torch.cuda.FloatTensor)
I construct and call the optimizer from torch.optim as follows:
self.optimizer = optim.Adagrad(
    filter(lambda p: p.requires_grad, self.model.parameters()),
    lr=1e-2, weight_decay=0.1)
self.optimizer.zero_grad()
self.optimizer.step()
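For context, the line that fails, state['sum'].addcmul_(1, grad, grad), is Adagrad accumulating the squared gradient into its per-parameter state, something plain SGD does not maintain. A minimal pure-Python sketch of that update over scalar parameters (illustrative only, not PyTorch's actual implementation):

```python
import math

def adagrad_step(params, grads, sums, lr=1e-2, eps=1e-10):
    """One Adagrad step over scalar parameters: accumulate squared
    gradients, then scale each update by 1/sqrt(accumulator)."""
    # sums[i] += grads[i] * grads[i] -- the role of the failing
    # state['sum'].addcmul_(1, grad, grad) line in the traceback
    new_sums = [s + g * g for s, g in zip(sums, grads)]
    # p -= lr * g / (sqrt(sum) + eps): steps shrink as the accumulator grows
    new_params = [p - lr * g / (math.sqrt(s) + eps)
                  for p, g, s in zip(params, grads, new_sums)]
    return new_params, new_sums

# One step from p = 1.0 with gradient 2.0 and an empty accumulator
params, sums = adagrad_step([1.0], [2.0], [0.0])
```

Because Adagrad carries this extra state tensor alongside the gradients, both must live on the same device for the in-place accumulation to work.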