Why is my optimiser not updating?

I have written a simple SGD optimiser in PyTorch 1.1.4 as:

class Optimiser():
    def __init__(self, params, lr):
        self.params = params
        self.lr = lr

    def step(self):
        for p in self.params:
            p.data.add_(-self.lr, p.grad)

    def zero_grad(self):
        for p in self.params:
            p.grad.zero_()

My training loop is:

sgd = Optimiser(model.parameters(), 0.01)
for epoch in range(epochs):
    for i, data in enumerate(trainloader):
        xb = data[0]
        yb = data[1]
        pred = model(xb)
        loss = loss_func(pred, yb)
        loss.backward()
        sgd.step()
        sgd.zero_grad()

And the loss does not decrease at all. In fact, I put a print statement in step:

def step(self):
    for p in self.params:
        p.data.add_(-self.lr, p.grad)
        print('Here')

And ‘Here’ was only printed during the first step and stopped being printed after that, meaning that somehow I stop entering the parameter update loop. I don’t understand why this would happen.
If instead of sgd.step() I call

for p in model.parameters(): p.data.add_(-0.01, p.grad)

my model does start learning. So what is happening?

@ptrblck Can you help?

I would say that it’s not even entering the loop because you are exhausting the iterator. Pass the model instead and call model.parameters() inside step.
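A minimal sketch of that suggestion, reusing your class and the same update rule you already have (untested):

class Optimiser():
    def __init__(self, model, lr):
        self.model = model   # keep the model itself, not a one-shot generator
        self.lr = lr

    def step(self):
        # model.parameters() returns a fresh generator on every call
        for p in self.model.parameters():
            p.data.add_(-self.lr, p.grad)

    def zero_grad(self):
        for p in self.model.parameters():
            p.grad.zero_()

Then construct it as sgd = Optimiser(model, 0.01) and keep the rest of your loop unchanged.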

But PyTorch’s optimisers take in model.parameters() and seem to work. Why would mine not?

Because you are not doing the same thing they do. The code is quite different; having the same input does not mean it will behave the same way.
In simple words, a generator “extracts” parameters (internally) on demand. Once all the parameters have been extracted, the generator is empty.
That is exactly what happens to your for p in self.params: loop: the first call to step() exhausts the generator, so from the second call onwards there is nothing left to iterate over and ‘Here’ is never printed again.
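You can check this yourself with a tiny experiment (purely illustrative):

params = model.parameters()    # generator object
print(len(list(params)))       # number of parameter tensors – this consumes the generator
print(len(list(params)))       # 0 – nothing is left the second time

Now compare with what torch.optim.Optimizer.__init__ does: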

def __init__(self, params, defaults):
        torch._C._log_api_usage_once("python.optimizer")
        self.defaults = defaults

        if isinstance(params, torch.Tensor):
            raise TypeError("params argument given to the optimizer should be "
                            "an iterable of Tensors or dicts, but got " +
                            torch.typename(params))

        self.state = defaultdict(dict)
        self.param_groups = []

        param_groups = list(params)  # HERE: the generator is converted into a list – it is iterated once and its contents are stored
        if len(param_groups) == 0:
            raise ValueError("optimizer got an empty parameter list")
        if not isinstance(param_groups[0], dict):
            param_groups = [{'params': param_groups}]

        for param_group in param_groups:
            self.add_param_group(param_group)

First of all, they save the iterator in a list. Lists are iterables that can be iterated over again and again.
Generators (which is what you get internally when calling model.parameters()) are iterables that remain empty once iterated.

You can google the differences between generators and lists. Besides, you will also hit a bug if a gradient is None: something like ‘NoneType’ object has no attribute ‘data’.
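If you prefer to keep passing model.parameters(), you can do the same thing torch.optim does: materialise the generator in __init__ and also skip None gradients. A rough, untested sketch:

class Optimiser():
    def __init__(self, params, lr):
        self.params = list(params)   # iterate the generator once and store the tensors
        self.lr = lr

    def step(self):
        for p in self.params:
            if p.grad is None:       # e.g. a parameter that never received a gradient
                continue
            p.data.add_(-self.lr, p.grad)

    def zero_grad(self):
        for p in self.params:
            if p.grad is not None:
                p.grad.zero_()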


Ah, thanks. I thought it would create a new generator every time it was iterated.