Is there a reason why the default value of `set_to_none` in `Adam.zero_grad()` isn't changed to `True`? The current default (`set_to_none=False`) zeroes the gradients in place, so the optimizer still updates parameters even when they weren't used to compute the loss, since Adam's momentum buffers keep producing non-zero steps for a zero gradient. I suppose this also holds true for other optimisers that keep running statistics. Setting it to `True` seems like what you would expect, no?
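
For reference, here is a minimal sketch of the behaviour I mean (assuming a PyTorch version where `zero_grad` accepts the `set_to_none` argument; the single-parameter setup is just for illustration):

```python
import torch

# A single parameter standing in for one that stops receiving
# gradients after the first step.
p = torch.nn.Parameter(torch.ones(1))
opt = torch.optim.Adam([p], lr=0.1)

# Step 1: normal backward/step, so Adam builds up momentum state.
(p ** 2).sum().backward()
opt.step()
after_first = p.detach().clone()

# Step 2: zero the gradient in place instead of setting it to None.
# The .grad tensor still exists (all zeros), so Adam's momentum
# buffers still produce a non-zero update for this "unused" parameter.
opt.zero_grad(set_to_none=False)
opt.step()
print(p.detach() - after_first)  # non-zero: the parameter moved anyway

# With set_to_none=True, .grad becomes None and the optimizer skips
# the parameter entirely, so it stays put.
opt.zero_grad(set_to_none=True)
assert p.grad is None
before = p.detach().clone()
opt.step()
assert torch.equal(p.detach(), before)  # no update
```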