Manually compute and assign gradients of model parameters and Variables

JakobHavtorn · February 26, 2018, 11:34am

Thanks for your answer!

Yes, I’m aware of that, but threads like the ones below have left me a bit confused, for two reasons:

1. As far as I understand, I cannot readily assign gradients to the .grad field of a model parameter, since the gradient buffers are initialized lazily (on .backward()) and are therefore None until a (dummy) .backward() has been performed. It would be nice to have this functionality, but I can do without it (see Problem on Variable.grad.data?).
2. I suspect that my current implementation grows the computational graph at each gradient update by saving the computational history, as discussed in What is the recommended way to re-assign/update values in a variable (or tensor)?, since computation time increases with each iteration, but I’m not sure why. See also How does one make sure that the parameters are update manually in pytorch using modules?.
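To make the two points above concrete, here is a minimal sketch. It assumes a PyTorch release where Variable and Tensor are merged (0.4 or later), and the names `w`, `g` and the step size 0.1 are made up for illustration. It shows that `.grad` is None before any backward pass, and that an out-of-place update recorded by autograd produces a tensor carrying history, whereas an in-place update under `torch.no_grad()` does not.

```python
import torch

# A leaf tensor standing in for a model parameter (illustrative only).
w = torch.randn(3, requires_grad=True)

# The gradient buffer is created lazily, so it is None before any backward().
grad_before = w.grad

# Hypothetical manually computed gradient, not produced by autograd.
g = torch.ones(3)

# Out-of-place update tracked by autograd: the result is a new non-leaf
# tensor that remembers how it was computed. Reassigning a parameter this
# way on every iteration keeps extending the recorded history.
w_tracked = w - 0.1 * g

# In-place update with autograd disabled: w stays a leaf with no history.
with torch.no_grad():
    w -= 0.1 * g

print(grad_before, w_tracked.grad_fn, w.is_leaf, w.grad_fn)
```

If the update in the real code looks like the tracked variant, that would be consistent with the growing computation time, though this is only a guess without seeing the implementation.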

My “thoughts”:

# Compute the gradients, returning a list of Tensors
gradients = compute_gradients(input)
# Assign the gradients; but in which way?
for layer, p in enumerate(model.parameters()):
    # (1) This?
    p.grad.data = gradients[layer]

    # (2) What about this? (http://pytorch.org/docs/master/tensors.html#torch.Tensor.set_)
    p.grad.data.set_(gradients[layer])

    # (3) or this
    p.grad = Variable(gradients[layer])

    # (4) or versions using ._grad instead
    p._grad.data = gradients[layer]
    p._grad.data.set_(gradients[layer])
    p._grad = Variable(gradients[layer])
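For comparison, one way the assignment could look in a more recent PyTorch (0.4 or later, where the Variable wrapper is no longer needed) is sketched below. This is an assumption-laden illustration, not a statement of the recommended approach: `assign_gradients` is a hypothetical helper, and the all-ones gradients stand in for the output of `compute_gradients(input)`. It assigns to `.grad` directly when the buffer is still None and copies into the existing buffer otherwise, then lets a standard optimizer apply the step.

```python
import torch
import torch.nn as nn


def assign_gradients(model, gradients):
    """Hypothetical helper: write manually computed gradients into .grad."""
    for p, g in zip(model.parameters(), gradients):
        if p.grad is None:
            # No buffer yet (no backward() has run): assign a detached copy.
            p.grad = g.detach().clone()
        else:
            # Buffer exists: overwrite its contents in place.
            p.grad.copy_(g)


model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)

# Stand-in for compute_gradients(input): one tensor per parameter,
# matching each parameter's shape, dtype and device.
gradients = [torch.ones_like(p) for p in model.parameters()]

# Snapshot of the parameters before the step, for checking the update.
before = [p.detach().clone() for p in model.parameters()]

assign_gradients(model, gradients)
optimizer.step()  # plain SGD: p <- p - lr * p.grad
```

With plain SGD and all-ones gradients, every parameter entry should decrease by the learning rate (0.5 here), which gives a quick check that the assigned gradients were actually used.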