In the docs it says:

```
retain_variables (bool): If ``True``, buffers necessary for computing
    gradients won't be freed after use. It is only necessary to
    specify ``True`` if you want to differentiate some subgraph multiple
    times (in some cases it will be much more efficient to use
    `autograd.backward`).
```
One way to understand it is that ``retain_variables=True`` means "keep all the variables or buffers associated with computing gradients", while ``retain_variables=False`` means their values are freed after use.
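If that reading is right, the difference should be observable in a run like the following (a minimal sketch on my part, assuming the old pre-0.4 ``Variable`` API that this docstring comes from; in current PyTorch the argument has been renamed ``retain_graph``):

```python
import torch
from torch.autograd import Variable

x = Variable(torch.ones(2, 2), requires_grad=True)
y = (x * x).sum()

# First backward pass: keep the intermediate buffers alive.
y.backward(retain_variables=True)
print(x.grad)  # d(sum(x*x))/dx = 2*x -> all entries 2

# Because the buffers were retained, backpropagating through the
# same graph a second time works; gradients accumulate in the leaves.
y.backward(retain_variables=True)
print(x.grad)  # accumulated: all entries 4

# With the default retain_variables=False on the first call, the
# second call should instead raise a RuntimeError saying the buffers
# have already been freed.
```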
However, I am not sure I understand it properly, and I would like to see the difference in actual code and output. So I tried to look into the source code, and here is as far as I can get:
```python
def backward(self, gradient=None, retain_variables=False):
    """Computes the gradient of current variable w.r.t. graph leaves.

    The graph is differentiated using the chain rule. If the variable is
    non-scalar (i.e. its data has more than one element) and requires
    gradient, the function additionally requires specifying ``gradient``.
    It should be a tensor of matching type and location, that contains
    the gradient of the differentiated function w.r.t. ``self``.

    This function accumulates gradients in the leaves - you might need
    to zero them before calling it.

    Arguments:
        gradient (Tensor): Gradient of the differentiated function
            w.r.t. the data. Required only if the data has more than one
            element. Type and location should match these of ``self.data``.
        retain_variables (bool): If ``True``, buffers necessary for computing
            gradients won't be freed after use. It is only necessary to
            specify ``True`` if you want to differentiate some subgraph
            multiple times (in some cases it will be much more efficient
            to use `autograd.backward`).
    """
    if self.volatile:
        raise RuntimeError('calling backward on a volatile variable')
    if gradient is None and self.requires_grad:
        if self.data.numel() != 1:
            raise RuntimeError(
                'backward should be called only on a scalar (i.e. 1-element '
                'tensor) or with gradient w.r.t. the variable')
        gradient = self.data.new().resize_as_(self.data).fill_(1)
    self._execution_engine.run_backward((self,), (gradient,), retain_variables)  # <- pdb stops here
```
Apparently, to see what exactly ``retain_variables`` does in ``run_backward``, I have to go at least one level deeper, but pdb won't take me there; stepping into the call just returns ``None``. So I am stuck.
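My suspicion (an assumption on my part, not something I have confirmed) is that pdb can't step in because the execution engine is implemented in the compiled C extension rather than in Python, so there is no Python frame to enter. Something like this should tell, though the exact class name will depend on the version:

```python
import inspect
from torch.autograd import Variable

engine = Variable._execution_engine
print(type(engine))                         # on my install: a class from torch._C
print(inspect.getfile(type(engine)))        # points at the compiled extension (.so)
print(inspect.getsourcefile(type(engine)))  # None: no Python source for pdb to step into
```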
Could anyone help me here? Thanks!