In the doc it says:
retain_variables (bool): If ``True``, buffers necessary for computing
    gradients won't be freed after use. It is only necessary to
    specify ``True`` if you want to differentiate some subgraph multiple
    times (in some cases it will be much more efficient to use
    `autograd.backward`).
One way to understand it is that retain_variables=True keeps all the variables (buffers) needed for computing gradients around after the backward pass, while retain_variables=False frees them once they have been used.
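To make it concrete, here is the kind of minimal experiment I would like to understand, written against the old Variable API quoted below. My guess is that the second backward() call only works because the first one passed retain_variables=True, but that is exactly what I want to confirm:

import torch
from torch.autograd import Variable

x = Variable(torch.ones(2, 2), requires_grad=True)
y = (x * x).sum()

# First backward pass; ask autograd to keep the intermediate buffers.
y.backward(retain_variables=True)
print(x.grad)

# Second backward pass over the same graph. My understanding is that
# this would fail with a RuntimeError if retain_variables had been
# left at False above, because the buffers would already be freed.
y.backward()
print(x.grad)  # gradients accumulate in the leaf variable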
However, I am not sure I understand it properly, and I would like to see the difference in code and output. So I tried to look into the source code, and here is as far as I could get:
def backward(self, gradient=None, retain_variables=False):
    """Computes the gradient of current variable w.r.t. graph leaves.

    The graph is differentiated using the chain rule. If the variable is
    non-scalar (i.e. its data has more than one element) and requires
    gradient, the function additionally requires specifying ``gradient``.
    It should be a tensor of matching type and location, that contains
    the gradient of the differentiated function w.r.t. ``self``.

    This function accumulates gradients in the leaves - you might need to zero
    them before calling it.

    Arguments:
        gradient (Tensor): Gradient of the differentiated function
            w.r.t. the data. Required only if the data has more than one
            element. Type and location should match these of ``self.data``.
        retain_variables (bool): If ``True``, buffers necessary for computing
            gradients won't be freed after use. It is only necessary to
            specify ``True`` if you want to differentiate some subgraph multiple
            times (in some cases it will be much more efficient to use
            `autograd.backward`).
    """
    if self.volatile:
        raise RuntimeError('calling backward on a volatile variable')
    if gradient is None and self.requires_grad:
        if self.data.numel() != 1:
            raise RuntimeError(
                'backward should be called only on a scalar (i.e. 1-element '
                'tensor) or with gradient w.r.t. the variable')
        gradient = self.data.new().resize_as_(self.data).fill_(1)
->  self._execution_engine.run_backward((self,), (gradient,), retain_variables)
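As a side note, this is how I read the scalar check and the ``gradient`` argument: for a 1-element result the gradient is filled with ones automatically (the fill_(1) branch), while for a non-scalar output I apparently have to pass it myself. The sketch below is my own reading, not something I have verified against this exact version:

import torch
from torch.autograd import Variable

x = Variable(torch.ones(3), requires_grad=True)

# Scalar output: gradient defaults to a tensor of ones.
(x * x).sum().backward()

# Non-scalar output: data.numel() != 1, so a gradient tensor matching
# y's type and shape has to be passed explicitly, otherwise the
# RuntimeError in the source above is raised.
y = x * 2
y.backward(torch.ones(3))
print(x.grad)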
Apparently, to see what exactly retain_variables does in run_backward, I have to go at least one level deeper, but stepping in pdb won't take me there; it just returns None. So I am stuck.
Could anyone help me here? Thanks!