I am facing a problem related to indexing a Variable's elements inside my loss function.
How can I select only a few positions from my vector without interfering with the graph?
model = Foo()
y_pred = model(X)
dist = criterion(y_pred, y)
# Apparently this line breaks the graph; how can I select only a few
# positions from my vector without interfering with the graph?
dist = dist[dist > 0.]
loss = MarginRankingLoss(dist, 0, 1)
optimizer.zero_grad()
loss.backward()
optimizer.step()
File "/home/rfelixmg/anaconda2/lib/python2.7/site-packages/torch/nn/_functions/loss.py", line 155, in backward
    input1, input2, y = self.saved_tensors
RuntimeError: Trying to backward through the graph second time, but the buffers have already been freed. Please specify retain_variables=True when calling backward for the first time.

Traceback (most recent call last):
  File "/home//anaconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2869, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in
    loss1.backward()
  File "/home//anaconda2/lib/python2.7/site-packages/torch/autograd/variable.py", line 146, in backward
    self._execution_engine.run_backward((self,), (gradient,), retain_variables)
  File "/home/*/anaconda2/lib/python2.7/site-packages/torch/autograd/function.py", line 137, in backward
    raise NotImplementedError
NotImplementedError
The error seems to be that you tried to call backward twice on the same graph without specifying retain_variables=True during the first backward.
You can also try to change dist = dist[dist > 0.] to dist = dist[(dist > 0.).detach()] to force the indices to be properly detached.
In any case, your code should work, so please let us know whether explicitly detaching solves the issue.
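For reference, here is a minimal, self-contained sketch of that detached-mask pattern (the tensors are toy stand-ins, not your actual model or criterion):

import torch
from torch.autograd import Variable

x = Variable(torch.randn(5), requires_grad=True)
dist = x * 2  # stand-in for criterion(y_pred, y)

# detach the boolean mask so the comparison is not recorded in the graph;
# the selected values themselves still carry gradients back to x
mask = (dist > 0.).detach()
selected = dist[mask]

selected.sum().backward()
print(x.grad)  # non-zero only at the selected positions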
Thank you for your feedback.
First, I tried the detach suggestion. Unfortunately, it did not work.
Error:
Traceback (most recent call last):
  File "/home/rfelixmg/Dropbox/PROJETOS/temporary/examplar_cnn.py", line 412, in
    model, pkg = main(pkg)
  File "/home/rfelixmg/Dropbox/PROJETOS/temporary/examplar_cnn.py", line 356, in main
    bin_target=bin_target)
  File "/home/rfelixmg/Dropbox/PROJETOS/temporary/examplar_cnn.py", line 86, in train
    loss.backward(retain_variables=True)
  File "/home/rfelixmg/anaconda2/lib/python2.7/site-packages/torch/autograd/variable.py", line 146, in backward
    self._execution_engine.run_backward((self,), (gradient,), retain_variables)
RuntimeError: there are no graph nodes that require computing gradients
However, my next attempt seems to work well:
dist_ = dist_[(dist_ > 0.0).detach()]
dist_ = Variable(dist_.data, requires_grad=True)
if batch == 0:
    loss.backward(retain_variables=True)
else:
    loss.backward()
The code is running fine now.
My question is whether the gradient will be computed correctly when doing Variable(dist_.data, requires_grad=True).
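Here is a small sanity check I put together (just a sketch; nn.Linear is a hypothetical stand-in for my actual model):

import torch
import torch.nn as nn
from torch.autograd import Variable

model = nn.Linear(3, 1)  # stand-in for my real model
dist = model(Variable(torch.rand(4, 3))).squeeze()

# re-wrap the data as a new Variable, as in my snippet above
rewrapped = Variable(dist.data, requires_grad=True)
rewrapped.sum().backward()

# if this prints None (or an all-zero gradient on older versions), then
# nothing flowed back through the re-wrap and the graph was cut there
print(model.weight.grad)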
The main reason I take only dist_ > 0. is that dist_ is a similarity matrix with diag(dist_) = 0.
If there is any other way to get only the triu (upper-triangular) elements as a vector, it would be perfect.
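For example, I imagine something like this fixed-mask selection (a sketch against the old Variable API; the random matrix is just a stand-in for my similarity matrix):

import torch
from torch.autograd import Variable

dist_ = Variable(torch.rand(4, 4), requires_grad=True)  # stand-in; real diag is 0

n = dist_.size(0)
# ByteTensor mask that is 1 strictly above the diagonal; since the mask is
# a constant, it cannot interfere with the graph
triu_mask = Variable(torch.triu(torch.ones(n, n), 1).byte())
triu_vec = dist_[triu_mask]  # 1-D Variable holding the triu entries

triu_vec.sum().backward()  # gradients reach only the triu positions of dist_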