It's a general question, but currently I'm looking at this tutorial:
Why, in the optimize_model function, do we have
and in select_action we use
while, on the other hand, the doc:
mentions only requires_grad?
Of course I understand that we don't want to compute gradients here, but I don't fully understand the difference between these 3 methods…
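To make the comparison concrete, here is a minimal sketch of the three mechanisms as I understand them, assuming the three are .detach(), with torch.no_grad(): and requires_grad. The tensor names are made up for illustration and are not taken from the tutorial.

```python
import torch

# Leaf tensor that autograd tracks (hypothetical stand-in for network weights).
w = torch.ones(3, requires_grad=True)

# 1) .detach(): the multiplication is recorded, then the result is cut
#    off from the graph, so no gradient flows back through y.
y = (w * 2).detach()

# 2) torch.no_grad(): nothing computed inside the block is recorded at all.
with torch.no_grad():
    z = w * 2

# 3) requires_grad: a per-tensor flag. It defaults to False, so results
#    computed only from such tensors are not tracked.
c = torch.ones(3)
u = c * 2

# For comparison: the same operation with tracking left on.
tracked = w * 2

flags = {
    "detach": y.requires_grad,
    "no_grad": z.requires_grad,
    "requires_grad_false": u.requires_grad,
    "tracked": tracked.requires_grad,
}
print(flags)
```

In all three cases the result reports requires_grad=False; only the last, untouched computation stays attached to the graph. What I can't tell from this is when to prefer one over the others.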
Also, if I'm not mistaken, previous versions of PyTorch used volatile=True, which was considered more memory efficient (please correct me if I'm wrong) and which has now been replaced by with torch.no_grad():
So if we used with torch.no_grad(): in optimize_model, would that also be OK?
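A rough way to check this, as a sketch only: the tiny linear model, the random batch and the GAMMA value below are assumptions for illustration, not the tutorial's code. It compares a target built with .detach() against one built under torch.no_grad(), and then looks at what happens when the loss itself is computed under torch.no_grad().

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
net = nn.Linear(4, 2)        # hypothetical stand-in for the Q-network
states = torch.randn(8, 4)   # made-up batch of states
rewards = torch.randn(8)     # made-up rewards
GAMMA = 0.99                 # assumed discount factor

# (a) Target computed normally, then detached from the graph.
target_a = net(states).max(1)[0].detach() * GAMMA + rewards

# (b) Same target computed inside torch.no_grad().
with torch.no_grad():
    target_b = net(states).max(1)[0] * GAMMA + rewards

same_values = torch.allclose(target_a, target_b)

# The prediction stays tracked, so backward() reaches the weights.
pred = net(states)[:, 0]
loss = F.smooth_l1_loss(pred, target_b)
loss.backward()
has_grad = net.weight.grad is not None

# (c) If the loss itself is computed under no_grad, it has no graph,
#     so there is nothing to call backward() on.
with torch.no_grad():
    loss_c = F.smooth_l1_loss(net(states)[:, 0], target_b)

print(same_values, has_grad, loss_c.requires_grad)
```

If I read this right, wrapping only the target computation in torch.no_grad() gives the same values as .detach(), while wrapping the whole loss computation leaves nothing to backpropagate through.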
Could anyone please explain this to me?