Detach, no_grad and requires_grad

torch.no_grad: yes, in general you can wrap the eval phase in it.
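
For example, here is a minimal sketch of an eval-phase forward pass wrapped in `torch.no_grad()`; the linear model and random inputs are just stand-ins for your own network and validation data:

```python
import torch
import torch.nn as nn

# Placeholder model and batch standing in for a real network and val data.
model = nn.Linear(10, 2)
inputs = torch.randn(4, 10)

model.eval()                  # switch layers like dropout/batchnorm to eval behaviour
with torch.no_grad():         # no autograd graph is built, saving memory and compute
    outputs = model(inputs)

print(outputs.requires_grad)  # False: nothing inside the block will be differentiated
```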

detach(), on the other hand, should not be needed if you’re building classic CNN-like architectures; it is usually reserved for trickier operations.
detach() is useful when you want to compute something that you can’t / don’t want to differentiate through. For example, if you compute some indices from the output of the network and then want to use them to index another tensor: the indexing operation is not differentiable with respect to the indices, so you should detach() the indices before using them.
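
Here is a sketch of that indexing pattern; the names `net` and `table` and the round-to-index step are made up for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical setup: the network predicts continuous positions that are
# rounded into indices and used to look up rows of a separate table.
net = nn.Linear(8, 1)
table = torch.randn(10, 3, requires_grad=True)

x = torch.randn(4, 8)
pos = net(x).squeeze(1)                             # differentiable output, shape (4,)

# Indexing is not differentiable wrt the indices, so detach them first.
idx = pos.detach().round().long().clamp(0, table.size(0) - 1)

selected = table[idx]                               # gradients still flow into `table`
selected.sum().backward()

print(table.grad.shape)                             # torch.Size([10, 3])
print(idx.requires_grad)                            # False: the indices are out of the graph
```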
