On using 'with torch.no_grad' and 'detach' during testing

I noticed that during testing, requires_grad would still be True for the tensors we get from the model as outputs. Would using detach() reduce memory usage, or give any other benefit?
The testing code looks something like this:

my_model.eval()
with torch.no_grad():
    for batch_idx, batch in enumerate(test_dataloader):
        imgs = get_imgs(batch)
        output = my_model(imgs).detach()

Any ideas?

Using detach() won’t save any additional memory here, since output is not attached to a computation graph inside the no_grad() block.
If you print output, you’ll see that its grad_fn is missing, so detach() won’t change anything.
The gradients from a previous training run might still be allocated, though: torch.no_grad() only avoids storing the intermediate activations that would be needed for the backward pass, it does not free the parameters’ .grad attributes.
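As a minimal sketch (with a hypothetical two-layer model standing in for my_model), you can verify that under torch.no_grad() the output carries no grad_fn, so detach() is a no-op, and that previously accumulated gradients have to be cleared explicitly:

import torch
import torch.nn as nn

# Hypothetical stand-in for my_model and a single batch of inputs
my_model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
my_model.eval()
imgs = torch.randn(4, 8)

with torch.no_grad():
    output = my_model(imgs)

print(output.requires_grad)  # False
print(output.grad_fn)        # None -> nothing for detach() to remove

# Gradients from a previous training step live in the parameters'
# .grad attributes and are not freed by no_grad(); clear them
# explicitly if you want to release that memory.
my_model.zero_grad(set_to_none=True)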
