While computing VGG perceptual loss, I feel it is alright to wrap the computation of the VGG features for the ground-truth (GT) image inside `torch.no_grad()`, although I have not seen this done. So basically I feel the following will be alright:
```python
with torch.no_grad():
    gt_vgg_features = self.vgg_features(gt)
nw_op_vgg_features = self.vgg_features(nw_op)
# Now compute L1 loss
```
or should one use:
```python
gt_vgg_features = self.vgg_features(gt)
nw_op_vgg_features = self.vgg_features(nw_op)
```
In both approaches, `requires_grad` for the VGG parameters is set to `False` and the VGG model is put in `eval()` mode.
The first approach will save a lot of GPU memory, and I feel it should be numerically equal to the second one, since no backpropagation is required through the GT image's features. But in most implementations, I find the second approach used for computing VGG perceptual loss.
So which option should we go for when implementing VGG perceptual loss?