Thanks for your reply.
Setting the target class for `gradient=` is used when visualizing what the model does with respect to a given input image (i.e. to see what the model focuses on when classifying objects).
See https://github.com/utkuozbulak/pytorch-cnn-visualizations#gradient-visualization.
Since I am doing regression rather than classification, I tried to modify the source code above.
Given an input image, my model should predict the pixelwise scene-depth (i.e. distance to the camera in meters). Now, I want to visualize what features my model thinks are important for doing depth prediction.
General question: is it correct to assume that a large dy/dx (i.e. a large `x.grad`) for a particular pixel means that the model pays more attention to that pixel?
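For context, here is a minimal sketch of what I mean (the model and input here are hypothetical placeholders, not my actual network):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a depth-regression network
model = nn.Conv2d(3, 1, kernel_size=3, padding=1)

x = torch.rand(1, 3, 8, 8, requires_grad=True)  # input image
depth = model(x)                                 # predicted depth map

# Backprop through the scalar sum of the output;
# x.grad then holds d(sum(depth))/dx for every input pixel
depth.sum().backward()

saliency = x.grad.abs()  # large values = pixels the model is sensitive to
print(saliency.shape)    # same shape as the input
```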
After following your derivations, I think that setting `gradient=` to my ground-truth depth does not make sense at all. It would only introduce a weighting factor that favors pixels farther from the camera (which could still be very important for the overall depth prediction). I think it would make more sense to set `gradient=` to `torch.ones(target.shape)`.
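Concretely, this is what I have in mind (shapes and the model are illustrative only):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 1, kernel_size=3, padding=1)  # hypothetical depth net
x = torch.rand(1, 3, 8, 8, requires_grad=True)

pred = model(x)  # pixelwise depth prediction

# Instead of pred.backward(gradient=ground_truth_depth), which would weight
# far-away pixels more strongly, weight every output pixel equally:
pred.backward(gradient=torch.ones(pred.shape))

print(x.grad.shape)  # per-pixel sensitivity of the input
```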
Is it possible to use LaTeX math markdown or similar? This plain-text math is awful to read!
I can’t follow your derivation: `loss = sum(y * target)`, `dloss/dy = target`. I think it should rather be `dloss/dy = sum(target)`, and therefore `dloss/dy * dy/dw = sum(target) * dy/dw`.
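To see which of the two interpretations autograd actually uses here, one can run a quick check on toy tensors (the names are hypothetical):

```python
import torch

y = torch.rand(5, requires_grad=True)
target = torch.rand(5)

loss = (y * target).sum()  # loss = sum(y * target)
loss.backward()

print(y.grad)   # what autograd reports as dloss/dy
print(target)   # compare against target itself
```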
Anyway, I think your introduction of an equivalent loss function really confuses me.