Size of gradient of output image with respect to input image?


I’m developing a model that takes a 3-channel input image and outputs a 3-channel image of the same size (256 x 256). I’m trying to get the gradient of the output image with respect to the input image. My code looks like this:

img_input = torch.autograd.Variable(img_input_tensor, requires_grad=True)
img_output = model.forward(img_input)
img_output.backward(gradient=torch.ones(img_input.size()), retain_variables=True)
grad_img_input = img_input.grad

The output gradient shape is 3 x 256 x 256.

I’m having trouble understanding this: shouldn’t the gradient of a (3x256x256)-element vector with respect to (3x256x256) variables have shape (3x256x256, 3x256x256)? If not, which gradient is being returned in img_input.grad?

img_output = model.forward(img_input)

This is not the right way to call the model (calling forward directly bypasses any registered hooks). It has to be

img_output = model(img_input)

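By passing a tensor of ones as the gradient argument, you are effectively backpropagating from a scalar (the sum of all output elements). img_input.grad is therefore the gradient of that scalar with respect to the input, which is why it has the input's shape.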
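In symbols (a sketch, with the output y = f(x) and the input x both flattened to vectors, and v standing for the tensor passed as gradient), backward computes a vector-Jacobian product rather than the Jacobian J itself:

```latex
\[
J_{ij} = \frac{\partial y_i}{\partial x_j}, \qquad
\texttt{x.grad}_j = \sum_i v_i \, J_{ij} = (v^\top J)_j .
\]
\[
v = \mathbf{1} \;\Rightarrow\;
\texttt{x.grad}_j = \sum_i \frac{\partial y_i}{\partial x_j}
                  = \frac{\partial}{\partial x_j} \sum_i y_i .
\]
```

So the result always has one entry per input element, whatever the size of the output.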


To expand a bit on smth’s explanation:

This is (mathematically) equivalent to

img_output_sum = img_output.sum()
img_output_sum.backward()

except that you supplied the gradient of the sum with respect to the output (which is all ones) manually.
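The equivalence can be checked numerically. This is only a sketch: the single conv layer is a hypothetical stand-in for the real image-to-image model, and it uses plain tensors with requires_grad=True as in current PyTorch instead of Variable.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real image-to-image model.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)
x = torch.rand(1, 3, 8, 8, requires_grad=True)

# Variant 1: backward with an explicit all-ones gradient.
out = model(x)
out.backward(gradient=torch.ones_like(out))
grad_ones = x.grad.clone()

# Variant 2: reduce to a scalar first, then call backward without arguments.
x.grad = None
model(x).sum().backward()
grad_sum = x.grad.clone()

print(grad_ones.shape)                      # same shape as x
print(torch.allclose(grad_ones, grad_sum))  # the two variants should agree
```

Both variants give a gradient with the input's shape, not a Jacobian.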

If you wanted the complete derivative (the full Jacobian), you would need many backward calls, one per output element (with retain_variables/retain_graph so that the graph is kept between calls).

Best regards


Thanks Soumith and Tom for replying!

@Tom: what do you mean by many backwards? How would this look for a simple example? Say for the case when img_input and img_output are both 1x2x2 images, how do I get the gradient of each element (or a subset of elements) in img_output with respect to each element of img_input?

Hello @adrianalbert,

Well, in theory you could do

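import torch
import torch.nn as nn
from torch.autograd import Variable

model = nn.Conv2d(1, 1, 3, padding=1, bias=False)  # you'll have a better model
img_input = Variable(torch.ones(1, 1, 2, 2), requires_grad=True)
img_output = model(img_input)

for x in range(2):
    for y in range(2):
        # one-hot "gradient" that selects only output pixel (x, y);
        # it is recreated each time so earlier pixels do not stay set
        onepix = torch.zeros(1, 1, 2, 2)
        onepix[0, 0, x, y] = 1
        img_input.grad = None  # clear the gradient from the previous pixel
        img_output.backward(gradient=onepix, retain_graph=True)
        print(x, y, img_input.grad)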

(In PyTorch versions <= 0.1.12, retain_graph was called retain_variables.)

Of course, that is fine for checking things out on 1x2x2 images, but it needs one backward pass per output element, so it is not something you would want to use on a larger scale.
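For small cases like this, the per-pixel loop can also be cross-checked against torch.autograd.functional.jacobian. This is a sketch that assumes a PyTorch version recent enough to ship that function (it did not exist when this thread was written); the conv layer is again just the toy model from the loop above.

```python
import torch
import torch.nn as nn
from torch.autograd.functional import jacobian

model = nn.Conv2d(1, 1, 3, padding=1, bias=False)  # toy model, as above
img_input = torch.ones(1, 1, 2, 2)

# Full Jacobian: output dimensions first, then input dimensions.
jac = jacobian(model, img_input)
print(jac.shape)  # torch.Size([1, 1, 2, 2, 1, 1, 2, 2])

# Cross-check one slice against a manual one-hot backward pass.
x = img_input.clone().requires_grad_(True)
out = model(x)
onepix = torch.zeros_like(out)
onepix[0, 0, 1, 0] = 1
(row,) = torch.autograd.grad(out, x, grad_outputs=onepix)
print(torch.allclose(row, jac[0, 0, 1, 0]))
```

Internally this still costs roughly one backward pass per output element, so the scaling caveat applies here too.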

Best regards


Thanks @tom! This is very useful. I only need local gradients, so I ended up doing an average over a local patch of interest.
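One way such a local average might look is sketched below. The patch location, the image size, and the single conv layer are all hypothetical choices for illustration, not the code actually used.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real image-to-image model.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)
x = torch.rand(1, 3, 16, 16, requires_grad=True)
out = model(x)

# Average the output over a patch of interest (rows 4..7, cols 4..7, all channels).
patch_mean = out[0, :, 4:8, 4:8].mean()
patch_mean.backward()  # the output is a scalar, so no gradient argument is needed

print(x.grad.shape)  # same shape as x

# With a single 3x3 conv, only inputs within one pixel of the patch can affect it,
# so rows 0..2 of the input receive exactly zero gradient.
far_rows = x.grad[0, :, :3, :].abs().sum().item()
near_patch = x.grad[0, :, 4:8, 4:8].abs().sum().item()
print(far_rows, near_patch)
```

Because the patch is reduced to a scalar before calling backward, a single pass is enough.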

image_variable = Variable(transformedimageinput, requires_grad=True)
prediction = inceptionfeaturesmodel(image_variable)
prediction.backward(gradient=torch.ones(prediction.size()), retain_variables=True)

image_variable has a size [1, 3, 299, 299]
prediction has size [1, 1000]

After .backward(), shouldn’t image_variable.grad have size [1, 3, 299, 299, 1000], since it is essentially a Jacobian?
But it has the same dimensions as the input, i.e. [1, 3, 299, 299].
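The same reasoning as earlier in the thread applies here: with an all-ones gradient, backward returns the gradient of prediction.sum(), which has the input's shape. A Jacobian-shaped result needs one backward pass per class. A minimal sketch, assuming a tiny hypothetical classifier (5 classes, 8x8 images) in place of the Inception model:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical small classifier standing in for the Inception model.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 5))
image = torch.rand(1, 3, 8, 8, requires_grad=True)
prediction = model(image)  # shape [1, 5]

# All-ones gradient: gradient of prediction.sum(), shaped like the input.
(grad_sum,) = torch.autograd.grad(
    prediction, image,
    grad_outputs=torch.ones_like(prediction),
    retain_graph=True,
)

# One backward pass per class score gives the Jacobian one slice at a time.
rows = []
for k in range(prediction.size(1)):
    (g,) = torch.autograd.grad(prediction[0, k], image, retain_graph=True)
    rows.append(g)
jac = torch.stack(rows, dim=-1)  # shape [1, 3, 8, 8, 5]

print(grad_sum.shape, jac.shape)
# Summing the Jacobian over the class axis recovers the all-ones result.
print(torch.allclose(jac.sum(dim=-1), grad_sum, atol=1e-6))
```

For the sizes in the question this would mean 1000 backward passes, so it is usually more practical to pick the few class scores of interest.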