To differentiate a sum of two terms, you differentiate each term separately, and the derivative of a constant is zero. Hence adding a constant value to your loss has no effect whatsoever on the gradients produced by backpropagation.

avg_pixel_value is calculated from the input image without using any of the model parameters or outputs, so for backpropagation with respect to the model parameters, it is ignored.
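
You can check this with a tiny sketch (the single-weight model and the numbers below are made up for illustration; a scalar x plays the role of your input image):

```python
import torch

# Toy setup: one learnable weight, one fixed input.
w = torch.tensor(2.0, requires_grad=True)
x = torch.tensor(3.0)  # stands in for the input image; requires_grad=False

# Gradient of the plain loss.
loss = (w * x - 1.0) ** 2
loss.backward()
grad_plain = w.grad.item()

# Gradient of the same loss plus a constant computed from the input alone.
w.grad = None
avg_pixel_value = x.mean().item()  # a plain Python float, outside the graph
((w * x - 1.0) ** 2 + avg_pixel_value).backward()
grad_shifted = w.grad.item()

# d/dw [loss + c] = d/dw [loss], because dc/dw = 0.
print(grad_plain, grad_shifted)  # both are 30.0
```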

Well, when you do G_loss.backward(), PyTorch first looks at what operation produced G_loss. In this case

G_loss = G_loss + mean_output_pixel_value

so PyTorch retrieves the values that were summed and basically runs backward through both of them.
But mean_output_pixel_value is either a plain float, which carries no autograd history, or a Variable calculated from input_image, which is a Variable with requires_grad=False. Either way, backpropagating into mean_output_pixel_value does nothing.
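
A quick way to see the second case (a random tensor stands in for your image; note that in current PyTorch, Variable and Tensor are merged):

```python
import torch

input_image = torch.rand(1, 3, 8, 8)  # requires_grad is False by default
mean_output_pixel_value = input_image.mean()

# The mean is a tensor, but it carries no autograd history at all:
print(mean_output_pixel_value.requires_grad)  # False
print(mean_output_pixel_value.grad_fn)        # None
```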

I shall try another explanation… The reason for running G_loss.backward() is to calculate a measure of how each parameter affected the value of G_loss; this measure is stored in param.grad for each parameter.

Now obviously, the model parameters have no effect whatsoever on mean_output_pixel_value, so adding it to G_loss will not change the calculated param.grad values in any way at all.

When you write G_loss = G_loss + mean_output_pixel_value, Python first calculates G_loss + mean_output_pixel_value and stores the result in a new tensor, then Python updates the name G_loss to point to this result.

That line does not update the data stored in G_loss. It basically acts like this: new_G_loss = old_G_loss + mean_output_pixel_value.
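
This rebinding is plain Python behaviour, nothing PyTorch-specific. Here is the same pattern with lists instead of tensors (values made up for illustration):

```python
# old_G_loss stands in for the tensor G_loss pointed at before the addition.
old_G_loss = [25.0]
G_loss = old_G_loss

G_loss = G_loss + [0.5]  # builds a NEW list, then rebinds the name G_loss

print(G_loss)                # [25.0, 0.5]
print(old_G_loss)            # [25.0] -- the original object is untouched
print(G_loss is old_G_loss)  # False
```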

If you wanted to do an inplace addition, you could do

G_loss += mean_output_pixel_value

but that would either be non-differentiable (some in-place operations break the autograd graph), or PyTorch would treat it exactly as new_G_loss = old_G_loss + mean_output_pixel_value anyway.
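
For this particular operation it is the second case: an in-place add on a non-leaf tensor is recorded just like the out-of-place version, so the gradients come out identical. A toy sketch (the model and numbers are made up for illustration):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
G_loss = (w * 3.0 - 1.0) ** 2  # a non-leaf tensor in the graph

G_loss += 0.5  # in-place add on the non-leaf result; recorded by autograd

G_loss.backward()
print(w.grad.item())  # 30.0 -- same gradient as with G_loss = G_loss + 0.5
```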