Does adding a float variable as an additional loss term work?

Suppose I have a loss term as shown below:

G_sample = G(z)
D_fake = D(G_sample)

# Generator loss
G_loss = 0.5 * torch.mean((D_fake - 1)**2)

G_loss.backward()
G_solver.step()
reset_grad()

If I add an ordinary float variable to G_loss before calling .backward(), for example,
G_loss = G_loss + avg_pixel_value

Does avg_pixel_value have any effect during backprop?
Or is it completely ignored?

Since avg_pixel_value itself doesn't support the .backward() API, I doubt it has any effect.

My question is: can I add ordinary float terms to a PyTorch loss?

Many thanks.

To differentiate a sum of two terms, you differentiate each term separately. Hence adding a constant value to your loss has no effect whatsoever on the gradients produced by backpropagation.
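As a quick, self-contained illustration (a toy scalar example, not the GAN code above), the gradient is the same with or without the added float:

import torch

w = torch.tensor(3.0, requires_grad=True)
loss = (w - 1) ** 2

loss.backward(retain_graph=True)
print(w.grad)            # tensor(4.) -- d/dw (w - 1)^2 = 2 * (w - 1) = 4

w.grad = None
(loss + 5.0).backward()  # add an ordinary float before backward
print(w.grad)            # tensor(4.) -- the constant drops out when differentiating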

Thanks for the reply.

Even though the additional term I mentioned doesn't support .backward(), it's not a constant but a variable (a Python float).

In my sample code,
I think avg_pixel_value might affect the value of G_loss.
(avg_pixel_value changes every iteration)

G_loss = G_loss + avg_pixel_value

Is there any fault in my understanding?

avg_pixel_value is calculated from the input image without using any of the model parameters or outputs, so for backpropagation with respect to the model parameters, it is ignored.
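For example, a quick check (with a made-up input batch) shows that a value computed only from the input carries no autograd graph, so it behaves exactly like a constant in the sum:

import torch

# Hypothetical input batch; tensors coming from a data loader have requires_grad=False by default.
input_image = torch.rand(2, 3, 64, 64)

avg_pixel_value = input_image.mean()
print(avg_pixel_value.requires_grad)  # False
print(avg_pixel_value.grad_fn)        # None -- no graph, so autograd treats it as a constant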

Might I suggest a lecture on backpropagation by Andrej Karpathy.

Thanks for the answer.

What if I use the model's output as below:
G_loss = G_loss + mean_output_pixel_value

But mean_output_pixel_value is merely a float variable that doesn’t have graph information.

Can mean_output_pixel_value affect backpropagation?

Well, when you do G_loss.backward(), PyTorch first looks at what operation produced G_loss, in this case

G_loss = G_loss + mean_output_pixel_value

so PyTorch retrieves the values that were summed and essentially runs backward on both of them.
But mean_output_pixel_value is either a plain float, which has no .backward() method, or a Variable calculated from input_image, which is a Variable with requires_grad=False. Either way, mean_output_pixel_value.backward() does nothing.

I shall try another explanation… The reason for running G_loss.backward() is to calculate a measure of how each parameter affected the value of G_loss; this measure is stored in param.grad for each parameter.

Now, as far as autograd can tell, the model parameters have no effect whatsoever on mean_output_pixel_value (a plain float carries no graph), so adding it to G_loss will not change the calculated param.grad values in any way at all.
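Here is a sketch contrasting the two cases (the tiny single-layer generator and the variable names are assumed for illustration): adding the mean output as a detached Python float leaves param.grad unchanged, while adding it as a tensor that is still attached to the graph does change it.

import torch

G = torch.nn.Linear(4, 4)                 # stand-in generator (hypothetical)
z = torch.randn(2, 4)
out = G(z)
G_loss = 0.5 * torch.mean((out - 1) ** 2)

# Baseline: no extra term at all
G.zero_grad()
G_loss.backward(retain_graph=True)
grad_base = G.weight.grad.clone()

# Case 1: mean output as a plain Python float -- the graph is cut, gradients are unchanged
mean_output_pixel_value = out.mean().item()
G.zero_grad()
(G_loss + mean_output_pixel_value).backward(retain_graph=True)
print(torch.allclose(G.weight.grad, grad_base))   # True

# Case 2: mean output kept as a tensor -- still attached to the graph, gradients do change
mean_output_tensor = out.mean()
G.zero_grad()
(G_loss + mean_output_tensor).backward()
print(torch.allclose(G.weight.grad, grad_base))   # False (in general)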

I’m still confused.

After running
G_loss = G_loss + mean_output_pixel_value,

G_loss will be updated.

Backpropagation will be computed by multiplying the gradient of G_loss with the input values of each layer.

That's why I thought that the updated G_loss would have some effect during backprop.

When you write G_loss = G_loss + mean_output_pixel_value, Python first calculates G_loss + mean_output_pixel_value and stores the result in a new tensor; then Python rebinds the name G_loss to point at this result.

That line does not update the data stored in G_loss. It basically acts like this: new_G_loss = old_G_loss + mean_output_pixel_value.
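A tiny, self-contained demonstration of the rebinding (the loss tensor here is made up for illustration):

import torch

old_loss = torch.tensor(2.0, requires_grad=True) * 3.0  # some non-leaf loss tensor
G_loss = old_loss

G_loss = G_loss + 0.5        # out-of-place add: builds a brand-new tensor

print(G_loss is old_loss)    # False -- the name G_loss was rebound to a new object
print(old_loss)              # tensor(6., grad_fn=<MulBackward0>) -- the original data is untouched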

If you wanted to do an in-place addition, you could do

G_loss += mean_output_pixel_value

but that would either be non-differentiable, or PyTorch would assume you meant new_G_loss = old_G_loss + mean_output_pixel_value.

Thanks for the great answer.
