Custom loss is not updating the weights

I have a list of the history of states. From it I compute discounted_rewards with NumPy, and then I multiply the model's output by discounted_rewards using torch.mm to get total_loss.
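
Roughly, the discounted-rewards part looks like this (a simplified sketch; gamma and the rewards list are stand-ins for my actual variables):

    import numpy as np

    def discount(rewards, gamma=0.99):
        # Walk backwards so each entry accumulates its discounted future return.
        discounted = np.zeros(len(rewards), dtype=np.float64)
        running = 0.0
        for t in reversed(range(len(rewards))):
            running = rewards[t] + gamma * running
            discounted[t] = running
        return discounted
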
Then I do:

    print(self.global_model.state_dict())
    print("total_loss", total_loss)
    total_loss.backward()
    self.opt.step()
    print(self.global_model.state_dict())

Its output is:

    ('dense1.weight', tensor([[ 0.3997, -0.1907,  0.1120,  0.3016],
            [ 0.1156,  0.0646,  0.1802,  0.3558],
            [ 0.0321,  0.2537,  0.0879,  0.2441],
            [-0.2952, -0.0886, -0.3235,  0.3006]])), ('dense1.bias', tensor([ 0.1927,  0.3048, -0.3551, -0.0302])), ('dense2.weig

    total_loss.backward() tensor(2.5806, dtype=torch.float64, grad_fn=)

    ('dense1.weight', tensor([[ 0.3997, -0.1907,  0.1120,  0.3016],
            [ 0.1156,  0.0646,  0.1802,  0.3558],
            [ 0.0321,  0.2537,  0.0879,  0.2441],
            [-0.2952, -0.0886, -0.3235,  0.3006]])), ('dense1.bias', tensor([ 0.192

and

    self.opt = torch.optim.SGD(self.global_model.parameters(), lr=0.01)

So it is not updating the weights. What am I missing?

The model is:

        self.dense1 = torch.nn.Linear(4, 4)  # four hidden layers
        self.dense2 = torch.nn.Linear(4, 4)
        self.dense3 = torch.nn.Linear(4, 4)
        self.dense4 = torch.nn.Linear(4, 4)
        self.probs = torch.nn.Linear(4, 2)   # policy head (two actions)
        self.values = torch.nn.Linear(4, 1)  # value head
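
For reference, a forward pass over these layers would look something like this (a sketch only; the relu/softmax activations are placeholders, not necessarily what I use):

        def forward(self, x):
            # Four hidden layers feeding two output heads.
            x = torch.relu(self.dense1(x))
            x = torch.relu(self.dense2(x))
            x = torch.relu(self.dense3(x))
            x = torch.relu(self.dense4(x))
            probs = torch.softmax(self.probs(x), dim=-1)  # action distribution
            value = self.values(x)                        # state-value estimate
            return probs, value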

Can you share the loss function code and the part of the program where you have called the criterion?

Hey, I have noticed that if I do

    print(self.local_model.state_dict())
    print("total_loss.backward()", total_loss)
    total_loss.backward()
    opt_2 = torch.optim.SGD(self.local_model.parameters(), lr=0.01)
    opt_2.step()
    self.opt.step()
    print(self.local_model.state_dict())

then it shows the weights changing.
But I want the loss to update the weights of global_model: I am using local_model to predict the result, and then I want to apply the gradients to global_model (A3C stuff). Do you know how to do that?

The predictions and weights will stay the same if you don't backpropagate through local_model. Because local_model's weights are never updated, it gives the same prediction every time, and the labels are constant too, so the loss comes out constant every time. As a result, when you backpropagate through global_model, the gradients will always be the same.
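
The usual A3C fix is: backpropagate the loss through local_model (which produced the predictions), copy the resulting gradients onto the matching parameters of global_model, step the global optimizer, and then sync local_model back from the updated global weights. A minimal sketch of that pattern, ignoring the locking details of full multi-process A3C (the helper name is illustrative):

    import torch

    def push_grads_to_global(local_model, global_model, global_opt):
        # total_loss.backward() put the gradients on local_model's
        # parameters; copy them onto the matching global parameters.
        global_opt.zero_grad()
        for local_p, global_p in zip(local_model.parameters(),
                                     global_model.parameters()):
            if local_p.grad is not None:
                global_p.grad = local_p.grad.clone()
        global_opt.step()
        # Pull the freshly updated weights back into the local model.
        local_model.load_state_dict(global_model.state_dict())

In your loop that would be:

    total_loss.backward()  # gradients end up on local_model, not global_model
    push_grads_to_global(self.local_model, self.global_model, self.opt)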