Why don't model parameter learn?

Hi, everyone :slightly_smiling_face:
I am studying Inverse Rational Control.
I want to make an experimentalist that infers the real parameter of the agent.

Don’t update the model parameter.
Below my code

coef is a model parameter I want to find.
coef is a tensor size 2

after backward(), coef.grad is None and loss.grad is always [1.]

Does someone help me?


You are detaching the loss tensor by wrapping it into the deprecated Variable, so remove this line of code.

PS: you can post code snippets by wrapping them into three backticks ```, which would make debugging easier :wink:

Thanks reply @ptrblck .
Of course, I have tried it, but coefficients(parameter) do not change

def exp_inverse(true_coef, arg, env, agent, x_traj, a_traj, true_loss, filename, n, part_coef=False):
    tic = time.time()

    # Parameterization of coefficient to apply optimizer
    coef = nn.Parameter(reset_coef(arg.desvel_range, arg.uns4IDM_range))
    ini_coef = coef.data.clone()

    loss_log = deque(maxlen=arg.NUM_IT)
    loss_log_recent = deque(maxlen=100)
    coef_log = deque(maxlen=arg.NUM_IT)
    coef = Variable(coef, requires_grad=True)

    # Set-up Optimizer & Scheduler
    opt = torch.optim.Adam([coef], lr=arg.ADAM_LR, eps=arg.ADAM_eps, amsgrad=False)
    Sched = torch.optim.lr_scheduler.StepLR(opt, step_size=10, gamma=0.99)

    print('Finding Coefficient')

    for it in tqdm(range(arg.NUM_IT)):
        # Calculate loss function
        loss, _ = getLoss(agent, x_traj, a_traj, coef, env, arg.PI_STD, arg.NUM_SAM)
        # loss = Variable(loss, requires_grad=True)
        loss.requires_grad = True
        # Initialize the gradients calculated previous step; Clears old gradients
        # print(loss.grad)

        # Collect the data for plotting tendency

        # Backpropagation
        # opt.param_groups[0]["params"][0].grad
        # Optimizer and Scheduler
        opt.step() # change coefficients

what is the problem? could you help me again?

if totally delete the line ‘loss.requires_grad =True’, i got the error message like below

Traceback (most recent call last):
  File "/home/bmil/flow-autonomous-driving/SPRING2021/Code/Inverse_LC/Inverse/Trajectory_collect.py", line 324, in <module>
  File "/home/bmil/flow-autonomous-driving/SPRING2021/Code/Inverse_LC/Inverse/Trajectory_collect.py", line 240, in Trajectory_collect
    exp_inverse(true_coef, arg, env, agent, x_traj, a_traj, true_loss, f_name, 2)
  File "/home/bmil/flow-autonomous-driving/SPRING2021/Code/Inverse_LC/Inverse/Inverse_exp.py", line 50, in exp_inverse
  File "/home/bmil/anaconda3/envs/flow/lib/python3.7/site-packages/torch-1.4.0-py3.7-linux-x86_64.egg/torch/tensor.py", line 195, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/bmil/anaconda3/envs/flow/lib/python3.7/site-packages/torch-1.4.0-py3.7-linux-x86_64.egg/torch/autograd/__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

It seems that loss is already detached when it’s returned from the model, which might happen, if you rewrap it into a tensor, use a non-differentiable operation, or use a 3rd party library (e.g. numpy) without writing a custom autograd.Function.
To debug it further, you can check the .grad_fn of the intermediate tensors and locate which tensor returns a None value.

Thanks for your reply. @ptrblck
I reckon that the problem is line ‘state = np.concatenate([only_state, desvel.detach().numpy(), uns4IDM.detach().numpy()])’.
but i can’t use just .numpy() or torch.tensor() because this framework accepts only numpy type array.
What should I do in this case?

def getLoss(agent, x_traj, a_traj, c, env, PI_STD, NUM_SAM):
    # getLoss
    logPr = torch.zeros(1)
    logPr_act = torch.zeros(1)
    logPr_lc = torch.zeros(1)
    logPr_acc = torch.zeros(1)

    desvel, uns4IDM = torch.split(c.view(-1),1)

    env.desvel = desvel
    env.uns4IDM = uns4IDM

    for num_it in range(NUM_SAM):
        logPr_ep = torch.zeros(1)
        logPr_act_ep = torch.zeros(1)
        logPr_lc_ep = torch.zeros(1)
        logPr_acc_ep = torch.zeros(1)

        t = torch.zeros(1)
        state = env.reset()
        only_state = np.delete(state, (14, 15))
        state = np.concatenate([only_state, desvel.detach().numpy(), uns4IDM.detach().numpy()])

        print('\ncalculating loss')
        for it, next_x in tqdm(enumerate(x_traj[1:])):
            action = agent.compute_action(state)
            action_t = torch.Tensor(action)

            lc_loss = ((action_t[1] - a_traj[it][1]) ** 2) / (2 * (PI_STD ** 2)) + ((action_t[2] - a_traj[it][2]) ** 2) / (2 * (PI_STD ** 2)) + ((action_t[3] - a_traj[it][3]) ** 2) / (2 * (PI_STD ** 2))# policy

            logPr_acc_ep = acc_loss.sum() + logPr_acc_ep
            logPr_lc_ep = lc_loss.sum() + logPr_lc_ep
            logPr_act_ep = acc_loss.sum() + lc_loss.sum() + logPr_act_ep
            logPr_ep = logPr_ep + acc_loss.sum() + lc_loss.sum()

            next_state, _, _, _ = env.step(action)

            t += 1
            on_state = np.delete(next_state, (14, 15))
            state = np.concatenate([on_state, desvel.detach().numpy(), uns4IDM.detach().numpy()])

        logPr_acc += logPr_acc_ep
        logPr_lc += logPr_lc_ep
        logPr_act += logPr_act_ep
        logPr += logPr_ep
        print("acc:{}, lc:{}, logPr:{}, logPr_act:{}".format(logPr_acc, logPr_lc, logPr_ep, logPr_act))
    return logPr/NUM_SAM, logPr_act/NUM_SAM

If you need to use numpy operations, you would have to implement the backward operation manually using a custom autograd.Function as described here.

Thank you for your time! @ptrblck