Parameters do not seem to be updated in REINFORCE algorithm

Is it possible for the following updates to happen?

for i in range(n):
    y1 = F.sigmoid(x)
    y_list.append(y1.data[0][0])
prob = F.softmax(y_list)
choices = prob.multinomial()

and then:

autograd.backward(choices, [None for _ in choices])

I want to backpropagate like that, but I get the following error:

RuntimeError: there are no graph nodes that require computing gradients

I think y_list.append is the reason for that error (the graph is broken?). Is that right? If not, why?

Thanks a lot

Update: I misunderstood .data; I've solved it now.

sigmoid has no learnable parameters, and neither does softmax. Assuming your x has requires_grad set to False (which is the default, unless you set it to True), there are no learnable parameters in your network at all.

You probably want some nn.Linear or similar in your network.

By the way, calling .data strips the gradient history from a Variable. I doubt you want to do that :slight_smile:
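
For example, here is a minimal sketch (my own illustration, not your code: it uses the old Variable API from your snippet and a made-up input size of 4) of a module whose only learnable parameters live in an nn.Linear. As long as nothing is detached with .data, backward() reaches fc1's weights:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

class Policy(nn.Module):
    def __init__(self):
        super(Policy, self).__init__()
        self.fc1 = nn.Linear(4, 1)   # the learnable parameters live here

    def forward(self, x):
        # keep everything as Variables (no .data), so the graph reaches fc1
        return F.sigmoid(self.fc1(x))

policy = Policy()
x = Variable(torch.randn(1, 4))      # requires_grad=False is fine for the input
y = policy(x)
y.backward()
print(policy.fc1.weight.grad)        # populated gradient, ready for an optimizer step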

Thanks for the reply.
Actually, the forward function in my module is something like:

for i in range(n):
    x1 = Variable(z.view(1, -1), requires_grad=True)
    fc_input = torch.cat((z, state), 1)
    fc_out = self.fc1(fc_input)
    y1 = F.sigmoid(fc_out)
    y_list.append(y1.data[0][0])
    prob = F.softmax(y_list)
    choices = prob.multinomial()
return choices

I want to normalize a series of sigmoid outputs using softmax, use the normalized activations to somehow calculate a reward, and then use that reward to update the fc linear layer via the REINFORCE algorithm.

I’ve tried:
choices = Variable(choices.data, requires_grad=True)
I no longer get that RuntimeError, but the parameters in the module remain unchanged.
Given that .data strips gradients, how can I apply softmax to a list made up of sigmoid outputs (which are Variables)?
Thanks

I'm not quite sure why you want to use .data and strip the gradients in the first place. If you strip the gradients, of course backprop won't do anything.
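
In case it helps, here is a rough sketch of what I think you are after (again just an illustration: old-style Variable/stochastic-function API, made-up sizes, a placeholder reward, and the names z, state, n borrowed from your snippet). The key point is to append the sigmoid Variables themselves, concatenate them with torch.cat, and only then apply softmax and multinomial:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch import autograd
from torch.autograd import Variable

class Policy(nn.Module):
    def __init__(self, in_features):
        super(Policy, self).__init__()
        self.fc1 = nn.Linear(in_features, 1)

    def forward(self, z, state, n):
        y_list = []
        for i in range(n):
            fc_input = torch.cat((z, state), 1)
            fc_out = self.fc1(fc_input)
            y_list.append(F.sigmoid(fc_out))     # append the Variable itself, not .data
        probs = F.softmax(torch.cat(y_list, 1))  # normalize the n sigmoid outputs
        return probs.multinomial()               # stochastic node (old REINFORCE API)

policy = Policy(8)
optimizer = torch.optim.SGD(policy.parameters(), lr=0.01)

z = Variable(torch.randn(1, 4))
state = Variable(torch.randn(1, 4))
choices = policy(z, state, n=5)

reward = 1.0                                     # placeholder: compute your real reward here
choices.reinforce(reward)
optimizer.zero_grad()
autograd.backward([choices], [None])
optimizer.step()                                 # fc1 now receives gradients and gets updated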

I misused that; now it works. Thank you!
