That actually works beautifully - I was scared that PyTorch didn’t support that kind of thing. I’m really impressed that you can pluck individual parts of your input vector and put them in your cost function and it still works. Man I love PyTorch.
I feel like I’m really close; I’m just getting another pesky error:
import torch
from torch import autograd

def helper(x):
    return 2 * x[0] + 3 * x[1]**2

x = autograd.Variable(torch.FloatTensor([1, 2]), requires_grad=True)
total = autograd.Variable(torch.FloatTensor([0]))
for i in range(3):  # number of epochs
    print("iteration: " + str(i))
    total += helper(x)
    total.backward()
    x = x - 0.1 * x.grad  # a gradient step with a learning rate of 0.1
    print(x.grad)  # when that math in the line above is done, the gradient vanishes…
    # shouldn’t I have to clear it manually?
print(x.grad[0], x.grad[1])  # never gets here
---------------------- Output --------------------
iteration: 0
None
iteration: 1
Unfortunately, on the second iteration of total.backward(), I get:
RuntimeError: element 0 of variables tuple is volatile
Again, thank you so much for your help. It’s been invaluable.
EDIT 1:
Another thing I tried is:
def helper(x):
    return 2 * x[0] + 3 * x[1]**2

x = autograd.Variable(torch.FloatTensor([1, 2]), requires_grad=True)
total = autograd.Variable(torch.FloatTensor([0]))
for i in range(3):  # number of epochs
    print("iteration: " + str(i))
    total += helper(x)
    total.backward()
    x = x - 0.1 * x.grad  # a gradient step with a learning rate of 0.1
    x.grad.data.zero_()  # ADDITION: zeroing out the gradient
print(x.grad[0], x.grad[1])  # never gets here
Per this SO thread:
But it doesn’t seem like I need to, because this is the error I get when it hits x.grad.data.zero_():
AttributeError: 'NoneType' object has no attribute 'data'
It’s already been cleared. Will continue to research.
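One possible reading of the None above is that nothing was cleared at all: the line x = x - 0.1 * x.grad builds a new object and rebinds the name x to it, and that new object has never been through backward(), so it has no gradient yet. The sketch below illustrates only the name-rebinding part using a toy class. ToyVariable is hypothetical and is not PyTorch's API; it just mimics the idea that arithmetic returns a fresh object.

```python
class ToyVariable:
    """Hypothetical stand-in for a Variable; NOT part of PyTorch."""

    def __init__(self, value, grad=None):
        self.value = value
        self.grad = grad

    def __sub__(self, other):
        # Arithmetic returns a brand-new object with no gradient attached.
        return ToyVariable(self.value - other)


x = ToyVariable(1.0, grad=2.0)  # pretend backward() already filled in grad
original = x                    # keep a second name for the first object

x = x - 0.1 * x.grad            # rebinding: x now names a NEW object

print(x.grad)         # gradient of the new object
print(original.grad)  # gradient still stored on the first object
```

Under this reading, the gradient is still sitting on the first object; the name x simply no longer points at it. Updating the stored values in place instead of rebinding x avoids that.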
EDIT 2:
I think I’m getting closer. I didn’t look as closely at the SO post as I should have; apparently I needed to put .data on everything (for reasons I’m not sure of):
def helper(x):
    return 2 * x[0] + 3 * x[1]**2

x = autograd.Variable(torch.FloatTensor([1, 2]), requires_grad=True)
total = autograd.Variable(torch.FloatTensor([0]))
for i in range(3):  # number of epochs
    print("iteration: " + str(i))
    total += helper(x)
    total.backward()
    x.data = x.data - 0.1 * x.grad.data  # ADDITION: added .data to everything per the SO post
    x.grad.data.zero_()
print(x.grad[0], x.grad[1])  # never gets here
Now I get the error about retaining the graph:
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
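Which makes sense: total lives outside the loop and keeps accumulating, so the second backward() has to go back through the first iteration’s graph, whose buffers have already been freed. I’m going to try moving total into the loop.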
EDIT 3:
Here is the code and the results of moving it into the for loop:
def helper(x):
    return 2 * x[0] + 3 * x[1]**2

x = autograd.Variable(torch.FloatTensor([1, 2]), requires_grad=True)
for i in range(3):  # number of epochs
    print("iteration: " + str(i))
    total = autograd.Variable(torch.FloatTensor([0]))
    total += helper(x)
    total.backward()
    x.data = x.data - 0.1 * x.grad.data
    x.grad.data.zero_()
print(x.grad[0], x.grad[1])  # never gets here
---------------------- Output --------------------
iteration: 0
iteration: 1
iteration: 2
Variable containing:
0
[torch.FloatTensor of size 1]
Variable containing:
0
[torch.FloatTensor of size 1]
Which is unexpected. I was expecting those both to be non-zero.
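As a sanity check on what the gradients should look like, the same three steps can be worked out by hand. The sketch below is plain Python with no autograd; helper_grad is a hand-derived helper that does not appear in the original code, and history is just an illustrative name. It records each gradient before anything resets it.

```python
def helper(x):
    # Same cost as in the post: f(x) = 2*x[0] + 3*x[1]**2
    return 2 * x[0] + 3 * x[1] ** 2


def helper_grad(x):
    # Hand-derived gradient (an assumption of this sketch, not autograd):
    # df/dx0 = 2 and df/dx1 = 6 * x[1]
    return [2.0, 6.0 * x[1]]


x = [1.0, 2.0]
learning_rate = 0.1
history = []  # gradient seen at each iteration, captured before any reset

for i in range(3):  # number of epochs
    grad = helper_grad(x)
    history.append(grad)
    # Gradient step, applied element by element.
    x = [xi - learning_rate * gi for xi, gi in zip(x, grad)]

for i, grad in enumerate(history):
    print("iteration:", i, "gradient:", grad)
print("final x:", x, "final cost:", helper(x))
```

The gradients come out to roughly (2, 12), (2, 4.8) and (2, 1.92), so they are non-zero at every step. That suggests the zeros printed above come from x.grad.data.zero_() running at the end of the last iteration, just before the final print, rather than from the gradient itself being zero. Printing x.grad inside the loop, before the zero_() call, should show the real values.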
I see your post; I’ll test it out and reply with my results. Thanks again!