Hi,
I’m currently trying to implement a simple SGD update without using a built-in optimizer.
What I’m running into is that after the first mini-batch, my loss loses its `requires_grad` status and thus throws an error.
Here is my code:

``````python
import torch
from torch.autograd import Variable

def generate_data():
    data = torch.rand(1000, 2)
    label = ((data[:, 0] + 0.3 * data[:, 1]) > 0.5).to(torch.int)
    return data[:, 0], label

input, label = generate_data()

# Make minibatches.
inputs = torch.split(input, 32)
labels = torch.split(label, 32)

# Define the two variables to optimize
b1 = Variable(torch.rand(1), requires_grad=True)
b2 = Variable(torch.rand(1), requires_grad=True)

for epoch in range(15):
    for x, y in zip(inputs, labels):

        # Calculate p_x as per formula above
        p_x = 1 / (1 + torch.exp(-(b1 + b2 * x)))

        # Calculate the negative log likelihood
        l = -(y * torch.log(p_x) + (1 - y) * torch.log(1 - p_x)).sum()

        # Calculate the gradient of the loss w.r.t. the parameters
        l.backward()
        delta_b1 = b1.grad
        delta_b2 = b2.grad

        # Update the parameters b according to the SGD formula
        with torch.no_grad():
            b1 = b1 - 0.01 * delta_b1
            b2 = b2 - 0.01 * delta_b2
``````

P.S. I am aware `Variable` is deprecated, but I am working with code that was given to me.
Any help would be greatly appreciated.

You are replacing the trainable (and deprecated) `Variable`s with constant tensors inside the `no_grad` block: the reassignment `b1 = b1 - 0.01 * delta_b1` creates a new tensor that does not require gradients. You could update these variables via in-place methods instead.
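To illustrate, a minimal sketch of the in-place update (using plain tensors instead of the deprecated `Variable`; the random `x`/`y` batch and the learning rate are just placeholders for your mini-batch loop):

``````python
import torch

# Two trainable leaf tensors
b1 = torch.rand(1, requires_grad=True)
b2 = torch.rand(1, requires_grad=True)

# Dummy mini-batch standing in for one (x, y) pair from your loop
x = torch.rand(32)
y = torch.randint(0, 2, (32,))

p_x = 1 / (1 + torch.exp(-(b1 + b2 * x)))
l = -(y * torch.log(p_x) + (1 - y) * torch.log(1 - p_x)).sum()
l.backward()

with torch.no_grad():
    # In-place updates keep b1/b2 as the same leaf tensors,
    # so they still require grad in the next iteration
    b1 -= 0.01 * b1.grad
    b2 -= 0.01 * b2.grad
    # Reset the accumulated gradients before the next backward pass
    b1.grad.zero_()
    b2.grad.zero_()

print(b1.requires_grad, b2.requires_grad)  # True True
``````

Because the update happens in-place inside `torch.no_grad()`, no new tensor is created and the next loss keeps its `requires_grad` status.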


I see. Is it acceptable to modify `tensor.data` directly? Like so:

``````python
b1.data -= 0.1 * delta_b1
b2.data -= 0.1 * delta_b2
``````

Otherwise I get:

`RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.`

Since you are already using deprecated `Variable` objects, you could also use the internal `.data` attribute, but note that I would strongly recommend updating your code to use supported approaches.
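For reference, a sketch of the supported pattern next to the `.data` one (names follow the code above; the batch is a placeholder). The `RuntimeError` comes from running the in-place update outside `torch.no_grad()` on a leaf that requires grad:

``````python
import torch

b1 = torch.rand(1, requires_grad=True)

x = torch.rand(32)
y = torch.randint(0, 2, (32,)).float()

p_x = torch.sigmoid(b1 * x)
l = -(y * torch.log(p_x) + (1 - y) * torch.log(1 - p_x)).sum()
l.backward()

# Instead of: b1.data -= 0.1 * b1.grad  (bypasses autograd tracking)
with torch.no_grad():
    b1 -= 0.1 * b1.grad  # no RuntimeError under no_grad
b1.grad.zero_()
``````

Unlike `.data`, the `no_grad` version goes through the normal version-counter bookkeeping, so autograd can still detect invalid in-place modifications elsewhere.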


Ok, I really appreciate the help!