# Getting RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn even though I set 'requires_grad = True'

Hi, I started studying PyTorch recently and I’m stuck on a problem. The code below trains the nine entries of `w`, each of which enters the 2 × 2 weight matrix built in the function `forward`. I want to train them so that `y` is produced from `a`. When I run the code, I get ‘RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn’. I have tried to find the cause, but I could not find or understand it. The original code is much longer; the code below is a simplified version. Could you help me fix the problem? Any help would be highly appreciated.

```python
import torch

eta_n = torch.arange(1.0, 0, -0.1)
# eta_n = torch.cat([torch.arange(1.0, 0.2, -0.1), torch.arange(0.2, 0, -0.01)], dim=0)
delta_eta = eta_n[1:] - eta_n[0:-1]  # Differences between consecutive elements of eta_n. In this case, every element is -0.1.
eta_fin = eta_n[-1]
print('eta_n:', eta_n)
print('len(eta_n):', len(eta_n))
print('delta_eta:', delta_eta)
print('len(delta_eta) which is the no. of forward propagations:', len(delta_eta))
print('eta_fin:', eta_fin)
```

eta_n: tensor([1.0000, 0.9000, 0.8000, 0.7000, 0.6000, 0.5000, 0.4000, 0.3000, 0.2000, 0.1000])
len(eta_n): 10
delta_eta: tensor([-0.1000, -0.1000, -0.1000, -0.1000, -0.1000, -0.1000, -0.1000, -0.1000, -0.1000])
len(delta_eta) which is the no. of forward propagations: 9
eta_fin: tensor(0.1000)

```python
# Activation function in hidden layers
def acti1(coordinate, layer_no):  # layer_no indexes delta_eta
    # Rows of the 2 x N input; the row indexing matches the 2 x 4 shape of `a` printed below
    x = coordinate[0]
    y = coordinate[1]

    return torch.cat([x.reshape(1, -1), (y + delta_eta[layer_no] * x ** 3).reshape(1, -1)], dim=0)
```
```python
# Activation function on the output layer
def acti2(coordinate):
    y = coordinate[1]  # second row, so the result has one value per column
    F = y
    return (torch.tanh(100 * (F - 0.1)) - torch.tanh(100 * (F + 0.1)) + 2) / 2
    # The result is close to 0 when F is between -0.1 and 0.1, and close to 1 for any other F
    # This function is steep but continuous and differentiable
```
```python
def forward(coordinate, w, layer_no):  # layer_no from 0 to '(the length of eta_n) - 1'
    matrix = torch.Tensor([[1, delta_eta[layer_no]], [-delta_eta[layer_no], 1 - delta_eta[layer_no] * w]])

    return acti1(matrix.mm(coordinate), layer_no)
```
```python
y = torch.Tensor([0, 0, 1, 1])  # I want y_pred (defined later) to be this value
```
```python
the_no_of_iterations = 10
learning_rate = 0.01

a = torch.Tensor([[0.5528, 0.8563, 1.0779, 0.5932], [-0.1109, -0.0569, 0.0904, 0.1435]])

torch.manual_seed(1)
w = torch.randn(9, requires_grad=True)  # What I want to train to produce y from a
print('w:', w)

optimizer = torch.optim.Adam([w], lr=learning_rate)

for i in range(the_no_of_iterations):
    # Forward propagation
    for j in range(9):
        a = acti1(forward(a, w[j], j), j)
    print('a:', a)

    y_pred = acti2(a)
    print('y_pred:', y_pred)

    # Loss calculation
    loss = (y - y_pred).abs().sum()
    print('loss:', loss.item())

    loss.backward()
    optimizer.step()
```

w: tensor([ 0.6614, 0.2669, 0.0617, 0.6213, -0.4519, -0.1661, -1.5228, 0.3817, -1.0276], requires_grad=True)
a: tensor([[ 0.5961, 1.1220, 1.6414, 0.3710],
[ 0.0363, -0.7368, -2.2987, 0.3278]])
y_pred: tensor([2.9206e-06, 1.0000e+00, 1.0000e+00, 1.0000e+00])
loss: 1.0000028610229492

If you create a new tensor from existing values (e.g. with `torch.Tensor(...)`), the result is detached from the computation graph, so no gradient can flow back to the inputs.
The creation of your `matrix` tensor might be the issue here.
Could you try to rewrite the creation using `torch.cat` instead?
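
To illustrate the point, here is a minimal sketch. The names `w`, `copied`, and `joined` are made up for this example and are not taken from the code in the question. It assumes nothing beyond standard autograd behavior: a tensor built from plain values has no history, while `torch.stack` (like `torch.cat`) is a differentiable operation.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)  # illustrative trainable scalar

# A brand-new tensor built from plain values has no autograd history.
copied = torch.tensor([1.0, (3 * w).item()])
print(copied.requires_grad, copied.grad_fn)  # False None

try:
    copied.sum().backward()
except RuntimeError as err:
    print("RuntimeError:", err)  # the "does not require grad" error

# torch.stack is a differentiable op, so the result stays connected to w.
joined = torch.stack([torch.tensor(1.0), 3 * w])
joined.sum().backward()
print(w.grad)  # tensor(3.)
```

Checking `requires_grad` and `grad_fn` on intermediate tensors like this is a quick way to find where the graph gets cut.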

Let me know if that helps or if we need to dig a bit deeper.

Thank you.
I changed the function `forward` as shown below.

```python
def forward(coordinate, w, layer_no):  # layer_no from 0 to '(the length of eta_n) - 1'
    row1 = torch.Tensor([[1, delta_eta[layer_no]]])
    row2_col1 = torch.Tensor([-delta_eta[layer_no]])
    row2_col2 = w
    row2 = torch.cat([row2_col1, row2_col2], dim=0).reshape(1, -1)
    matrix = torch.cat([row1, row2], dim=0)

    return acti1(matrix.mm(coordinate), layer_no)
```

I also changed

```python
w = torch.randn(9, requires_grad=True)
```

to

```python
w = torch.randn((9, 1), requires_grad=True)
```

to match the change in `forward`, and the error went away.
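
Note that the revised `forward` puts `w` itself in the lower-right entry, while the first version used `1 - delta_eta[layer_no] * w`. If the original expression is what is intended, the matrix could be assembled roughly as in the sketch below. `build_matrix` is a hypothetical helper, `coordinate` is a placeholder input, and the sketch assumes `w_j` is a 0-dim slice such as `w[j]` taken from `w = torch.randn(9, requires_grad=True)`.

```python
import torch

def build_matrix(w_j, d):
    """Return [[1, d], [-d, 1 - d * w_j]] without leaving the autograd graph.

    Hypothetical helper: w_j is a 0-dim trainable tensor, d a 0-dim constant.
    """
    one = torch.ones_like(d)
    row1 = torch.stack([one, d])
    row2 = torch.stack([-d, 1 - d * w_j])
    return torch.stack([row1, row2])

w = torch.randn(9, requires_grad=True)
d = torch.tensor(-0.1)
coordinate = torch.ones(2, 4)  # placeholder input with the same shape as `a`

out = build_matrix(w[0], d).mm(coordinate)
out.sum().backward()
print(w.grad)  # only w.grad[0] is nonzero, since only w[0] was used
```

With this variant `w` can keep its original shape `(9,)`, because `torch.stack` accepts 0-dim tensors directly.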

However, another error appeared: ‘RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time’. I restructured all of my code to avoid it, but I could not find a solution, and the simplified code above still produces the same error. Could you help me again? I have searched a lot, but I could not understand the explanations I found. Specifying `retain_graph = True` made the running time increase exponentially.
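
A possible explanation, judging only from the simplified loop above: `a` is overwritten inside the training loop, so from the second iteration on the forward pass starts from a tensor that still carries the previous iteration's graph, whose buffers were freed by the first `backward()`. The sketch below shows one way to avoid this. It uses a made-up update rule as a stand-in for the original `forward`, and `a0`, `target`, and `losses` are names introduced here. The input is kept in a separate variable, every iteration restarts from it, and `optimizer.zero_grad()` is called so gradients do not accumulate.

```python
import torch

torch.manual_seed(1)
w = torch.randn(3, requires_grad=True)   # illustrative parameters
a0 = torch.tensor([0.5, 1.0, 1.5])       # fixed input, never overwritten
target = torch.zeros(3)
optimizer = torch.optim.Adam([w], lr=0.01)

losses = []
for i in range(5):
    optimizer.zero_grad()      # clear gradients from the previous step
    a = a0                     # restart from the original input each iteration
    for j in range(3):
        a = a + w[j] * a ** 2  # stand-in for the layer-by-layer forward pass
    loss = (target - a).abs().sum()
    loss.backward()            # no retain_graph needed: the graph is rebuilt
    optimizer.step()
    losses.append(loss.item())

print(losses)
```

Because each iteration builds a fresh graph from `a0`, `retain_graph=True` is not needed and the cost per iteration stays constant.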