"One of the differentiated Tensors appears to not have been used" - Can't figure out why

Hello, I'm trying to implement a Differential Dynamic Programming (DDP) algorithm using PyTorch and I keep getting:

RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.

I've seen numerous posts about this problem, but I wasn't able to grasp why it is happening in my code:

def optimize(
    self,
    x0: torch.Tensor,
    x_goal: torch.Tensor,
    N: int = None,
    U0: torch.Tensor = None,
    full_output: bool = False,
):
    if not N and U0 is None:
        raise ValueError("You must provide either trajectory length N or initial control sequence U0")
    if U0 is not None:
        N = len(U0)
        U = U0.clone().detach().requires_grad_(True)
    else:
        assert N > 0
        U = torch.rand(N, self.Nu, requires_grad=True) * 2 - 1  # Uniform between -1 and 1

    def J(X, U):
        total_cost = sum(self.inst_cost(X[i], U[i], x_goal) for i in range(len(U)))
        total_cost += self.terminal_cost(X[-1], x_goal)
        return total_cost

    X = [x0.clone().detach().requires_grad_(True)]
    for i in range(N):
        X.append(self.dynamics(X[-1], U[i]))
    X = torch.stack(X)

    last_cost = J(X, U)

    if full_output:
        X_hist = [X.clone().detach()]
        U_hist = [U.clone().detach()]
        cost_hist = [last_cost.item()]

    for iteration in range(self.max_iters):
        Vx = torch.autograd.grad(self.terminal_cost(X[-1], x_goal), X[-1], retain_graph=True)[0]

The main script that calls this optimization function is:

def f(x, u, constrain=True):
    theta = torch.atan2(x[0], x[1])
    theta_dot = x[2].detach()
    torque = torch.tanh(u[0]) if constrain else u[0]
    theta_dot_dot = -3 * G * torch.sin(theta + torch.pi) / (2 * L) + 3 * torque / (M * L**2)
    theta += theta_dot * dt
    theta_dot += theta_dot_dot * dt
    return torch.stack([torch.sin(theta), torch.cos(theta), theta_dot])

def h(x, x_goal):
    error = x - x_goal
    Qt = 100 * torch.eye(3, dtype=torch.float32)
    result = error.view(1, -1) @ Qt @ error.view(-1, 1)
    return result.squeeze()

N = 100 # trajectory points
Nx = 3 # state dimension
Nu = 1 # control dimensions

x0 = torch.tensor([torch.sin(torch.tensor(torch.pi)), torch.cos(torch.tensor(torch.pi)), 0.0], requires_grad=True)

x_goal = torch.tensor([torch.sin(torch.tensor(0.0)), torch.cos(torch.tensor(0.0)), 0.0], requires_grad=True)
ddp = DDPOptimizer(Nx, Nu, f, g, h)
X, U, X_hist, U_hist, J_hist = ddp.optimize(x0, x_goal, N=N, full_output=True)

This implementation is based on the following blog post and code, which mainly use SymPy and NumPy:

DDP blog

I'd be happy for any help!

Your code is not executable as a few methods are missing. Could you create a minimal and executable code snippet reproducing the issue, please?
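
As a starting point, here is a minimal sketch of what such a snippet could look like. It is not your code: `terminal_cost` and the three-state toy trajectory are stand-ins made up for illustration. It does raise the same error message, under the assumption that the cause is `X[-1]` being indexed twice on the stacked tensor. Each indexing call returns a new tensor, so the one passed as the input to `torch.autograd.grad` is not the one the cost was computed from.

```python
import torch


def terminal_cost(x, x_goal):
    # Hypothetical stand-in for the missing terminal cost: 100 * ||x - x_goal||^2
    error = x - x_goal
    return 100 * (error @ error)


x_goal = torch.tensor([0.0, 1.0, 0.0])
x0 = torch.tensor([0.0, -1.0, 0.0], requires_grad=True)

# Toy "rollout": stack a list of states, as optimize() does with torch.stack(X)
X = torch.stack([x0, x0 * 0.5, x0 * 0.25])

# Fails: the two X[-1] expressions create two different tensor objects,
# so the second one never took part in computing the cost.
try:
    torch.autograd.grad(terminal_cost(X[-1], x_goal), X[-1], retain_graph=True)
    reproduced = False
except RuntimeError as err:
    reproduced = "appears to not have been used" in str(err)

# Works: index once and reuse the same tensor for the cost and for grad.
x_last = X[-1]
Vx = torch.autograd.grad(terminal_cost(x_last, x_goal), x_last, retain_graph=True)[0]

print(reproduced, Vx)
```

If your real code behaves differently from this sketch, a reduced version of your own `dynamics`, `inst_cost` and `terminal_cost` would still be needed to pin down the cause.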