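Because clone is also an operation recorded in the computation graph, q_prime is an intermediate (non-leaf) tensor. By default, intermediate tensors do not retain their gradients. In your case the gradient flows back through the clone and is eventually accumulated in q.grad. If you want q_prime to retain its gradient, you need to call q_prime.retain_grad() before calling backward().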
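
A minimal sketch of this behaviour, assuming q is a leaf tensor created with requires_grad=True. The values and the loss are made up for illustration and are not taken from your code:

```python
import torch

# Hypothetical leaf tensor standing in for q from the question.
q = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# clone() is tracked by autograd, so q_prime is a non-leaf tensor.
q_prime = q.clone()
q_prime.retain_grad()  # ask autograd to keep q_prime.grad after backward()

# Illustrative loss; any differentiable function of q_prime would do.
loss = (q_prime ** 2).sum()
loss.backward()

print(q_prime.is_leaf)  # False: q_prime is the result of an operation
print(q_prime.grad)     # populated only because retain_grad() was called
print(q.grad)           # the gradient also flows through the clone into q
```

Without the retain_grad() call, q_prime.grad would stay None after backward(), while q.grad would still be filled in.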