Hi, I am trying to implement some code from TensorFlow, but I am uncertain how to implement their method of backpropagation. Here is their code:

```python
self.IRK_weights = np.reshape(tmp[0:q**2+q], (q+1, q))
self.x0_tf = tf.placeholder(tf.float32, shape=(None, self.x0.shape[1]))
self.u0_tf = tf.placeholder(tf.float32, shape=(None, self.u0.shape[1]))
self.dummy_x0_tf = tf.placeholder(tf.float32, shape=(None, self.q))  # dummy variable for fwd_gradients

def fwd_gradients_0(self, U, x):
    g = tf.gradients(U, x, grad_ys=self.dummy_x0_tf)[0]
    return tf.gradients(g, self.dummy_x0_tf)[0]

def net_U0(self, x):
    nu = 0.01/np.pi
    U1 = self.neural_net(x, self.weights, self.biases)
    U = U1[:, :-1]
    U_x = self.fwd_gradients_0(U, x)
    U_xx = self.fwd_gradients_0(U_x, x)
    F = -U*U_x + nu*U_xx
    U0 = U1 - self.dt*tf.matmul(F, self.IRK_weights.T)
    return U0
```

The key part I am struggling to understand is how they calculate the x derivative of `U` with the function `fwd_gradients_0`, which relies on a dummy variable. Here is how I tried to implement it in PyTorch, which seems to be wrong:

```python
x.requires_grad_(True)
s = nn.Tanh()(self.fc1(x.float()))
s = nn.Tanh()(self.fc2(s))
s = nn.Tanh()(self.fc3(s))
u_1 = self.fc4(s)
u = u_1[:, :-1]
u_x = autograd.grad(u, x, torch.ones(u.shape).to(device), create_graph=True)[0]
u_xx = autograd.grad(u_x, x, torch.ones(u_x.shape).to(device), create_graph=True)[0]
f = - u * u_x + (0.01/np.pi)*u_xx
u_0 = u_1 - dt * torch.mm(f.double(), IRK_weights.T.double())
```
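For reference, my current understanding is that the dummy variable turns two reverse-mode `tf.gradients` calls into a forward-mode derivative, so that each of the `q` output columns gets its own derivative with respect to `x`, whereas my `torch.ones` approach above sums the columns together. Here is a sketch of how I think that trick would look in PyTorch (the helper name `fwd_gradients` and the toy test function are mine) — is this the right translation?

```python
import torch

def fwd_gradients(U, x):
    # The dummy tensor plays the role of dummy_x0_tf; it must require grad
    # so the second autograd call can differentiate with respect to it.
    dummy = torch.ones_like(U, requires_grad=True)
    # First reverse pass: g = dummy^T * dU/dx, which is a function of dummy.
    g = torch.autograd.grad(U, x, grad_outputs=dummy, create_graph=True)[0]
    # Second reverse pass: differentiating g w.r.t. dummy recovers dU/dx
    # column by column (a Jacobian-vector product, i.e. forward mode).
    return torch.autograd.grad(g, dummy, grad_outputs=torch.ones_like(g),
                               create_graph=True)[0]

# Quick check on a function with known derivatives:
x = torch.linspace(-1.0, 1.0, 5).reshape(-1, 1).requires_grad_(True)
U = torch.cat([x**2, torch.sin(x)], dim=1)   # shape (N, 2)
U_x = fwd_gradients(U, x)                    # shape (N, 2), should be [2x, cos(x)]
```

If this is correct, then I would replace my two `autograd.grad(..., torch.ones(...))` lines with `u_x = fwd_gradients(u, x)` and `u_xx = fwd_gradients(u_x, x)`.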

Any help would be greatly appreciated to help me understand/solve this!