Converting code from TensorFlow, getting wrong results (odd implementation of gradient calculation)

Zador · May 22, 2020, 9:02am

Hi, I am trying to port some code from TensorFlow to PyTorch, but I am uncertain how to implement their method of backpropagation. Here is their code:

       self.IRK_weights = np.reshape(tmp[0:q**2+q], (q+1,q))
       self.x0_tf = tf.placeholder(tf.float32, shape=(None, self.x0.shape[1]))
       self.u0_tf = tf.placeholder(tf.float32, shape=(None, self.u0.shape[1]))
       self.dummy_x0_tf = tf.placeholder(tf.float32, shape=(None, self.q)) #dummy variable for fwd_gradients


def fwd_gradients_0(self, U, x):
    g = tf.gradients(U, x, grad_ys=self.dummy_x0_tf)[0]
    return tf.gradients(g, self.dummy_x0_tf)[0]


def net_U0(self, x):
    nu = 0.01/np.pi
    U1 = self.neural_net(x, self.weights, self.biases)
    U = U1[:,:-1]
    U_x = self.fwd_gradients_0(U, x)
    U_xx = self.fwd_gradients_0(U_x, x)
    F = -U*U_x + nu*U_xx
    U0 = U1 - self.dt*tf.matmul(F, self.IRK_weights.T)
    return U0

The part I am struggling to understand is where they calculate the x-derivative of U with the function fwd_gradients_0, which also uses a dummy variable. Here is how I tried to implement it, which seemingly gives wrong results:

x.requires_grad_(True)
s = nn.Tanh()(self.fc1(x.float()))
s = nn.Tanh()(self.fc2(s))
s = nn.Tanh()(self.fc3(s))
u_1 = self.fc4(s)

u = u_1[:, :-1]
u_x = autograd.grad(u, x, torch.ones(u.shape).to(device), create_graph=True)[0]
u_xx = autograd.grad(u_x, x, torch.ones(u_x.shape).to(device), create_graph=True)[0]
f = - u * u_x + (0.01/np.pi)*u_xx
u_0 = u_1 - dt * torch.mm(f.double(), IRK_weights.T.double())

Any help would be greatly appreciated to help me understand/solve this!

Zador · May 22, 2020, 10:21am

I can’t edit my post, but by following what they did I have in fact solved the problem (sorry to those who tried).

def fwd_gradients_0(self, U, x):
    z = torch.ones(U.shape).to(device).requires_grad_(True)
    g = autograd.grad(U, x, grad_outputs=z, create_graph=True)[0]
    return autograd.grad(g, z, grad_outputs=torch.ones(g.shape).to(device), create_graph=True)[0]

I can’t say I understand this, though. What does taking the derivative with respect to z, the grad_outputs tensor from the previous call, actually mean?
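
One way to probe this is to compare the trick against a case where the derivatives are known in closed form. The sketch below is a toy under stated assumptions, not the PINN code: the network is replaced by a hypothetical U[n, j] = sin((j + 1) * x[n]), and fwd_gradients is a standalone copy of the method above that uses ones_like instead of the global device. The first call returns g = J^T z, which is linear in z, so differentiating g with respect to z (with grad_outputs of ones) should return J·1, i.e. each output column's own derivative with respect to x. A single autograd.grad call with grad_outputs of ones returns J^T·1 instead, the sum over the columns.

```python
import torch

torch.manual_seed(0)


def fwd_gradients(U, x):
    # z is a dummy grad_outputs tensor; its values do not matter because g is linear in z.
    z = torch.ones_like(U, requires_grad=True)
    # Vector-Jacobian product: g = J^T z, shape of x.
    g = torch.autograd.grad(U, x, grad_outputs=z, create_graph=True)[0]
    # Differentiating g w.r.t. z recovers the Jacobian-vector product J @ ones, shape of U.
    return torch.autograd.grad(g, z, grad_outputs=torch.ones_like(g), create_graph=True)[0]


# Hypothetical stand-in for the network: N sample points, q outputs per point.
N, q = 5, 3
x = torch.rand(N, 1, requires_grad=True)
k = torch.arange(1, q + 1, dtype=x.dtype)      # shape (q,)
U = torch.sin(x * k)                           # shape (N, q)

U_x = fwd_gradients(U, x)                      # shape (N, q): dU[n, j] / dx[n]
U_xx = fwd_gradients(U_x, x)                   # shape (N, q): second derivative

# Closed-form derivatives of the toy function.
expected_x = k * torch.cos(x * k)
expected_xx = -(k ** 2) * torch.sin(x * k)

# Single backward pass with ones: sums the q derivatives into one column.
summed = torch.autograd.grad(U, x, torch.ones_like(U), create_graph=True)[0]  # shape (N, 1)

print(torch.allclose(U_x, expected_x, atol=1e-4))
print(torch.allclose(U_xx, expected_xx, atol=1e-4))
print(torch.allclose(summed, expected_x.sum(dim=1, keepdim=True), atol=1e-4))
```

If that reading is right, it would also explain the first attempt: u_x there has shape (N, 1), the q per-output derivatives summed together, rather than (N, q).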

demacdo · October 3, 2020, 2:07am

Hey, thanks for posting your solution to this; it works for me. Did you ever figure out why they take the gradients that way? I’m also trying to figure out the discrete PINNs code.

Zador · October 11, 2020, 4:53pm

No, never. Let me know if you ever do, though!

sunny1 · March 20, 2021, 1:46am

Hi, did you guys find out how to code the derivative part? I am also trying to solve a PINN problem.