I have to refactor an old physics-informed neural network (PINN) code for further experiments. The old code is in notebook form and contains a lot of explicit computational expressions; the new implementation, as you can see below, is much more flexible, which is necessary for our current work. The following are two snippets of the function I found to be the culprit. I replaced the old code with new code incrementally, testing each change by training the model, comparing metrics, and visualizing the functions modeled by the PINN. All of my new code changes have no unintended effect and reproduce the results of the old notebook with error differences on the order of 1e-5.
The only problem is that when I use the new net_f function defined below, training slows down by a large margin. For instance, with the old notebook's net_f the error reaches the 1e-03 range by 1900 iterations, whereas with the new net_f definition the error is still in the 1e-1 range.
Both implementations use the LBFGS optimizer with the following config:
# optimizers: using the same settings
self.optimizer = torch.optim.LBFGS(
    self.dnn.parameters(),
    lr=1.0,
    max_iter=50000,
    max_eval=50000,
    history_size=50,
    tolerance_grad=1e-5,
    tolerance_change=1.0 * np.finfo(float).eps,
    line_search_fn="strong_wolfe"  # can be "strong_wolfe"
)
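For context, the optimizer is driven through a closure, as torch.optim.LBFGS requires. A minimal standalone sketch of that pattern (toy model and loss, not the notebook's actual training loop):

import torch

# Toy example of the LBFGS + closure pattern; LBFGS may re-evaluate the
# objective several times per step, so step() takes a closure.
model = torch.nn.Linear(2, 1)
optimizer = torch.optim.LBFGS(model.parameters(), lr=1.0, max_iter=50,
                              line_search_fn="strong_wolfe")
x = torch.randn(16, 2)
y = torch.randn(16, 1)

def closure():
    optimizer.zero_grad()
    loss = torch.mean((model(x) - y) ** 2)
    loss.backward()
    return loss

optimizer.step(closure)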
Note: self.names is just the string "uvpsab". These are the new helper functions:
def cleave_and_name(self, input_tensor):
    return OrderedDict([(name, input_tensor[:, i:j])
                        for name, i, j in zip(self.names,
                                              range(NCOMP),
                                              range(1, NCOMP + 1))])

def name_the_tensors(self, tensors):
    return OrderedDict([(name, i) for name, i in zip(self.names, tensors)])
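For clarity, here is a small standalone check of what cleave_and_name produces. NCOMP = 6 is my assumption matching the six components, and the free function below just mirrors the method above:

import torch
from collections import OrderedDict

NCOMP = 6
names = "uvpsab"

def cleave_and_name(input_tensor):
    # split a (N, 6) network output into six named (N, 1) columns
    return OrderedDict([(name, input_tensor[:, i:j])
                        for name, i, j in zip(names, range(NCOMP), range(1, NCOMP + 1))])

y = torch.randn(4, NCOMP)
parts = cleave_and_name(y)
print(list(parts.keys()))   # ['u', 'v', 'p', 's', 'a', 'b']
print(parts['u'].shape)     # torch.Size([4, 1])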
def net_f(self, x, t):
    """ The pytorch autograd version of calculating residual """
    y = self.net_u(x, t)
    F = self.cleave_and_name(y).values()

    def compute_grad(diff, wrt):
        return torch.autograd.grad(diff, wrt,
                                   grad_outputs=torch.ones_like(diff),
                                   retain_graph=True,
                                   create_graph=True)[0]

    Ft = [compute_grad(i, t) for i in F]
    Fx = [compute_grad(i, x) for i in F]
    Fxx = [compute_grad(i, x) for i in Fx]

    def compute_output(f, f_t, f_xx, c1, c2, c1_sign):
        return f_t + (c1_sign * c1 * f_xx) + ((-1 * c1_sign) * c2 * f)

    N_F = len(F)
    SQ = sum([i**2 for i in F])
    C1 = [0.5] * N_F
    C2 = [SQ] * N_F
    C1_SIGN = [pow(-1, i) for i in range(1, N_F + 1)]
    F_output = [compute_output(*i)
                for i in zip(F, Ft, Fxx, C1, C2, C1_SIGN)]
    return self.name_the_tensors(F_output)
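Expanding compute_output for the first two components (C1_SIGN starts at -1 and then alternates), the residuals the new code builds are u_t - 0.5*u_xx + SQ*u and v_t + 0.5*v_xx - SQ*v, with the same pairwise sign pattern for (p, s) and (a, b).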
The old implementation is as follows.
def net_f(self, x, t):
    """ The pytorch autograd version of calculating residual """
    y = self.net_u(x, t)
    u = y[:, 0:1]
    v = y[:, 1:2]
    p = y[:, 2:3]
    s = y[:, 3:4]
    a = y[:, 4:5]
    b = y[:, 5:6]

    u_t = torch.autograd.grad(u, t, grad_outputs=torch.ones_like(u), retain_graph=True, create_graph=True)[0]
    v_t = torch.autograd.grad(v, t, grad_outputs=torch.ones_like(v), retain_graph=True, create_graph=True)[0]
    p_t = torch.autograd.grad(p, t, grad_outputs=torch.ones_like(p), retain_graph=True, create_graph=True)[0]
    s_t = torch.autograd.grad(s, t, grad_outputs=torch.ones_like(s), retain_graph=True, create_graph=True)[0]
    a_t = torch.autograd.grad(a, t, grad_outputs=torch.ones_like(a), retain_graph=True, create_graph=True)[0]
    b_t = torch.autograd.grad(b, t, grad_outputs=torch.ones_like(b), retain_graph=True, create_graph=True)[0]

    u_x = torch.autograd.grad(u, x, grad_outputs=torch.ones_like(u), retain_graph=True, create_graph=True)[0]
    v_x = torch.autograd.grad(v, x, grad_outputs=torch.ones_like(v), retain_graph=True, create_graph=True)[0]
    p_x = torch.autograd.grad(p, x, grad_outputs=torch.ones_like(p), retain_graph=True, create_graph=True)[0]
    s_x = torch.autograd.grad(s, x, grad_outputs=torch.ones_like(s), retain_graph=True, create_graph=True)[0]
    a_x = torch.autograd.grad(a, x, grad_outputs=torch.ones_like(a), retain_graph=True, create_graph=True)[0]
    b_x = torch.autograd.grad(b, x, grad_outputs=torch.ones_like(b), retain_graph=True, create_graph=True)[0]

    u_xx = torch.autograd.grad(u_x, x, grad_outputs=torch.ones_like(u_x), retain_graph=True, create_graph=True)[0]
    v_xx = torch.autograd.grad(v_x, x, grad_outputs=torch.ones_like(v_x), retain_graph=True, create_graph=True)[0]
    p_xx = torch.autograd.grad(p_x, x, grad_outputs=torch.ones_like(p_x), retain_graph=True, create_graph=True)[0]
    s_xx = torch.autograd.grad(s_x, x, grad_outputs=torch.ones_like(s_x), retain_graph=True, create_graph=True)[0]
    a_xx = torch.autograd.grad(a_x, x, grad_outputs=torch.ones_like(a_x), retain_graph=True, create_graph=True)[0]
    b_xx = torch.autograd.grad(b_x, x, grad_outputs=torch.ones_like(b_x), retain_graph=True, create_graph=True)[0]

    SQ = u**2 + v**2 + p**2 + s**2 + a**2 + b**2
    f_u = v_t - 0.5*u_xx + SQ*u
    f_v = u_t + 0.5*v_xx - SQ*v
    f_p = s_t - 0.5*p_xx + SQ*p
    f_s = p_t + 0.5*s_xx - SQ*s
    f_a = b_t - 0.5*a_xx + SQ*a
    f_b = a_t + 0.5*b_xx - SQ*b
    return f_u, f_v, f_p, f_s, f_a, f_b
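For completeness, this is the kind of spot check I mean by comparing the two versions. model_old and model_new are hypothetical handles to the two implementations initialized with identical weights, so this is a sketch rather than the exact code I run:

import torch

# Hypothetical spot check: evaluate both residual functions on the same points
# with shared weights and look at the element-wise differences.
x = torch.rand(64, 1, requires_grad=True)
t = torch.rand(64, 1, requires_grad=True)

res_new = list(model_new.net_f(x, t).values())   # new version returns an OrderedDict
res_old = list(model_old.net_f(x, t))            # old version returns a tuple

for name, r_new, r_old in zip("uvpsab", res_new, res_old):
    print(name, (r_new - r_old).abs().max().item())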