I’m trying to include a physics-informed term in a DeepONet model in PyTorch. The model consists of two DNNs: one takes a single input vector, the other takes 3 inputs. The model then computes the dot product between the outputs of these DNNs. The resulting scalar is supposed to predict the value of a physical quantity at a certain point in space at a certain instant. The code below implements
both DNNs (branch and trunk) and the model:
import torch
import torch.nn as nn
import torch.nn.functional as F

class branchNet(nn.Module):
    """Branch network definition"""
    def __init__(self, inDim: int, nnDepth: int, nnWidth: int):
        super().__init__()
        # Input layer. Resizes input to desired network width
        self.inputLayer = nn.Linear(inDim, nnWidth)
        # Intermediate dense layers. Constant dimension
        self.MLPstack = nn.ModuleList([nn.Linear(nnWidth, nnWidth) for _ in range(nnDepth - 2)])
        # Output layer. Resizes intermediate representation to networkOutputDim
        self.outputLayer = nn.Linear(nnWidth, networkOutputDim)

    def forward(self, x):  # forward pass
        x = F.relu(self.inputLayer(x))
        for l in self.MLPstack:
            x = F.relu(l(x))
        return self.outputLayer(x)
class trunkNet(nn.Module):
    """Trunk network definition"""
    def __init__(self, nnDepth: int, nnWidth: int):
        super().__init__()
        # Input layers. Resize input to desired network width.
        # The trunk network receives an individual input
        # for each dimension (x, y, t)
        self.xCoord = nn.Linear(1, nnWidth)
        self.yCoord = nn.Linear(1, nnWidth)
        self.tCoord = nn.Linear(1, nnWidth)
        # Intermediate dense layers. Constant dimension
        self.MLPstack = nn.ModuleList([nn.Linear(nnWidth, nnWidth) for _ in range(nnDepth - 2)])
        # Output layer. Resizes intermediate representation to networkOutputDim
        self.outputLayer = nn.Linear(nnWidth, networkOutputDim)

    def forward(self, x, y, t):  # forward pass
        x = F.relu(self.xCoord(x))
        y = F.relu(self.yCoord(y))
        t = F.relu(self.tCoord(t))
        o = x + y + t  # combine the per-coordinate embeddings
        for l in self.MLPstack:
            o = F.relu(l(o))
        return self.outputLayer(o)
class PI_deepONet(nn.Module):
    """Class for physics-informed DeepONet"""
    def __init__(self, branch: nn.Module, trunk: nn.Module):
        super().__init__()
        self.branch = branch
        self.trunk = trunk

    def forward(self, case: torch.Tensor, x: torch.Tensor, y: torch.Tensor, t: torch.Tensor):
        # Multiply the branch output by the transposed trunk output;
        # the diagonal of the result holds the sample-by-sample dot products
        return F.relu(torch.diagonal(
            torch.matmul(
                self.branch(case),
                self.trunk(x, y, t).T
            )
        ).unsqueeze(1))
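(As an aside, since the diagonal of branch · trunkᵀ is just the row-wise dot product of the two outputs, the return statement above is equivalent to the snippet below, which avoids materializing the full [batch, batch] matrix. I mention it only to make explicit what the forward pass computes.)

        # Equivalent per-sample dot product, without the [batch, batch] intermediate:
        out = torch.sum(self.branch(case) * self.trunk(x, y, t), dim=1, keepdim=True)
        return F.relu(out)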
The physics-informed term I want to include is the residual of the PDE, which is calculated by differentiating the model’s output in certain ways. The PDE in question is p_xx + p_yy - p_tt / c^2 = 0, where ‘_xx’ denotes the second derivative with respect to x, p is the output of the model, and c is a constant.
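For context, this is roughly the residual function I’m ultimately trying to build on top of the model (a sketch; pde_residual and the helper d are just my names, and it assumes the autograd calls behave as expected):

def pde_residual(model, case, x, y, t, c):
    # Sketch of the residual I'm after: p_xx + p_yy - p_tt / c**2.
    # Assumes x, y, t were created with requires_grad=True.
    p = model(case, x, y, t)

    def d(out, inp):  # helper: derivative of `out` w.r.t. `inp`
        return torch.autograd.grad(
            out, inp,
            grad_outputs=torch.ones_like(out),
            create_graph=True
        )[0]

    p_xx = d(d(p, x), x)
    p_yy = d(d(p, y), y)
    p_tt = d(d(p, t), t)
    return p_xx + p_yy - p_tt / c ** 2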
I’ve already tried a bunch of things. Currently, I’m testing the gradients with the code below:
branchNetwork = branchNet(nSensor, branchDepth, branchWidth)
trunkNetwork = trunkNet(trunkDepth, trunkWidth)
model = PI_deepONet(branchNetwork, trunkNetwork)

outM = model(branchBatch, xBatch, yBatch, tBatch)

# first derivative of the output w.r.t. x
p_x = torch.autograd.grad(
    outM, xBatch,
    grad_outputs=torch.ones_like(outM),
    retain_graph=True,
    create_graph=True
)[0]
# second derivative of the output w.r.t. x
p_xx = torch.autograd.grad(
    p_x, xBatch,
    grad_outputs=torch.ones_like(p_x),
    retain_graph=True,
    create_graph=True
)[0]
But both gradients come back entirely zero. With a batch size of 256, xBatch, yBatch, and tBatch have shape [256, 1]; the branch network input is [256, 1000]; the model output is [256, 1]; and both gradients have shape [256, 1].
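For reference, the batch tensors for this test are built along these lines (random placeholder data; the coordinate tensors do have requires_grad set, otherwise torch.autograd.grad would raise an error rather than return zeros):

batchSize = 256
branchBatch = torch.rand(batchSize, nSensor)           # [256, 1000]
xBatch = torch.rand(batchSize, 1, requires_grad=True)  # [256, 1]
yBatch = torch.rand(batchSize, 1, requires_grad=True)  # [256, 1]
tBatch = torch.rand(batchSize, 1, requires_grad=True)  # [256, 1]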
Why are my gradients zero? How do I fix this?