Self-adaptive weight in PINN

Ali_geo · February 13, 2023, 2:07pm

Dear community,

I want to develop a physics-informed neural network (PINN) model in PyTorch. My network should be trained on two losses: a boundary condition (BC) loss and a partial differential equation (PDE) loss. I am adding these two losses, but the problem is that the BC loss dominates the total loss, as in the following figure:

How can I balance them? This is my script:

class FCN(nn.Module):
    ##Neural Network
    def __init__(self,layers):
        super().__init__() #call __init__ from parent class 
        self.activation = nn.Tanh()
        self.loss_function = nn.MSELoss(reduction ='mean')
        'Initialise neural network as a list using nn.Modulelist'  
        self.linears = nn.ModuleList([nn.Linear(layers[i], layers[i+1]) for i in range(len(layers)-1)]) 
        self.iter = 0
        'Xavier Normal Initialization'
        for i in range(len(layers)-1):         
            nn.init.xavier_normal_(self.linears[i].weight.data, gain=1.0)            
            nn.init.zeros_(self.linears[i].bias.data)   
    # forward pass
    def forward(self,x):
        if not torch.is_tensor(x):
            x = torch.from_numpy(x)
        a = x.float()
        # hidden layers; len(self.linears) avoids relying on a global `layers`
        for i in range(len(self.linears)-1):
            z = self.linears[i](a)              
            a = self.activation(z)    
        a = self.linears[-1](a)
        return a
    'Loss Functions'
    #Loss BC
    def lossBC(self, x_BC, y_BC):
        loss_BC = self.loss_function(self.forward(x_BC),y_BC)
        return loss_BC
    #Loss PDE
    def lossPDE(self,x_PDE):
        g = x_PDE.clone()
        g.requires_grad = True # Enable differentiation
        f = self.forward(g)
        f_x_t = torch.autograd.grad(f,g,torch.ones([g.shape[0],1]).to(device),retain_graph=True, create_graph=True)[0] #first derivative
        f_xx_tt = torch.autograd.grad(f_x_t,g,torch.ones(g.shape).to(device), create_graph=True)[0]#second derivative
        f_t = f_x_t[:,[1]]
        f_xx = f_xx_tt[:,[0]]
        f = f_t - alpha * f_xx
        return self.loss_function(f,f_hat)
      
    def loss(self,x_BC,y_BC,x_PDE):
        loss_bc = self.lossBC(x_BC.float(),y_BC.float())
        loss_pde = self.lossPDE(x_PDE.float())
        return loss_bc.float() + loss_pde.float()

I appreciate any help in advance.

Jamie_Donnelly · February 13, 2023, 2:29pm

It’s hard to say given this alone, but since the high BC loss is the only ‘test’ loss in the plot, is this just a case of poor generalisation?

Based on my own work, I found that when your total loss is a linear combination of different sub-losses, you should try to ensure that the sub-losses are on a similar scale. If an average value for sub-loss 1 is 10 but for sub-loss 2 it’s 0.01, the total loss is dominated by the first term. During training, the weights will then be tuned in the direction that gives the biggest reduction in the overall loss, i.e. they will be tuned to reduce sub-loss 1, not sub-loss 2.
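One way to act on the scale mismatch described above is to let the optimiser learn a weight per sub-loss. The sketch below is not from this thread: it uses a scheme often called uncertainty weighting, total = sum_i exp(-s_i) * L_i + s_i, and the names LearnedLossWeights and log_vars are made up for illustration. The loss values 10 and 0.01 are stand-ins, not measurements from the original script.

```python
import torch
import torch.nn as nn


class LearnedLossWeights(nn.Module):
    """Hypothetical helper: combine sub-losses as sum_i exp(-s_i) * L_i + s_i."""

    def __init__(self, n_losses):
        super().__init__()
        # s_i = 0 at the start, so every weight exp(-s_i) starts at 1.
        self.log_vars = nn.Parameter(torch.zeros(n_losses))

    def forward(self, losses):
        stacked = torch.stack(list(losses))
        # The "+ s_i" term stops the weights from collapsing to zero.
        return (torch.exp(-self.log_vars) * stacked + self.log_vars).sum()


weights = LearnedLossWeights(2)
loss_bc = torch.tensor(10.0)   # stand-in value for the large sub-loss
loss_pde = torch.tensor(0.01)  # stand-in value for the small sub-loss
total = weights([loss_bc, loss_pde])
total.backward()
# The gradient on s_0 is negative, so a gradient-descent step raises s_0
# and lowers the weight exp(-s_0) on the large sub-loss.
print(weights.log_vars.grad)
```

For the weights to adapt, weights.parameters() would have to be passed to the optimiser together with the model parameters. Whether this helps for a given PINN is an empirical question.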

Cleaning up your code (things like self.loss_function = ... in __init__ are not good practice) and creating a Loss class for your problem would also help; see PINNs/losses.py at main · jamiedonnelly/PINNs · GitHub.

Ali_geo · February 13, 2023, 2:41pm

Dear Jamie Donnelly,

Thanks for dedicating time to my issue. As you mentioned, the relation between the two sub-losses is more complicated than a simple sum!
I went through the GitHub page you posted, but I still do not understand how to manage these two losses. Sorry for my confusion.

Jamie_Donnelly · February 13, 2023, 2:50pm

I would suggest making a class like,

class Ploss():
    def __init__(self):
        self.mse = nn.MSELoss()
        pass

    def _loss_BC(self,pred,true,**kwargs):
        ...
        return self.mse(pred,true)

    def _loss_PDE(self,pred,true,**kwargs):
        ...

    def __call__(self,pred,true):
        return self._loss_BC(...) + self._loss_PDE(...)

# Instantiate
criterion = Ploss()

# Calculate loss
loss = criterion(pred,true)

And then from there, during the training forward passes, you can print out the values of _loss_BC and _loss_PDE and look at the behaviour. You can see whether the loss values are changing at all, how much they’re changing by, is one clearly being prioritised over the other, etc…
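Printing the sub-losses as suggested above can be done with a small recorder. This is a minimal sketch; SubLossLog and its methods are hypothetical names, and the two tensors at the bottom are stand-in values.

```python
from collections import defaultdict

import torch


class SubLossLog:
    """Hypothetical recorder for per-step sub-loss values."""

    def __init__(self):
        self.history = defaultdict(list)

    def record(self, **losses):
        # Store plain floats so no autograd graph is kept alive.
        for name, value in losses.items():
            if torch.is_tensor(value):
                value = value.detach().item()
            self.history[name].append(float(value))

    def ratio(self, a, b):
        # Ratio of the latest values: a quick check of which term dominates.
        return self.history[a][-1] / self.history[b][-1]


log = SubLossLog()
log.record(bc=torch.tensor(2.0), pde=torch.tensor(0.5))  # stand-in values
print(log.ratio("bc", "pde"))
```

Calling log.record(...) once per training step gives a history that can be plotted to see whether either term is actually decreasing.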

Ali_geo · February 13, 2023, 2:58pm

Thanks for writing the code for me.
Should I copy the exact losses from my code into the code you have written? I mean, how are the losses supposed to be calculated? And in the end both losses are added, which is the same as what I had before.

Jamie_Donnelly · February 13, 2023, 3:18pm

Just create the methods for your sub-losses in an overall class like the template I showed. So your lossBC function becomes a method which can be accessed through Ploss._loss_BC(), and the same for the PDE. Then you can obtain the total loss in one call to criterion(.,.) and access the sub-losses easily.

In your original code, it’s not clear which are your predicted values and which are your targets, and it wouldn’t run as presented. For instance, self.loss_function(f,f_hat) is returned but f_hat is not defined anywhere.

Ali_geo · February 13, 2023, 3:32pm

Thanks for your help. f_hat is a tensor of zeros, used as the target so that the PDE residual at the collocation points is driven towards zero.
Meanwhile, I am using the class's self.forward to predict the values and then return the loss, and if I separate the losses into another class I do not know how to call self.forward from it.

Jamie_Donnelly · February 13, 2023, 3:48pm

I think much of the confusion comes down to poor code structure. You should define your model and your loss functions separately, e.g.,

class FCN(nn.Module):
    def __init__(self, layers):
        super().__init__()  # call __init__ from parent class
        self.activation = nn.Tanh()
        # Initialise neural network as a list using nn.ModuleList
        self.linears = nn.ModuleList([nn.Linear(layers[i], layers[i+1]) for i in range(len(layers)-1)])
        self.iter = 0
        # Xavier normal initialisation
        for i in range(len(layers)-1):
            nn.init.xavier_normal_(self.linears[i].weight.data, gain=1.0)
            nn.init.zeros_(self.linears[i].bias.data)

    # forward pass (the loss now lives outside the model)
    def forward(self, x):
        if not torch.is_tensor(x):
            x = torch.from_numpy(x)
        a = x.float()
        # hidden layers; len(self.linears) avoids relying on a global `layers`
        for i in range(len(self.linears)-1):
            z = self.linears[i](a)
            a = self.activation(z)
        a = self.linears[-1](a)  # linear output layer
        return a

model = FCN(...)
optim = ...
criterion = Ploss(...)

for epoch in epochs:
    for x,y in batch:
        # zero out gradients
        optim.zero_grad()

        # fwd pass to calculate output
        output = model(x)

        # calculate loss
        loss = criterion(output,y) # This can be your total physics loss 
        loss.backward()
        
        # update parameters after backprop
        optim.step()

        """ Or by doing each part separately.
           bc_loss = criterion._loss_BC(output,y)
           pde_loss = criterion._loss_PDE(output) # PDE residual based on pred 
           loss = bc_loss + pde_loss
           loss.backward()
        """

Look at existing PyTorch tutorials for more guidance on how to structure your code; from there you will be able to better determine what’s going wrong with the PINNs loss.
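On the question of how a separate loss class can still use the network's forward pass: one option is to pass the model in as an argument. The sketch below assumes the PDE from the original script, f_t - alpha * f_xx = 0, with input columns ordered (x, t). HeatPINNLoss and its method names are hypothetical. It also takes the second derivative from the x-column alone, which differs from the original script, where differentiating both columns at once adds a mixed derivative to the f_xx column.

```python
import torch
import torch.nn as nn


class HeatPINNLoss:
    """Hypothetical loss for u_t = alpha * u_xx; input columns are (x, t)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.mse = nn.MSELoss()
        self.last = {}  # latest sub-loss values, for printing or plotting

    def bc(self, model, x_bc, y_bc):
        return self.mse(model(x_bc), y_bc)

    def pde(self, model, x_pde):
        g = x_pde.clone().detach().requires_grad_(True)
        u = model(g)
        du = torch.autograd.grad(u, g, torch.ones_like(u), create_graph=True)[0]
        u_x, u_t = du[:, [0]], du[:, [1]]
        # Differentiate only u_x, then keep the x-column to get u_xx.
        u_xx = torch.autograd.grad(
            u_x, g, torch.ones_like(u_x), create_graph=True
        )[0][:, [0]]
        residual = u_t - self.alpha * u_xx
        return self.mse(residual, torch.zeros_like(residual))

    def __call__(self, model, x_bc, y_bc, x_pde):
        loss_bc = self.bc(model, x_bc, y_bc)
        loss_pde = self.pde(model, x_pde)
        self.last = {"bc": loss_bc.item(), "pde": loss_pde.item()}
        return loss_bc + loss_pde


# Usage with a stand-in network and random points (not the thread's data).
net = nn.Sequential(nn.Linear(2, 8), nn.Tanh(), nn.Linear(8, 1))
criterion = HeatPINNLoss(alpha=1.0)
loss = criterion(net, torch.rand(4, 2), torch.zeros(4, 1), torch.rand(16, 2))
loss.backward()
print(criterion.last)
```

Because the model is only an argument, the same criterion works with the FCN class above or with any other callable that maps an (N, 2) tensor to an (N, 1) tensor.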

Ali_geo · February 13, 2023, 3:51pm

Thanks a lot for your patience and support.