NaN value for loss

I have a convNet:

import torch
import torch.nn as nn
import torch.nn.functional as nnFunctions

class convNet(nn.Module):
    def __init__(self):
        super(convNet, self).__init__()
        #defining layers in convnet
        #input size=1*657*1625
        self.conv1 = nn.Conv2d(1,8, kernel_size=3,stride=1,padding=1)
        self.conv2 = nn.Conv2d(8,16, kernel_size=3,stride=1,padding=1)
        self.pconv1= nn.Conv2d(16,16, kernel_size=(3,3),stride=1,padding=(1,1))
        self.pconv2= nn.Conv2d(16,16, kernel_size=(3,7),stride=1,padding=(1,3))
        self.pconv3= nn.Conv2d(16,16, kernel_size=(7,3),stride=1,padding=(3,1))

        self.conv3 = nn.Conv2d(16,8,kernel_size=3,stride=1,padding=1)
        self.conv4 = nn.Conv2d(8,1,kernel_size=3,stride=1,padding=1)      
    def forward(self, x):
        x = nnFunctions.leaky_relu(self.conv1(x))
        x = nnFunctions.leaky_relu(self.conv2(x))
        x = nnFunctions.leaky_relu(self.pconv1(x))+nnFunctions.leaky_relu(self.pconv2(x))+nnFunctions.leaky_relu(self.pconv3(x))
        x = nnFunctions.leaky_relu(self.conv3(x))
        x = nnFunctions.leaky_relu(self.conv4(x))
        return x

L1Loss function:

def L1Loss(outputs, targets):
    # sum of absolute elementwise differences
    return (outputs - targets).abs().sum()
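As a quick sanity check (a standalone sketch, assuming modern PyTorch where plain tensors are used instead of `Variable`), an L1 loss of this form can be verified on small tensors:

```python
import torch

def l1_loss(outputs, targets):
    # sum of absolute elementwise differences
    return (outputs - targets).abs().sum()

outputs = torch.tensor([1.0, 2.0, 4.0])
targets = torch.tensor([1.0, 0.0, 1.0])
print(l1_loss(outputs, targets).item())  # |0| + |2| + |3| = 5.0
```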

When I train the above CNN without the following line in the forward pass:

x = nnFunctions.leaky_relu(self.pconv1(x))+nnFunctions.leaky_relu(self.pconv2(x))+nnFunctions.leaky_relu(self.pconv3(x))

I get finite values for the loss, but if I add that line back into forward, the loss becomes NaN.
Can someone explain why?
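One way to narrow this down (a debugging sketch, not from the original post) is to check intermediate activations for non-finite values as they flow through `forward`, so you can see which layer first produces them:

```python
import torch

def has_nonfinite(t):
    # True if the tensor contains any NaN or +/-Inf
    return bool((~torch.isfinite(t)).any())

# inf - inf produces NaN, which is one common way a diverging
# loss shows up as nan rather than just a very large number
x = torch.tensor([float('inf')])
print(has_nonfinite(x - x))  # True
print(has_nonfinite(torch.tensor([1.0, 2.0])))  # False
```

Calling `has_nonfinite(x)` after each layer inside `forward` pinpoints where the NaNs first appear.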

Maybe your learning rate is too high for the additional layers you added? (Your network is not scale-invariant layer-wise.) Either lowering the learning rate or adding BatchNorm should help.
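A minimal sketch of how BatchNorm could be inserted after the summed parallel branches (the placement and the `bn` name are my assumptions, not from the original post):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class convNetBN(nn.Module):
    def __init__(self):
        super(convNetBN, self).__init__()
        self.conv1 = nn.Conv2d(1, 8, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(8, 16, kernel_size=3, stride=1, padding=1)
        self.pconv1 = nn.Conv2d(16, 16, kernel_size=(3, 3), stride=1, padding=(1, 1))
        self.pconv2 = nn.Conv2d(16, 16, kernel_size=(3, 7), stride=1, padding=(1, 3))
        self.pconv3 = nn.Conv2d(16, 16, kernel_size=(7, 3), stride=1, padding=(3, 1))
        # normalizes the summed branch activations, keeping their scale bounded
        self.bn = nn.BatchNorm2d(16)
        self.conv3 = nn.Conv2d(16, 8, kernel_size=3, stride=1, padding=1)
        self.conv4 = nn.Conv2d(8, 1, kernel_size=3, stride=1, padding=1)

    def forward(self, x):
        x = F.leaky_relu(self.conv1(x))
        x = F.leaky_relu(self.conv2(x))
        x = (F.leaky_relu(self.pconv1(x))
             + F.leaky_relu(self.pconv2(x))
             + F.leaky_relu(self.pconv3(x)))
        x = self.bn(x)  # rescale before the remaining convolutions
        x = F.leaky_relu(self.conv3(x))
        x = F.leaky_relu(self.conv4(x))
        return x
```

The three parallel branches are summed, so their output can be up to three times the scale of a single branch; normalizing right after the sum is one natural place for BatchNorm.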

Why is the loss NaN, though? If the learning rate is too high, shouldn't the loss just be high, not NaN?