What's wrong with our loss and PyTorch?

Given the samples $\vec{x}_i \in \mathbb{R}^d$, $i \in [1,\dots,l]$, where $l$ is the number of training samples and $d$ is the number of input features, the corresponding target values $y_i \in \mathbb{R}$, and the $l \times l$ matrix defined below:

$$ S_{i,j} = e^{-\gamma_S \|\vec{x}_i - \vec{x}_j\|^2} = e^{-\gamma_S \left(\vec{x}_i^\top \vec{x}_i - 2\,\vec{x}_i^\top \vec{x}_j + \vec{x}_j^\top \vec{x}_j\right)} $$
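For reference, here is a minimal sketch of one way $S$ could be precomputed, assuming the training inputs are stacked row-wise into an $l \times d$ tensor (the name `similarity_matrix` is ours, purely illustrative):

    import torch

    def similarity_matrix(X, gamma_S):
        # `X` is an (l, d) tensor holding the training samples row-wise.
        sq_norms = (X * X).sum(dim=1)  # x_i' x_i for every i
        # Expand ||x_i - x_j||^2 = x_i'x_i - 2 x_i'x_j + x_j'x_j via broadcasting
        sq_dists = sq_norms[:, None] - 2.0 * (X @ X.T) + sq_norms[None, :]
        # Clamp tiny negative values caused by floating-point round-off
        return torch.exp(-gamma_S * sq_dists.clamp(min=0.0))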

where $i \in [1,\dots,l]$, $j \in [1,\dots,l]$, and $\gamma_S$ is another hyper-parameter, we would like to use the following custom loss in PyTorch for a regression task:

$$\sum_{i=1}^l \sum_{j=1}^l \sqrt{|p_i-y_i|}\sqrt{|p_j-y_j|} S_{i,j}$$
where $p_i$ is the $i$-th prediction.
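Equivalently, stacking $s_i = \sqrt{|p_i - y_i|}$ into a vector $\vec{s} \in \mathbb{R}^l$, the double sum above is the quadratic form

$$\vec{s}^{\top} S \,\vec{s},$$

which is exactly what the `torch.dot(serr, torch.matmul(bm, serr))` line below computes for each batch.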

Our loss is implemented with this code:

import torch

def ourLoss(out, lab):
    global stra, sc
    # Square roots of the absolute errors, flattened to a vector s
    abserr = torch.abs(out - lab).flatten().float()
    serr = torch.sqrt(abserr)
    # Slice the block of S corresponding to the current batch
    bm = stra[sc : sc + out.shape[0], sc : sc + out.shape[0]].float()
    # Quadratic form: s' * S_batch * s
    loss = torch.dot(serr, torch.matmul(bm, serr))
    return loss

where `stra` is $S$ and `sc` is a counter used for batch evaluation. When we train with the Adam optimizer, the loss becomes NaN…
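For context, a sketch of how we call the loss per batch (`model`, `optimizer`, and `loader` are placeholder names, not our actual code); wrapping the run in `torch.autograd.set_detect_anomaly(True)` can help locate the op that first produces the NaN:

    # Hypothetical training loop showing how `sc` advances with each batch;
    # `model`, `optimizer`, and `loader` are placeholder names.
    torch.autograd.set_detect_anomaly(True)  # reports the backward op that first yields NaN
    sc = 0
    for xb, yb in loader:
        optimizer.zero_grad()
        out = model(xb)
        loss = ourLoss(out, yb)
        loss.backward()
        optimizer.step()
        # Advance the window into `stra`; this assumes the loader does not shuffle
        sc += out.shape[0]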