# How to construct the Gram matrix of a Gaussian RBF kernel

Hi @ptrblck

I’m implementing a custom loss function that has a term involving the Gram matrix of a Gaussian RBF kernel.

Say that at each training iteration I get a mini-batch (batch size 128) of predicted probabilities over K=5 classes, so the predicted probability tensor has shape (128, 5). I now want to compute the 128 × 128 Gram matrix of the Gaussian RBF kernel exp(-||p-q||^2), where p and q are predicted probability vectors (rows of this tensor).

I don’t know if there’s a way to do this without looping through all 128 × 128 possible pairs of p and q while still preserving autograd compatibility, so that I can use the result as part of the loss function.
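For reference, here’s the naive double loop I’d like to avoid (a rough sketch; `probs` is just a stand-in for my predicted probability tensor):

```python
import torch

probs = torch.randn(128, 5).softmax(dim=1)  # stand-in for the (128, 5) predictions

# Naive approach: fill the Gram matrix by looping over all 128 x 128 pairs of rows
gram = torch.empty(128, 128)
for i in range(128):
    for j in range(128):
        gram[i, j] = torch.exp(-(probs[i] - probs[j]).pow(2).sum())
```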

Best


I’m not sure, but wouldn’t `torch.mm(mat, mat.t())` calculate the Gram matrix?

PS: as you can see, I’m not an expert in this topic, so tagging specific people might discourage others from answering in your thread.

If `mat` is the predicted probability matrix with shape (128, 5), then `torch.mm(mat, mat.t())` gives me the 128 × 128 matrix that contains all the inner products between pairs of rows `p, q` in `mat`.

But what I’m hoping to do is compute a more general function `k(p,q)` over all pairs of rows `p, q` in `mat` and store the results in a 128 × 128 matrix. `torch.mm(mat, mat.t())` can then be seen as the special case where `k(p,q)` is just the inner product `<p,q>`.
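Just to illustrate what I mean by the inner-product case, a quick check:

```python
import torch

mat = torch.randn(128, 5)
inner = torch.mm(mat, mat.t())  # (128, 128) matrix of all pairwise inner products

# Entry (i, j) is the inner product of rows i and j of mat
print(torch.allclose(inner[3, 7], torch.dot(mat[3], mat[7])))
> True
```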

OK, I see. Do you have a specific function in mind for `k`?

Yes, for example, `k(p,q)=exp(-||p-q||)` where the norm `||p-q||` is the L1 norm.

You could add a dummy dimension and use broadcasting for this use case:

```python
import torch

a = torch.randn(128, 2)
b = torch.randn(128, 2)

# Broadcasting: a.unsqueeze(1) has shape (128, 1, 2), so the subtraction
# yields all pairwise differences with shape (128, 128, 2); reduce along dim 2
res = torch.norm(a.unsqueeze(1) - b, dim=2, p=1)

# Manual double loop over all pairs, for verification
res_manual = []
for a_ in a:
    for b_ in b:
        res_manual.append(torch.norm(a_ - b_, dim=0, p=1))
res_manual = torch.stack(res_manual)
res_manual = res_manual.view(128, 128)

print((res - res_manual).abs().max())
> tensor(0.)
```
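The same broadcasting trick also covers the Gaussian RBF kernel from your first post; a minimal sketch, assuming the exp(-||p-q||^2) form you wrote there:

```python
import torch

p = torch.randn(128, 5, requires_grad=True)  # stand-in for the (128, 5) predictions

# All pairwise differences via broadcasting: (128, 1, 5) - (128, 5) -> (128, 128, 5)
sq_dists = (p.unsqueeze(1) - p).pow(2).sum(dim=2)

# 128 x 128 Gram matrix of the Gaussian RBF kernel exp(-||p - q||^2)
gram = torch.exp(-sq_dists)
```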

Thank you so much for the code!

Just to check: both methods in your code example are compatible with autograd, right? Because I want to use the resulting matrix as part of my loss function.

Yes, Autograd will be able to track these operations.
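You can verify it with a quick backward pass through the RBF sketch from above:

```python
import torch

p = torch.randn(128, 5, requires_grad=True)
gram = torch.exp(-(p.unsqueeze(1) - p).pow(2).sum(dim=2))

# Reduce to a scalar and backpropagate; gradients flow back to p
gram.sum().backward()
print(p.grad.shape)
> torch.Size([128, 5])
```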
