Hi @ptrblck

I’m implementing a custom loss function, which has a term that involves the gram matrix of a Gaussian RBF kernel.

Say, for each training iteration, I get a mini-batch (batch size 128) of predicted probabilities for K=5 classes, so the predicted probability tensor has shape (128, 5). Now I wish to compute the 128 by 128 Gram matrix of the Gaussian RBF kernel exp(-||p-q||^2), where p and q are predicted probability vectors.

I don’t know if there’s a way of doing this without looping through all 128x128 possible pairs of p and q, while still preserving autograd compatibility so I can use it as part of the loss function.

Could you please help me with some code example? Thank you in advance!

Best


I’m not sure, but wouldn’t `torch.mm(mat, mat.t())` calculate the Gram matrix?

PS: as you can clearly see I’m not an expert in this topic, so tagging certain people might demotivate others to answer in your thread.

Thank you very much for your reply!

If `mat` is the predicted probability matrix with shape (128, 5), then `torch.mm(mat, mat.t())` gives me the 128 by 128 matrix that contains all the inner products between pairs of rows `p, q` in `mat`.

But what I’m hoping to do is compute a more general function `k(p, q)` between all pairs of rows `p, q` in `mat` and store the results in a 128 by 128 matrix. So `torch.mm(mat, mat.t())` can be seen as a special case of this where the function `k(p, q)` is just the inner product `<p, q>`.

OK, I see. Do you have a specific function in mind for `k`?

Yes, for example, `k(p, q) = exp(-||p-q||)`, where the norm `||p-q||` is the L1 norm.

You could add a dummy dimension and use broadcasting for this use case:

```
import torch

a = torch.randn(128, 2)
b = torch.randn(128, 2)

# Broadcasting: a.unsqueeze(1) has shape (128, 1, 2), b has shape (128, 2),
# so the difference has shape (128, 128, 2); the L1 norm over dim=2 gives (128, 128).
res = torch.norm(a.unsqueeze(1) - b, dim=2, p=1)

# Reference implementation with an explicit double loop
res_manual = []
for a_ in a:
    for b_ in b:
        res_manual.append(torch.norm(a_ - b_, dim=0, p=1))
res_manual = torch.stack(res_manual)
res_manual = res_manual.view(128, 128)

print((res - res_manual).abs().max())
> tensor(0.)
```
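Applied to the use case above, the kernel matrix would then just be the elementwise `exp` of the negated distance matrix. A minimal sketch (the variable names and the `softmax` stand-in for predicted probabilities are my assumptions), also showing `torch.cdist` as an equivalent way to get the pairwise distances:

```python
import torch

torch.manual_seed(0)
# Stand-in for a mini-batch of predicted probabilities, shape (128, 5)
probs = torch.randn(128, 5).softmax(dim=1)
probs.requires_grad_()

# Pairwise L1 distances via broadcasting, then k(p, q) = exp(-||p - q||_1)
dists = torch.norm(probs.unsqueeze(1) - probs, dim=2, p=1)
gram = torch.exp(-dists)  # (128, 128) Gram matrix, differentiable

# torch.cdist computes the same pairwise L1 distances directly
gram_cdist = torch.exp(-torch.cdist(probs, probs, p=1))
print(torch.allclose(gram, gram_cdist, atol=1e-6))
```

`torch.cdist` avoids materializing the (128, 128, 2)-style intermediate tensor that broadcasting creates, which can matter for larger batches.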

Thank you so much for the code!

Just to check, both methods in your code example are compatible with autograd, right? Because I want to use the matrix as part of my loss function.

Yes, Autograd will be able to track these operations.
