RBF Kernel Module Implementation in PyTorch

Hello all

I am writing code to implement a learnable RBF kernel in PyTorch, where both the center and variance parameters can be learned through back-propagation with SGD.

Equation: rst = torch.exp(-((input - center)**2).sum() / variance**2), where rst is a scalar for a single input vector. For example, with one RBF kernel, an input tensor of shape [batch, 128] gives an output tensor of shape [batch, 1].
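For concreteness, the single-kernel equation above could be written as a module roughly like this. It is only a minimal sketch: the class name SingleRBF and the initial values are assumptions for illustration, and the sum is taken over the feature dimension so that each sample in the batch gets its own scalar.

```python
import torch
import torch.nn as nn


class SingleRBF(nn.Module):
    """One RBF kernel with a learnable center and variance (hypothetical name)."""

    def __init__(self, in_features):
        super().__init__()
        # Initial values are arbitrary choices for this sketch.
        self.center = nn.Parameter(torch.randn(in_features))
        self.variance = nn.Parameter(torch.ones(1))

    def forward(self, x):
        # x: [batch, in_features]; center broadcasts over the batch dimension.
        # Summing over dim=1 gives one squared distance per sample: [batch, 1].
        sq_dist = ((x - self.center) ** 2).sum(dim=1, keepdim=True)
        return torch.exp(-sq_dist / self.variance ** 2)
```

With an input of shape [32, 128] this returns a tensor of shape [32, 1], and both parameters receive gradients through the usual autograd machinery.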

My question is: if I use 256 RBF kernels, how can I efficiently go from [batch, feature_dimensions] to [batch, RBF_dimensions], i.e. from [32, 128] to [32, 256]?

I could of course create 256 RBF kernel instances and apply them one by one, but that would be inefficient.
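One possible way to avoid the loop is to store all centers in a single [RBF_dimensions, feature_dimensions] parameter and let broadcasting compute every input-to-center distance at once. The following is a minimal sketch under that assumption; the class name RBFLayer, the parameter names, and the initialization are hypothetical and not taken from any library.

```python
import torch
import torch.nn as nn


class RBFLayer(nn.Module):
    """Applies num_kernels RBF kernels in one batched operation (hypothetical name)."""

    def __init__(self, in_features, num_kernels):
        super().__init__()
        # One center per kernel: [num_kernels, in_features].
        self.centers = nn.Parameter(torch.randn(num_kernels, in_features))
        # One variance per kernel: [num_kernels]. Initial value is arbitrary.
        self.variances = nn.Parameter(torch.ones(num_kernels))

    def forward(self, x):
        # x: [batch, in_features]
        # [batch, 1, in_features] - [1, num_kernels, in_features]
        # broadcasts to [batch, num_kernels, in_features].
        diff = x.unsqueeze(1) - self.centers.unsqueeze(0)
        # Squared Euclidean distance to every center: [batch, num_kernels].
        sq_dist = diff.pow(2).sum(dim=-1)
        # variances broadcasts over the batch dimension.
        return torch.exp(-sq_dist / self.variances.pow(2))


layer = RBFLayer(in_features=128, num_kernels=256)
out = layer(torch.randn(32, 128))  # shape [32, 256]
```

The intermediate diff tensor has shape [batch, RBF_dimensions, feature_dimensions], which is small for [32, 256, 128] but grows quickly. If memory becomes a concern, computing the pairwise distances with torch.cdist(x, self.centers) and squaring the result may be worth trying instead.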