Hello all, I'd like to rewrite the function below in PyTorch so that it fully uses my computer's GPU and runs faster. This step is currently the bottleneck of my algorithm, so any help is greatly appreciated. Some notes: for every pixel of the input feature map, the function builds a Gaussian kernel whose sigma and window size are set by the Euclidean distance between that pixel (the current point of interest) and the center of the map, and then uses that kernel to compute the blurred value at that pixel.
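
Written out, the per-pixel kernel that the code builds is roughly the following. The symbols d, s, c_0, c_1 and K are my own notation, not names from the code; (c_0, c_1) is the map center (floor(H/2), floor(W/2)) and x, y run over 0 .. s-1. The 1/(2πσ²) prefactor in the code is omitted because it cancels when the kernel is divided by its sum.

```latex
d(i,j) = \sqrt{(c_0 - i)^2 + (c_1 - j)^2}, \qquad
\sigma(i,j) = 0.06\, d(i,j) + 0.4, \qquad
s(i,j) = \lfloor 3\,\sigma(i,j) \rfloor

K_{i,j}(x,y) =
\frac{\exp\!\left(-\dfrac{\left(x - \frac{s-1}{2}\right)^2 + \left(y - \frac{s-1}{2}\right)^2}{2\,\sigma(i,j)^2}\right)}
     {\sum_{x',y'} \exp\!\left(-\dfrac{\left(x' - \frac{s-1}{2}\right)^2 + \left(y' - \frac{s-1}{2}\right)^2}{2\,\sigma(i,j)^2}\right)}
```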

```
import math

import torch
import torch.nn.functional as F


def pseudo_colliculus_map(neural_map):
    # neural_map is expected to be a float tensor of shape (1, 1, H, W).
    _, _, height, width = neural_map.shape
    device = neural_map.device
    mapped = torch.zeros(height, width, device=device)
    center_i, center_j = height // 2, width // 2
    for i in range(height):
        for j in range(width):
            # Sigma grows linearly with the distance from the map center.
            euc = math.sqrt((center_i - i) ** 2 + (center_j - j) ** 2)
            sigma = 0.06 * euc + 0.4
            # Window of about 3*sigma, forced odd so the kernel has a center pixel.
            size = max(int(3 * sigma), 1) | 1
            coords = torch.arange(size, dtype=torch.float32, device=device) - (size - 1) / 2
            kernel = torch.exp(-(coords[:, None] ** 2 + coords[None, :] ** 2) / (2 * sigma ** 2))
            kernel = kernel / kernel.sum()  # normalize to unit sum
            weights = kernel.view(1, 1, size, size)
            # padding = size // 2 keeps the output the same size as the input.
            mapping = F.conv2d(neural_map, weights, stride=1, padding=size // 2)
            mapped[i, j] = mapping[0, 0, i, j]
    return mapped
```
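
One possible way to remove the two Python loops is sketched below. It is not a definitive implementation and rests on several assumptions: the input has shape (1, 1, H, W); each window is forced to an odd side length of about 3*sigma; and the padding is `size // 2` so the output stays aligned with the input (the posted code uses `ceil(size / 2)`). The function name `pseudo_colliculus_map_vectorized` is hypothetical. The idea is to build one Gaussian per pixel inside the largest window, zero out the taps outside that pixel's own window, and apply all kernels at once to the neighborhoods gathered with `F.unfold`.

```python
import torch
import torch.nn.functional as F


def pseudo_colliculus_map_vectorized(neural_map):
    """Space-variant Gaussian blur of a (1, 1, H, W) map without Python loops."""
    _, _, height, width = neural_map.shape
    device, dtype = neural_map.device, neural_map.dtype

    # Per-pixel sigma from the distance to the map center (float64 so that the
    # window sizes match what a plain Python float computation would give).
    ii = torch.arange(height, device=device, dtype=torch.float64)[:, None]
    jj = torch.arange(width, device=device, dtype=torch.float64)[None, :]
    euc = torch.sqrt((ii - height // 2) ** 2 + (jj - width // 2) ** 2)
    sigma64 = (0.06 * euc + 0.4).reshape(-1, 1)              # (H*W, 1)

    # Half-width of each pixel's odd window: size = max(floor(3*sigma), 1),
    # rounded up to odd, so half = size // 2 in both the odd and even case.
    size = (3 * sigma64).floor().clamp(min=1)
    half = (size / 2).floor()
    max_half = int(half.max().item())
    k = 2 * max_half + 1                                      # largest window side

    sigma = sigma64.to(dtype)
    half = half.to(dtype)

    # Offsets inside the largest window, flattened row-major like F.unfold.
    d = torch.arange(k, device=device, dtype=dtype) - max_half
    dy = d[:, None].expand(k, k).reshape(1, -1)               # (1, k*k)
    dx = d[None, :].expand(k, k).reshape(1, -1)               # (1, k*k)

    # One Gaussian per pixel, masked to that pixel's own window, then normalized.
    kernels = torch.exp(-(dy ** 2 + dx ** 2) / (2 * sigma ** 2))
    inside = (dy.abs() <= half) & (dx.abs() <= half)
    kernels = kernels * inside
    kernels = kernels / kernels.sum(dim=1, keepdim=True)      # (H*W, k*k)

    # k x k neighborhood of every pixel, zero padded at the borders like conv2d.
    patches = F.unfold(neural_map, kernel_size=k, padding=max_half)  # (1, k*k, H*W)
    patches = patches[0].transpose(0, 1)                      # (H*W, k*k)

    # Weighted sum of each neighborhood with its own kernel.
    mapped = (patches * kernels).sum(dim=1)
    return mapped.view(height, width)
```

This runs on whatever device `neural_map` lives on, so calling it with `neural_map.cuda()` keeps everything on the GPU. The trade-off is memory: both `patches` and `kernels` hold H*W times k*k values, so for large maps it may be necessary to process the map in chunks of rows rather than all at once.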