Is there any way to make this faster by using a library function?

Please help me with this…

It uses three tensors and writes the result into a fourth:

input = tensor with size (minibatch, 3, Height+51, Width+51)

gradoutput = tensor with size (minibatch, 3, Height, Width)

kernel = tensor with size (minibatch, 51, Height, Width)

grad = new tensor with size (minibatch, 3, Height, Width)

```
for n in range(minibatch):
    for i in range(Height):
        for j in range(Width):
            for k in range(3):
                gradcal = [0, 0, 0]
                for c in range(3):
                    for l in range(51):
                        gradcal[c] = gradcal[c] + kernel[n, l, i, j] * input[n, c, i + l, j + 25] * gradoutput[n, c, i, j]
                grad[n, k, i, j] = gradcal[0] + gradcal[1] + gradcal[2]
```
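
For reference, here is a rough sketch of how the loops above might be vectorized. It is only an illustration under some assumptions: the tensors are treated as NumPy arrays (the same slicing should carry over to other tensor libraries), and the names `grad_vectorized` and `inp` are made up for this example (`inp` stands in for `input` to avoid shadowing the Python built-in). It relies on two things visible in the loop: the column offset `j+25` is fixed, and the value written to `grad[n,k,i,j]` does not depend on `k`, so it can be computed once and copied to every channel.

```python
import numpy as np

def grad_vectorized(inp, gradoutput, kernel):
    """Sketch of a vectorized equivalent of the nested loops.

    inp:        (minibatch, 3, Height+51, Width+51)
    gradoutput: (minibatch, 3, Height, Width)
    kernel:     (minibatch, 51, Height, Width)
    returns:    (minibatch, 3, Height, Width)
    """
    N, C, H, W = gradoutput.shape
    taps = kernel.shape[1]  # 51 in the question

    # The loop always reads column j+25, so slice those columns once.
    cols = inp[:, :, :, 25:25 + W]  # (N, C, Height+51, W)

    total = np.zeros((N, H, W))
    for l in range(taps):
        # Rows i+l for every i at once: (N, C, H, W)
        window = cols[:, :, l:l + H, :]
        # Sum over channel c of input * gradoutput, then weight by kernel tap l.
        total += kernel[:, l] * (window * gradoutput).sum(axis=1)

    # Every output channel k gets the same value, as in the original loop.
    return np.repeat(total[:, None, :, :], C, axis=1)
```

This still loops over the 51 taps in Python, but each iteration works on whole arrays, so the loops over `n`, `i`, `j`, `k` and `c` disappear. It should be checked against the original loops on a small input before being relied on.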