# Optimizing loops

Hello,
I wonder if someone can help me find a better way to do the following (I'm not a Python expert).
Say that in the forward method of my model I get a batch of images of size (N, Cin, H, W), and I then apply a transform that produces a tensor of shape (N, Cin, K, H, W); let us call it ‘x’,
with K = 1 + J*L + L**2 * J*(J-1)/2 for some values of J and L.
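As a quick sanity check on the channel count, the formula can be evaluated directly. This is only a sketch: the function name `num_channels` is made up for illustration, and integer division is assumed to be safe because J*(J-1) is always even.

```python
def num_channels(J, L):
    # 1 level-0 channel + J*L level-1 channels + L**2 * J*(J-1)/2 level-2 channels
    return 1 + J * L + L**2 * (J * (J - 1) // 2)


print(num_channels(2, 4))  # 25 for J=2, L=4
```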

Now I compute the mean values over the last two dims (H, W):

```python
meanCoeffs = torch.mean(x, dim=(3, 4))  # shape (N, Cin, K)
```

Then I proceed as follows:

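```python
xnew = torch.zeros_like(x)
# level 0: copied unchanged
xnew[:, :, 0, :, :] = x[:, :, 0, :, :]
# level 1: subtract each channel's own mean
for j1 in range(0, J):
    for t1 in range(0, L):
        i = adict[(j1, t1)]  # assumed lookup; `i` was not defined in the snippet as posted
        xnew[:, :, i, :, :] = x[:, :, i, :, :] - meanCoeffs[:, :, i, None, None]
# level 2: subtract the mean of the level-1 channel (j1, t1)
for j1 in range(0, J - 1):
    for t1 in range(0, L):
        i1 = adict[(j1, t1)]  # assumed lookup
        for j2 in range(j1 + 1, J):
            for t2 in range(0, L):
                i12 = adict[(j1, t1, j2, t2)]  # assumed lookup
                xnew[:, :, i12, :, :] = x[:, :, i12, :, :] - meanCoeffs[:, :, i1, None, None]
```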

where, in the case J=2, L=4, adict is the following dictionary:

```python
adict = {(-1,): 0,
         (0, 0): 1,
         (0, 1): 2,
         (0, 2): 3,
         (0, 3): 4,
         (1, 0): 5,
         (1, 1): 6,
         (1, 2): 7,
         (1, 3): 8,
         (0, 0, 1, 0): 9,
         (0, 0, 1, 1): 10,
         (0, 0, 1, 2): 11,
         (0, 0, 1, 3): 12,
         (0, 1, 1, 0): 13,
         (0, 1, 1, 1): 14,
         (0, 1, 1, 2): 15,
         (0, 1, 1, 3): 16,
         (0, 2, 1, 0): 17,
         (0, 2, 1, 1): 18,
         (0, 2, 1, 2): 19,
         (0, 2, 1, 3): 20,
         (0, 3, 1, 0): 21,
         (0, 3, 1, 1): 22,
         (0, 3, 1, 2): 23,
         (0, 3, 1, 3): 24}
```
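For other values of J and L, a dictionary with the same layout could presumably be generated rather than typed by hand. The sketch below assumes the ordering visible in the J=2, L=4 example (level 0 first, then all (j1, t1) pairs, then all (j1, t1, j2, t2) tuples with j2 > j1); `build_index_map` is a hypothetical helper name, not something from the original code.

```python
def build_index_map(J, L):
    """Hypothetical helper: rebuild adict in the order shown in the J=2, L=4 example."""
    adict = {(-1,): 0}  # level 0
    k = 1
    # level 1: one entry per (j1, t1)
    for j1 in range(J):
        for t1 in range(L):
            adict[(j1, t1)] = k
            k += 1
    # level 2: one entry per (j1, t1, j2, t2) with j2 > j1
    for j1 in range(J - 1):
        for t1 in range(L):
            for j2 in range(j1 + 1, J):
                for t2 in range(L):
                    adict[(j1, t1, j2, t2)] = k
                    k += 1
    return adict


adict = build_index_map(2, 4)
print(len(adict))  # 25
```

The number of entries equals K, so this also gives a cheap consistency check against the third dimension of the tensor.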

My question is: is there a better way to compute xnew (e.g. vectorization), especially on a CUDA device?
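One possible direction, as a sketch only: since every channel k just has one mean subtracted from it, the loops can be replaced by a single index tensor `src` (where `src[k]` is the channel whose mean is subtracted from channel k) followed by one broadcasted subtraction. This assumes the channel ordering of adict above, and that a level-2 channel (j1, t1, j2, t2) uses the mean of its level-1 channel (j1, t1). The names `source_indices` and `center_channels` are made up for illustration.

```python
import torch


def source_indices(J, L):
    """src[k] = index of the channel whose mean is subtracted from channel k."""
    src = [0]                              # level 0: placeholder, masked out later
    src += list(range(1, J * L + 1))       # level 1: each channel uses its own mean
    for j1 in range(J - 1):                # level 2: use the mean of channel (j1, t1)
        for t1 in range(L):
            i1 = 1 + j1 * L + t1           # position of (j1, t1) in the assumed layout
            src += [i1] * ((J - 1 - j1) * L)  # one copy per (j2, t2) with j2 > j1
    return torch.tensor(src, dtype=torch.long)


def center_channels(x, J, L):
    """Loop-free version of the centering; x has shape (N, Cin, K, H, W)."""
    K = x.shape[2]
    src = source_indices(J, L).to(x.device)
    assert src.numel() == K, "K does not match J and L"
    mean = x.mean(dim=(3, 4), keepdim=True)      # (N, Cin, K, 1, 1)
    shift = mean[:, :, src]                      # gather the mean each channel needs
    keep = torch.ones(K, dtype=x.dtype, device=x.device)
    keep[0] = 0                                  # level 0 is left unchanged
    return x - shift * keep.view(1, 1, K, 1, 1)  # broadcast over H and W


x = torch.randn(2, 3, 25, 8, 8)  # J=2, L=4 gives K=25
xnew = center_channels(x, J=2, L=4)
print(xnew.shape)  # torch.Size([2, 3, 25, 8, 8])
```

The index tensor depends only on J and L, so it could be built once (for instance in `__init__`) and reused on every forward call instead of being rebuilt each time.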

Thanks