How to speed up a for loop in PyTorch

As is well known, for loops in Python are slow. In my model I use a matrix as an index: I group the pixels of one image into different clusters and represent each cluster by its index, like (1, 2, 3, 4).
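A minimal sketch of the kind of per-cluster loop this describes, next to a loop-free variant. It is only an illustration, not the model's actual code: the per-cluster mean, the tensor shapes, the 0-based cluster ids, and the names `cluster_mean_loop` and `cluster_mean_vectorized` are all assumptions.

```python
import torch

def cluster_mean_loop(features, labels, num_clusters):
    # features: (N, C) float tensor, one row per pixel (assumed layout)
    # labels:   (N,) int64 tensor of cluster ids in [0, num_clusters)
    out = torch.zeros(num_clusters, features.shape[1], dtype=features.dtype)
    for k in range(num_clusters):      # Python-level loop, one pass per cluster
        mask = labels == k
        if mask.any():                 # leave empty clusters at zero
            out[k] = features[mask].mean(dim=0)
    return out

def cluster_mean_vectorized(features, labels, num_clusters):
    # Sum the rows of each cluster in a single call instead of looping.
    sums = torch.zeros(num_clusters, features.shape[1], dtype=features.dtype)
    sums.index_add_(0, labels, features)
    # Count pixels per cluster; clamp so empty clusters do not divide by zero.
    counts = torch.bincount(labels, minlength=num_clusters).clamp(min=1)
    return sums / counts.unsqueeze(1).to(features.dtype)

torch.manual_seed(0)
num_clusters = 4
image = torch.rand(3, 8, 8)                       # hypothetical (C, H, W) image
labels = torch.randint(0, num_clusters, (8, 8))   # hypothetical cluster index matrix

features = image.permute(1, 2, 0).reshape(-1, 3)  # (H*W, C)
flat_labels = labels.reshape(-1)                  # (H*W,)

loop_result = cluster_mean_loop(features, flat_labels, num_clusters)
vec_result = cluster_mean_vectorized(features, flat_labels, num_clusters)
```

Both functions should agree up to floating-point error. Whether the loop-free form helps in practice depends on what is actually computed per cluster, so this is one option to compare against multiprocessing rather than a guaranteed fix.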

It looks like this, and the only idea I have so far is to use multiple processes to speed it up. Does anyone have a better idea? Thanks a lot!