Hi,
I am trying to parallelize the following function:
```python
import torch

def assemble(u):
    # ndof, tne, elems, nodes, Stress_Strain and KEL are defined elsewhere
    Fint = torch.zeros((ndof, 1))
    Fext = torch.zeros((ndof, 1))
    for iel in range(tne):
        elnodes = elems[iel, :].long()   # the 8 node indices of this element
        xcel = nodes[elnodes, 0]         # element nodal coordinates
        ycel = nodes[elnodes, 1]
        zcel = nodes[elnodes, 2]
        # dofs for each node n are [3n, 3n+1, 3n+2]; this is equivalent to
        # listing all 24 entries by hand
        dof = (3 * elnodes.unsqueeze(1) + torch.arange(3)).flatten()
        u_el = u[dof]                    # element displacement vector
        strain, stress = Stress_Strain(xcel, ycel, zcel, u_el)
        Fint_e, Fext_e = KEL(xcel, ycel, zcel, strain, stress)
        # scatter the element forces into the global vectors
        # (dof has no duplicates within one element, so += is safe here)
        Fint[dof, 0] += Fint_e.squeeze(1)
        Fext[dof, 0] += Fext_e.squeeze(1)
    return Fint, Fext
```
In particular, I am trying to parallelize the for loop over the elements. First of all, is that possible in PyTorch? If so, any advice or help would be much appreciated.
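For reference, here is a minimal sketch of the batched version I have in mind. It assumes hypothetical `Stress_Strain_batched` and `KEL_batched` functions that accept a leading `(tne, ...)` batch dimension, which I have not written yet:

```python
import torch

def assemble_batched(u):
    # Gather the data for all elements at once instead of looping
    elnodes = elems.long()                       # (tne, 8) node indices
    xcel = nodes[elnodes, 0]                     # (tne, 8) coordinates
    ycel = nodes[elnodes, 1]
    zcel = nodes[elnodes, 2]

    # (tne, 24) dof indices: [3n, 3n+1, 3n+2] for each of the 8 nodes
    dof = (3 * elnodes.unsqueeze(2) + torch.arange(3)).reshape(tne, 24)
    u_el = u[dof.flatten()].reshape(tne, 24, 1)  # element displacements

    # hypothetical batched versions of Stress_Strain and KEL
    strain, stress = Stress_Strain_batched(xcel, ycel, zcel, u_el)
    Fint_e, Fext_e = KEL_batched(xcel, ycel, zcel, strain, stress)  # (tne, 24)

    # index_add_ accumulates correctly even when dof indices repeat
    # across elements, unlike a plain advanced-indexing +=
    Fint = torch.zeros((ndof, 1))
    Fext = torch.zeros((ndof, 1))
    Fint.index_add_(0, dof.flatten(), Fint_e.reshape(-1, 1))
    Fext.index_add_(0, dof.flatten(), Fext_e.reshape(-1, 1))
    return Fint, Fext
```

Does this look like the right direction, or is there a better-supported way (e.g. `torch.vmap`) to batch the per-element computation?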