I have the following PyTorch code that defines a new operation on two tensors, points and primitives:
```python
import torch

def operations(points, primitives):
    """
    points shape:     (batch_size, number_of_points, 3)
    primitives shape: (batch_size, number_of_primitives, 7)
    returns:          (batch_size, number_of_points, number_of_primitives)
    """
    batch_size, number_of_points, _ = points.shape
    number_of_primitives = primitives.shape[1]
    gradient = torch.zeros(batch_size, number_of_points, number_of_primitives)
    for i in range(batch_size):
        temp_points = points[i, :, :]
        temp_primitives = primitives[i, :, :]
        temp = torch.zeros(number_of_points, number_of_primitives)
        for k in range(number_of_points):
            for j in range(number_of_primitives):
                # elementwise product with the first 3 primitive entries,
                # shifted by the next 3, then the L2 norm of the result
                temp[k, j] = torch.norm(
                    temp_points[k, :] * temp_primitives[j, :3]
                    + temp_primitives[j, 3:6]
                )
        gradient[i, :, :] = temp
    return gradient
```
Is there any way to parallelize this code to speed it up? I implemented this serial version myself for a deep-learning project. Every time I run it, the PyTorch data loader throws

RuntimeError: DataLoader worker (pid 255034) is killed by signal: Killed.

even though I set the number of workers to zero. Thanks!
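For reference, here is one way the triple loop could be vectorized with broadcasting instead of Python-level loops. This is a sketch assuming the loop version above captures the intended semantics (elementwise product with the first three primitive entries, shift by the next three, then the L2 norm); the function name `operations_vectorized` is mine:

```python
import torch

def operations_vectorized(points, primitives):
    """
    Broadcast-based equivalent of the nested loops.
    points:     (batch_size, number_of_points, 3)
    primitives: (batch_size, number_of_primitives, 7)
    returns:    (batch_size, number_of_points, number_of_primitives)
    """
    # (B, P, 1, 3) * (B, 1, M, 3) -> (B, P, M, 3) via broadcasting
    prod = points.unsqueeze(2) * primitives[:, :, :3].unsqueeze(1)
    # add the offset part of each primitive, still (B, P, M, 3)
    shifted = prod + primitives[:, :, 3:6].unsqueeze(1)
    # L2 norm over the last (coordinate) dimension -> (B, P, M)
    return shifted.norm(dim=-1)
```

This runs as a handful of batched tensor ops on whatever device the inputs live on, so it also works on the GPU; it does allocate an intermediate of shape (B, P, M, 3), so if P*M is very large you may need to chunk over one of the dimensions.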