I have some NumPy code that I want to convert to PyTorch. It implements non-maximum suppression (NMS) for 3D boxes, but I do not have enough expertise to write a CUDA kernel for it. Could someone please help me with the conversion? Specifically, I am hoping to find a way to remove the loop, since I suspect the loop is what makes the operation slow.
import numpy as np

def nms(dets, scores, thresh):
    '''
    dets is a numpy array : num_dets, 6
    scores is a numpy array : num_dets,
    '''
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    z1 = dets[:, 2]
    x2 = dets[:, 3]
    y2 = dets[:, 4]
    z2 = dets[:, 5]

    volume = (x2 - x1 + 1) * (y2 - y1 + 1) * (z2 - z1 + 1)
    order = scores.argsort()[::-1]  # box indices sorted by score, highest first

    keep = []
    while order.size > 0:
        i = order[0]  # pick the highest-scoring remaining box
        keep.append(i)

        # corners of the intersection between box i and every remaining box
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        zz1 = np.maximum(z1[i], z1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        zz2 = np.minimum(z2[i], z2[order[1:]])

        w = np.maximum(0.0, xx2 - xx1 + 1)  # overlap width (0 if disjoint)
        h = np.maximum(0.0, yy2 - yy1 + 1)  # overlap height
        l = np.maximum(0.0, zz2 - zz1 + 1)  # overlap length

        inter = w * h * l
        ovr = inter / (volume[i] + volume[order[1:]] - inter)  # IoU with box i

        # keep only boxes whose IoU with box i is at or below the threshold;
        # +1 because ovr was computed against order[1:]
        inds = np.where(ovr <= thresh)[0]
        order = order[inds + 1]

    return keep
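For reference, here is a rough sketch of one possible PyTorch port, not a definitive answer. Greedy NMS is sequential by nature (whether a box survives depends on which higher-scoring boxes were kept), so this sketch does not remove the loop entirely. Instead it computes the full pairwise IoU matrix in one batched operation and leaves only a cheap loop over a boolean mask. The name `nms_3d` is made up for illustration, and the approach assumes an N x N matrix fits in memory, which may not hold for very large numbers of detections.

```python
import torch

def nms_3d(dets: torch.Tensor, scores: torch.Tensor, thresh: float) -> torch.Tensor:
    """Greedy 3D NMS (illustrative sketch).

    dets:   (N, 6) tensor laid out as x1, y1, z1, x2, y2, z2
    scores: (N,) tensor
    Returns indices into dets of the kept boxes, highest score first.
    """
    if dets.shape[0] == 0:
        return torch.empty(0, dtype=torch.long, device=dets.device)

    order = scores.argsort(descending=True)
    boxes = dets[order]                      # boxes sorted by score
    lo, hi = boxes[:, :3], boxes[:, 3:]      # min and max corners
    volume = (hi - lo + 1).prod(dim=1)       # (N,), same +1 convention as the NumPy code

    # Pairwise intersection of every box with every other box, via broadcasting.
    inter_lo = torch.maximum(lo[:, None, :], lo[None, :, :])    # (N, N, 3)
    inter_hi = torch.minimum(hi[:, None, :], hi[None, :, :])    # (N, N, 3)
    inter = (inter_hi - inter_lo + 1).clamp(min=0).prod(dim=2)  # (N, N)
    iou = inter / (volume[:, None] + volume[None, :] - inter)

    # Copy the boolean matrix to the CPU once, so the loop below does not
    # trigger a device synchronisation on every iteration.
    suppress = (iou > thresh).cpu()

    n = boxes.shape[0]
    alive = torch.ones(n, dtype=torch.bool)
    keep = []
    for i in range(n):
        if alive[i]:
            keep.append(i)
            alive &= ~suppress[i]            # drop everything overlapping box i too much

    keep_idx = torch.tensor(keep, dtype=torch.long, device=order.device)
    return order[keep_idx]
```

The remaining loop only touches one row of a precomputed boolean matrix per iteration, so all of the arithmetic happens in the batched part. Whether this is actually faster than the NumPy version depends on N and on the device, so it is worth timing on real data.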