I have a simple feature extraction function. It first transfers results computed on the GPU to the CPU, then combines them with one additional ID to form a set of about 256k IDs, and finally uses those IDs to extract the corresponding rows from the "feat" data (self.feats in the code). In my tests, the run time of this function is highly unstable: the printed total varies from 0.005 s to 0.026 s. I would like to understand what causes this variation and how I can optimize the function so that its run time is stable.
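import time

def featMerge(self, nodeids):
    # Step 1: copy the GPU results to the CPU.
    toCPUTime = time.time()
    nodeids = nodeids.to(device='cpu')
    print("to CPU time {}s".format(time.time() - toCPUTime))
    # Step 2: write the IDs into temp_merge_id (defined outside this
    # function), leaving slot 0 for the additional ID.
    catTime = time.time()
    temp_merge_id[1:] = nodeids
    print("cat time {}s".format(time.time() - catTime))
    # Step 3: gather the feature rows for all merged IDs.
    featTime = time.time()
    test = self.feats[temp_merge_id]
    print("feat merge {}s".format(time.time() - featTime))
    print("all merge {}s".format(time.time() - toCPUTime))
    return test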
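
To put numbers on the variation instead of reading single printouts, here is a minimal sketch of a timing helper that repeats a call and reports the minimum, median, and maximum. The name `time_repeated` and its parameters `repeats`, `warmup`, and `sync` are hypothetical and not part of my original code. One caveat I am aware of: CUDA kernels are launched asynchronously, so `nodeids.to(device='cpu')` has to wait for any GPU work still in flight, and the "to CPU time" printed above may include that wait. The optional `sync` hook is there so that something like `torch.cuda.synchronize` can be called before each clock read.

```python
import statistics
import time


def time_repeated(fn, repeats=50, warmup=5, sync=None):
    """Call fn() repeatedly and return (min, median, max) wall time in seconds.

    sync is an optional zero-argument callable run before each clock read.
    For CUDA work, torch.cuda.synchronize could be passed here (an assumption
    about the setup) so that queued GPU kernels are not charged to whichever
    line happens to wait for them.
    """
    def clock():
        if sync is not None:
            sync()
        return time.perf_counter()

    # Warm-up runs are executed but not recorded.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(repeats):
        start = clock()
        fn()
        samples.append(clock() - start)
    return min(samples), statistics.median(samples), max(samples)


if __name__ == "__main__":
    # Stand-in workload; in practice this would be something like
    # lambda: obj.featMerge(nodeids), where obj is a hypothetical instance.
    lo, med, hi = time_repeated(lambda: sum(range(100_000)))
    print("min {:.6f}s  median {:.6f}s  max {:.6f}s".format(lo, med, hi))
```

A large gap between the median and the maximum would point to occasional outliers, while a wide spread overall would suggest the per-call cost itself is changing from run to run.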