I have a network running inside a ROS node. It produces a large tensor, which I have to copy to the CPU and send to another node.

    # Called once
    pinned_tensor = torch.zeros((1,128,512,512)).pin_memory()

    # Called in loop
    pinned_tensor[:] = self.network(input_tensor).cpu()

This is still slow even though I use pinned memory. Am I using it incorrectly?
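
One thing worth checking: `.cpu()` allocates a new, ordinary (pageable) CPU tensor and copies the GPU result into it, and the slice assignment then does a second CPU-to-CPU copy into the pinned buffer. In that pattern the pinned buffer is never the target of the device-to-host transfer, so it likely gives no benefit. Below is a minimal sketch of copying straight into the pinned buffer with `copy_`. The helper names `make_host_buffer` and `copy_to_host` are hypothetical, and the sketch falls back to a plain CPU buffer when CUDA is not available. Note also that CUDA kernels run asynchronously, so a timer around the copy may include the time spent waiting for the network's forward pass to finish.

```python
import torch


def make_host_buffer(shape, dtype=torch.float32):
    """Allocate the destination buffer once, outside the loop.

    The buffer is pinned only when CUDA is available; otherwise a
    regular CPU tensor is returned so the sketch still runs.
    """
    buf = torch.zeros(shape, dtype=dtype)
    return buf.pin_memory() if torch.cuda.is_available() else buf


def copy_to_host(src, host_buffer):
    """Copy `src` directly into `host_buffer` without an intermediate tensor."""
    # For a CUDA source and a pinned destination, non_blocking=True lets
    # the device-to-host copy be issued asynchronously.
    host_buffer.copy_(src, non_blocking=True)
    if src.is_cuda:
        # Wait for the copy to finish before the CPU reads the buffer.
        torch.cuda.synchronize()
    return host_buffer
```

In the loop from the question this would look like `copy_to_host(self.network(input_tensor), pinned_tensor)`, assuming the network output has the same shape and dtype as the buffer. A `(1, 128, 512, 512)` float32 tensor is 128 MiB, so the transfer itself still takes a noticeable amount of time even when pinned memory is used correctly.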