Hi all,
I've recently run into a problem: assigning values to a tensor through a boolean mask index runs at very different speeds on different PyTorch versions (PyTorch 0.2 and PyTorch 0.4).
Here is the test code:
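import time

import numpy as np
import torch


def main():
    # The inputs were generated once with the lines below and saved to disk,
    # so that both PyTorch versions are timed on identical data.
    # tensor_a = torch.rand(20, 100, 100).cuda()
    # tensor_b = torch.rand(20, 100, 100).cuda()
    # np.save('tensor_a.npy', tensor_a.cpu().numpy())
    # np.save('tensor_b.npy', tensor_b.cpu().numpy())
    tensor_a = torch.from_numpy(np.load('tensor_a.npy', encoding="latin1")).cuda()
    tensor_b = torch.from_numpy(np.load('tensor_b.npy', encoding="latin1")).cuda()

    # Wait for pending GPU work to finish before starting the timer.
    torch.cuda.synchronize()
    start = time.time()
    for i in range(100):
        # Zero every element of tensor_b where tensor_a <= 0.5.
        tensor_b[tensor_a <= 0.5] = 0
    # Wait for all queued kernels to finish before stopping the timer.
    torch.cuda.synchronize()
    print('run time is:', time.time() - start)


if __name__ == '__main__':
    main()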
These are the timings:
PyTorch 0.2: total time is 0.0015 s
PyTorch 0.4: total time is 0.17 s
What causes the slowdown on PyTorch 0.4, and how can I fix it?
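For comparison, here is a minimal sketch of the same operation written with `masked_fill_` instead of mask-index assignment. It only shows that the two forms give the same result; whether it is actually faster on PyTorch 0.4 is an assumption that would have to be measured. The function names `zero_by_index` and `zero_by_masked_fill` are made up for illustration, the inputs are random rather than the saved `.npy` files, and the `device=` argument assumes PyTorch 0.4 or newer.

```python
import torch


def zero_by_index(values, reference, threshold=0.5):
    # Formulation from the test code: boolean-mask index assignment.
    out = values.clone()
    out[reference <= threshold] = 0
    return out


def zero_by_masked_fill(values, reference, threshold=0.5):
    # Alternative formulation: in-place masked_fill_ with the same mask.
    out = values.clone()
    out.masked_fill_(reference <= threshold, 0)
    return out


# Fall back to the CPU so the sketch also runs without a GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
tensor_a = torch.rand(20, 100, 100, device=device)
tensor_b = torch.rand(20, 100, 100, device=device)

same = torch.equal(
    zero_by_index(tensor_b, tensor_a),
    zero_by_masked_fill(tensor_b, tensor_a),
)
print("results match:", same)
```

To compare speeds, the loop in the test code could call `tensor_b.masked_fill_(tensor_a <= 0.5, 0)` in place of the indexed assignment, keeping the same `torch.cuda.synchronize()` calls around the timer.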
Thanks