Different speed for masked tensor assignment on different PyTorch versions (0.2 and 0.4)

Hi all,

I recently ran into a problem: assigning values to a tensor through a boolean index mask runs at very different speeds on different PyTorch versions (PyTorch 0.2 and PyTorch 0.4).

This is the test code:

import torch
import time
import numpy as np

def main():
    # The tensors were generated once with the lines below and saved to disk,
    # so that both PyTorch versions load exactly the same data:
    #tensor_a = torch.rand(20, 100, 100).cuda()
    #tensor_b = torch.rand(20, 100, 100).cuda()
    #np.save('tensor_a.npy', tensor_a.cpu().numpy())
    #np.save('tensor_b.npy', tensor_b.cpu().numpy())
    tensor_a = torch.from_numpy(np.load('tensor_a.npy', encoding="latin1")).cuda()
    tensor_b = torch.from_numpy(np.load('tensor_b.npy', encoding="latin1")).cuda()
    torch.cuda.synchronize()
    end = time.time()
    for i in range(100):
        # masked assignment: zero out all positions where tensor_a <= 0.5
        tensor_b[tensor_a <= 0.5] = 0
    torch.cuda.synchronize()
    print('run time is:', time.time() - end)

if __name__ == '__main__':
    main()

These are the timings:
PyTorch 0.2: total time is 0.0015 s
PyTorch 0.4: total time is 0.17 s

What causes this slowdown on PyTorch 0.4, and how can I work around it?
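For comparison, I could also write the same update with masked_fill_ instead of boolean-mask indexing. I don't know whether it takes a different code path on 0.4, so this is only a sketch to measure against (it assumes the same saved tensor files as above):

import torch
import time
import numpy as np

def main():
    tensor_a = torch.from_numpy(np.load('tensor_a.npy', encoding="latin1")).cuda()
    tensor_b = torch.from_numpy(np.load('tensor_b.npy', encoding="latin1")).cuda()
    torch.cuda.synchronize()
    end = time.time()
    for i in range(100):
        # same effect as tensor_b[tensor_a <= 0.5] = 0, written via masked_fill_
        tensor_b.masked_fill_(tensor_a <= 0.5, 0)
    torch.cuda.synchronize()
    print('masked_fill_ run time is:', time.time() - end)

if __name__ == '__main__':
    main()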

Thanks

I assume you are using the shapes [20, 100, 100] for your tests?
Could you also time the code using the latest PyTorch version (1.1.0)?

Also, which GPU, CUDA and cuDNN version are you using?
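If it helps, the relevant versions can be printed directly from Python (these are the standard torch attributes in recent releases; I'm not certain all of them exist back on 0.2):

import torch

# Report the environment so timings can be compared across installations
print('PyTorch:', torch.__version__)
print('CUDA (as built):', torch.version.cuda)
print('cuDNN:', torch.backends.cudnn.version())
print('GPU:', torch.cuda.get_device_name(0))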

The GPU is a GeForce GTX 1080 Ti, CUDA is 9.0, and cuDNN is 7.0. I haven't tried PyTorch 1.1 yet, but both PyTorch 0.2 and 0.3 are fine.