I am building a cost volume for stereo depth estimation and timed the two methods below.
Why is there such a large difference in time between them?
Method 1:
start_full_time = time.time()
B, C, H, W = refimg_fea.shape
cost = refimg_fea.new_zeros([B, 2*C, self.maxdisp//4, H, W], requires_grad=False)
print('time = %.4f [s]' %((time.time() - start_full_time)))
print(cost.device)
Output:
time = 0.0000 [s]
cuda:0
Method 2:
start_full_time = time.time()
B, C, H, W = refimg_fea.shape
cost = torch.FloatTensor(B, C*2, self.maxdisp//4, H, W).cuda()
print('time = %.4f [s]' %((time.time() - start_full_time)))
print(cost.device)
Output:
time = 0.1875 [s]
cuda:0
My guess is that the first snippet creates the tensor directly on the GPU, while the second creates it on the CPU and then moves it to the GPU, so the copy to CUDA accounts for the extra time.
If that is right, is there a way to create a tensor directly on CUDA?
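For reference, here is a minimal sketch of one way to test that guess. It assumes that passing `device=` to `torch.zeros` allocates on the target device without a CPU copy. The sizes and the `timed` helper are hypothetical stand-ins, not part of the original model code. The script falls back to the CPU when no GPU is present. Because CUDA calls are asynchronous, `time.time()` alone can be misleading, so the helper calls `torch.cuda.synchronize()` around the timed region. The first CUDA call in a process may also include one-time initialization cost.

```python
import time
import torch

def timed(fn, device):
    """Run fn once and return (result, elapsed seconds), syncing the GPU if one is used."""
    if device.type == "cuda":
        torch.cuda.synchronize(device)
    start = time.perf_counter()
    out = fn()
    if device.type == "cuda":
        torch.cuda.synchronize(device)
    return out, time.perf_counter() - start

# Use the GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical sizes standing in for B, C, maxdisp // 4, H, W.
B, C, D, H, W = 1, 32, 48, 32, 64
shape = (B, 2 * C, D, H, W)

# Allocate on the CPU first, then copy to the device (similar to FloatTensor(...).cuda()).
cost_copy, t_copy = timed(lambda: torch.empty(shape, dtype=torch.float32).to(device), device)

# Allocate and zero-fill directly on the target device.
cost_direct, t_direct = timed(
    lambda: torch.zeros(shape, dtype=torch.float32, device=device), device
)

print('cpu then copy: %.4f [s]' % t_copy)
print('direct:        %.4f [s]' % t_direct)
print(cost_direct.device)
```

Note that `torch.FloatTensor(...)` and `torch.empty(...)` return uninitialized memory, while `new_zeros` and `torch.zeros` fill with zeros, so the two original methods are not equivalent in content either.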