I am running two different segmentation models. The inputs have the same size and data type, but the speed of the following code differs considerably between the two models.
What caused the difference?
Model 1 code:
# images torch.float32 cpu False torch.Size([2, 3, 480, 480])
# targets torch.int64 cpu False torch.Size([2, 480, 480])
end2 = time.time()
images = images.to(self.device)
targets = targets.to(self.device)
data_to_device = time.time() - end2
The data_to_device value for this model is 0.0568695068359375.
Model 2 code:
print(images.dtype, images.device, images.requires_grad, images.size())
print(target_mask.dtype, target_mask.device,
target_mask.requires_grad, target_mask.size())
end2 = time.time()
images = images.to(device)
target_mask = target_mask.to(device)
data_to_device = time.time() - end2
print('=============', data_to_device)
The output of model 2 over the first few iterations is:
2019-07-14 18:23:21,835 agfcoo.trainer INFO: Start training
torch.float32 cpu False torch.Size([2, 3, 480, 480])
torch.int64 cpu False torch.Size([2, 480, 480])
============= 0.0012090206146240234
torch.float32 cpu False torch.Size([2, 3, 480, 480])
torch.int64 cpu False torch.Size([2, 480, 480])
============= 0.5653738975524902
torch.float32 cpu False torch.Size([2, 3, 480, 480])
torch.int64 cpu False torch.Size([2, 480, 480])
============= 0.28435635566711426
torch.float32 cpu False torch.Size([2, 3, 480, 480])
torch.int64 cpu False torch.Size([2, 480, 480])
============= 0.26169490814208984
torch.float32 cpu False torch.Size([2, 3, 480, 480])
torch.int64 cpu False torch.Size([2, 480, 480])
============= 0.2824215888977051
torch.float32 cpu False torch.Size([2, 3, 480, 480])
torch.int64 cpu False torch.Size([2, 480, 480])
============= 0.2888615131378174
This computer has one RTX 2060. The PyTorch version is 1.1 and CUDA is 10.0.
What caused the difference?
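One thing I wondered about is whether CUDA's asynchronous execution could distort these numbers: `time.time()` around `.to(device)` can end up measuring GPU work queued by earlier code, not the copy itself. A minimal sketch of how I think a synchronized measurement would look (`timed_to_device` is a hypothetical helper, not part of my training code):

```python
import time
import torch

def timed_to_device(tensor, device):
    """Copy `tensor` to `device` and return (copied_tensor, elapsed_seconds)."""
    if device.type == "cuda":
        torch.cuda.synchronize()  # drain any GPU work still queued from before
    start = time.time()
    out = tensor.to(device)
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait until the copy itself has finished
    return out, time.time() - start

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
images = torch.randn(2, 3, 480, 480)  # same shape as in my logs
images, elapsed = timed_to_device(images, device)
print(device, elapsed)
```

With the synchronize calls, the elapsed time should reflect only the host-to-device transfer; without them, the first `.to(device)` after a batch of asynchronous kernels can appear to take much longer than it really does.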