Edit: I have created an issue on GitHub.
In a previous thread I was confused because I thought my GPU was bad. Turns out it’s something else.
On a GTX 1080 under Ubuntu 18.04, ResNet50 takes 12 ms on average for a forward pass. I used this script:
import numpy as np
import torch
from timeit import default_timer as timer
from torchvision.models import resnet50

def main():
    # Define model and input data
    resnet = resnet50().cuda()
    x = torch.from_numpy(np.random.rand(1, 3, 224, 224).astype(np.float32)).cuda()  # Entire network
    # x = torch.from_numpy(np.random.rand(1, 64, 32, 32).astype(np.float32)).cuda()  # Stub alone

    # The first pass is always slower, so run it once
    resnet.forward(x)

    # Measure elapsed time
    passes = 20
    total_time = 0
    for _ in range(passes):
        start = timer()
        resnet.forward(x)
        delta = timer() - start
        print('Forward pass: %.3fs' % delta)
        total_time += delta
    print('Average forward pass: %.3fs' % (total_time / passes))

if __name__ == '__main__':
    main()
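One thing worth ruling out with the script above: CUDA kernels are launched asynchronously, so a host-side timer can stop before the GPU has actually finished the forward pass. Below is a minimal sketch of a timing helper that synchronizes before starting and before stopping the clock. The name `time_calls` and its parameters are hypothetical, not part of the script I ran, and I am not claiming this explains the gap.

```python
from timeit import default_timer as timer


def time_calls(fn, passes=20, warmup=1, sync=None):
    """Return the average wall-clock seconds per call of fn().

    sync, if given, is called right before the clock starts and right
    before it stops, so asynchronously queued work is fully counted.
    """
    # Warm-up calls are executed but not timed
    for _ in range(warmup):
        fn()

    total = 0.0
    for _ in range(passes):
        if sync is not None:
            sync()  # drain pending work before starting the clock
        start = timer()
        fn()
        if sync is not None:
            sync()  # wait for this call's work to finish
        total += timer() - start
    return total / passes
```

With the model and input from the script, this would be called as `time_calls(lambda: resnet(x), sync=torch.cuda.synchronize)`.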
On my Windows 7 installation, the same forward pass takes 58 ms. I made sure that:
- The same version of Python was used (a fresh 3.6.6 install in both cases)
- The same version of PyTorch was used with the same CUDA version (0.4.1 and 9.2 respectively)
- Python was installed on the same SSD in both cases (I doubt that matters but you never know)
- I have installed the most recent NVIDIA drivers on each OS (390 on Ubuntu vs 4.16 on Windows)
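To make the checks above easy to compare, here is a small sketch that prints the relevant versions so the output can be diffed between the two OSes. The function name `collect_env` is hypothetical, and the cuDNN field is an extra I am adding as an assumption that it might matter; it is not one of the things I verified above.

```python
import platform
import sys


def collect_env():
    """Gather version info that should match across the two OS installs."""
    info = {
        'os': platform.platform(),
        'python': sys.version.split()[0],
    }
    try:
        import torch
    except ImportError:
        info['torch'] = None  # PyTorch is not installed in this interpreter
        return info

    info['torch'] = torch.__version__
    info['cuda'] = torch.version.cuda  # CUDA version PyTorch was built against
    info['cudnn'] = torch.backends.cudnn.version()
    if torch.cuda.is_available():
        info['gpu'] = torch.cuda.get_device_name(0)
    return info


if __name__ == '__main__':
    for key, value in collect_env().items():
        print('%s: %s' % (key, value))
```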
Can anyone explain this or reproduce it? Otherwise I’ll open an issue on GitHub.