Hello guys.
I’m measuring the inference speed of a model with this code:
import time

import numpy
import torch

def benchmark_speed(model, input_shape, precision: str = 'fp32', nwarmup=10, nruns=500):
    input_data = torch.randn(input_shape)
    input_data = input_data.to('cuda')
    if precision == 'fp16':
        input_data = input_data.half()
    print("Warm up ...")
    with torch.no_grad():
        for _ in range(nwarmup):
            _ = model(input_data)
    torch.cuda.synchronize()
    print("Start timing ...")
    timings = []
    with torch.no_grad():
        for i in range(1, nruns + 1):
            start_time = time.time()
            _ = model(input_data)
            torch.cuda.synchronize()
            end_time = time.time()
            timings.append(end_time - start_time)
            if i % 100 == 0:
                print('Iteration %d/%d, ave batch time %.2f ms' % (i, nruns, numpy.mean(timings) * 1000))
    print("Input shape:", input_data.size())
    print('Average batch time: %.2f ms' % (numpy.mean(timings) * 1000))
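The warm-up-then-time pattern used above can be sketched in pure Python (a minimal sketch; the lambda workload here is just a stand-in for model(input_data)):

```python
import statistics
import time

def benchmark(fn, nwarmup=10, nruns=100):
    # Warm-up runs: let caches, allocators, and any JIT settle before timing
    for _ in range(nwarmup):
        fn()
    timings = []
    for _ in range(nruns):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings) * 1000.0  # average iteration time in ms

# Dummy CPU workload standing in for model(input_data)
avg_ms = benchmark(lambda: sum(i * i for i in range(10_000)))
```

On the GPU the extra step my code takes matters: CUDA kernel launches are asynchronous, so calling torch.cuda.synchronize() before reading the clock is required for the measured times to be meaningful.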
The function first warms the model up and then measures its inference time. I benchmark resnet101 like this:
model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet101', pretrained=False)
model.cuda().eval()
benchmark_speed(model, (64, 3, 224, 224))

trt_model = torch.jit.trace(model, torch.randn(1, 3, 224, 224).cuda())
benchmark_speed(trt_model, input_shape=(64, 3, 224, 224))
I’m getting 75.42 ms and 75.69 ms for the eager model and the TorchScript model, respectively.
I don’t see why I would want to use TorchScript (apart from using it with Torch-TensorRT).
P.S.: I also tried tracing with the full batch size:

trt_model = torch.jit.trace(model, torch.randn(64, 3, 224, 224).cuda())
benchmark_speed(trt_model, input_shape=(64, 3, 224, 224))
but the behavior is similar. The exception is batch size 1, where TorchScript seems to be very slightly faster.
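For what it’s worth, a ~0.3 ms gap on a ~75 ms mean may just be run-to-run noise; comparing medians and quartiles rather than the mean alone makes that easier to judge (a minimal sketch with made-up timings, not my real measurements):

```python
import statistics

def summarize(timings_ms):
    # Report median and quartiles instead of just the mean,
    # so outlier iterations don't dominate the comparison
    q = statistics.quantiles(timings_ms, n=4)
    return {
        "mean": statistics.mean(timings_ms),
        "median": statistics.median(timings_ms),
        "p25": q[0],
        "p75": q[2],
    }

# Hypothetical per-iteration timings (ms) for the two variants
eager = [75.0, 75.5, 76.0, 75.2, 75.8]
scripted = [75.3, 75.6, 75.9, 75.1, 75.7]

# If the interquartile ranges overlap, the difference is likely noise
overlap = summarize(eager)["p75"] >= summarize(scripted)["p25"]
```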
Am I doing something wrong? I don’t see the advantages of using TorchScript.
Thank you