Hi,
I am trying to solve a mystery.
The following code snippet generates several random CUDA torch tensors of various sizes and performs a few operations on them.
The strangeness occurs when measuring the elapsed time with Python’s timeit.repeat function.
Somehow, roughly the first 80 of the 100 iterations are a couple of orders of magnitude faster than the last 20.
Any idea?
I am running this code on an Ubuntu 14.04 machine using Python 3.6, PyTorch 0.4.1.post2 and an NVIDIA Quadro M4000 card.
Here is the code:
import numpy as np
import timeit
import gc
import torch
# size of the test
M = 10000
N = 10000
D = 3
E = 3
device = 'cuda'
# declare tensors
xc = torch.rand(M, D, device=device)
yc = torch.rand(N, D, device=device)
bc = torch.rand(N, E, device=device)
sigmac = torch.rand(1, device=device)
def _squared_distances(x, y):
    x_norm = (x ** 2).sum(1).view(-1, 1)
    y_norm = (y ** 2).sum(1).view(1, -1)
    dist = x_norm + y_norm - 2.0 * torch.mm(x, torch.transpose(y, 0, 1))
    return dist

def _func(x, y, s, b):
    sq = _squared_distances(x, y)
    torch.mm((-sq / (s * s)).exp(), b)
speed_pytorch = np.array(timeit.repeat("_func(xc, yc, sigmac, bc)", setup='gc.enable();', globals=globals(), repeat=100, number=1))
print(speed_pytorch)
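In case it helps to read the code, this is what I believe it computes, written as a formula. The symbols are my own shorthand and do not appear in the code: x_i and y_j are the rows of xc and yc, b_j are the rows of bc, sigma is the scalar sigmac, and r_i are the rows of the result of the last torch.mm call in _func (which is computed but not returned).

```latex
\[
  \mathrm{sq}_{ij} = \|x_i\|^2 + \|y_j\|^2 - 2\, x_i \cdot y_j = \|x_i - y_j\|^2,
  \qquad
  r_i = \sum_{j=1}^{N} \exp\!\left(-\frac{\mathrm{sq}_{ij}}{\sigma^2}\right) b_j,
  \quad i = 1, \dots, M
\]
```

So each call builds an M-by-N matrix (10^8 entries for the sizes above) before the final product with bc.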
And the output:
[0.00896002 0.00023558 0.00021449 0.000208 0.00020671 0.00020615
0.00020571 0.00020591 0.00020429 0.0002054 0.00020259 0.00020241
0.00020344 0.00021511 0.0002045 0.00020349 0.00020346 0.00020304
0.00020419 0.00020382 0.00020363 0.00020238 0.0002032 0.00020386
0.00020311 0.00020472 0.00020299 0.00020274 0.00020187 0.00020358
0.00020355 0.00020294 0.00021372 0.00020253 0.00020109 0.0002006
0.00020271 0.00020158 0.00020437 0.00020162 0.00020179 0.00020287
0.0002021 0.00020064 0.00020138 0.00020067 0.00020173 0.00020194
0.00020214 0.00020111 0.00020141 0.00021044 0.00020267 0.00020206
0.00020099 0.00020164 0.00020256 0.00020333 0.00020216 0.00020247
0.00020264 0.0002027 0.00020325 0.00020169 0.00020424 0.00020281
0.00020306 0.00020255 0.00020205 0.00020288 0.00021285 0.00020318
0.00020483 0.00020746 0.00020353 0.00020233 0.00020117 0.00020162
0.00020206 0.02630114 0.04370697 0.04371016 0.0437286 0.04372212
0.04371193 0.04371529 0.0437236 0.04370545 0.04372237 0.04372816
0.04374273 0.04373613 0.04369926 0.04371182 0.04373186 0.04368425
0.0436838 0.04373274 0.04373524 0.04371948]
Many thanks to anyone who can give me some insight into what is happening.