import time
import torch

m = 2000
n = 2500
ell = 10
a = torch.randn(m*n).cuda()
v = torch.randn(ell).cuda()
ske = torch.zeros(ell, m*n).cuda()
st = time.time()
ske = torch.ger(v, a)
ed = time.time()
print('ger', ed - st)
b = torch.randn(m*n).cuda()
st = time.time()
ske = torch.ger(v, b)
ed = time.time()
print('ger', ed - st)
The first ger call takes about 0.1 s, while the second takes about 0.0006 s. What is happening between these two ger calls? A similar problem arises with torch.dot(): the running time differs by a factor of several tens between calls.
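
One way to narrow this down is to rule out measurement effects. CUDA kernels are launched asynchronously, so time.time() around an unsynchronized call may not capture the kernel itself, and the first GPU call of a given kind can include one-time setup. Whether that accounts for the whole gap here is an assumption, not something the timings above confirm. The sketch below adds an untimed warm-up call and torch.cuda.synchronize() around each timed region; the helper names sync and time_ger and the CPU fallback are illustrative additions, not part of the original script.

```python
import time
import torch

# Assumption: fall back to the CPU when no GPU is present so the sketch still runs.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def sync():
    # CUDA kernels launch asynchronously; wait for pending work before reading the clock.
    if device.type == "cuda":
        torch.cuda.synchronize()

def time_ger(v, a):
    # Hypothetical helper: time one outer product with synchronization on both sides.
    sync()
    st = time.time()
    ske = torch.ger(v, a)
    sync()
    ed = time.time()
    return ske, ed - st

m, n, ell = 2000, 2500, 10
v = torch.randn(ell, device=device)
a = torch.randn(m * n, device=device)
b = torch.randn(m * n, device=device)

# Untimed warm-up call, so that any one-time setup cost is not counted below.
torch.ger(v, a)
sync()

ske_a, t_a = time_ger(v, a)
ske_b, t_b = time_ger(v, b)
print('ger', t_a)
print('ger', t_b)
```

If the two printed times come out close to each other, the original gap was most likely warm-up and asynchronous-launch overhead rather than a difference between the two ger calls. The same pattern can be applied to torch.dot().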