`tensor.contiguous()` overhead on already contiguous tensor?

As I understand it, calling `tensor.contiguous()` should be essentially a no-op if the tensor is already contiguous.
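(A quick check of this, assuming the documented behavior that `contiguous()` returns the tensor itself when no copy is needed: the result shares the same data pointer.)

```python
import torch

x = torch.randn(128, 256, 3, 3)
y = x[:128]  # slicing along dim 0 keeps the memory layout contiguous

# If the tensor is already contiguous, contiguous() should return it
# as-is rather than copying, so the data pointer is unchanged.
assert y.is_contiguous()
assert y.contiguous().data_ptr() == y.data_ptr()
```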

But calling it on a (contiguous) slice is ~10x slower, as shown below:

```python
import torch
x = torch.randn(128, 256, 3, 3)

assert x[:128].is_contiguous()

# 87 ns ± 0.222 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
_ = x.contiguous()

# 920 ns ± 16.3 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
_ = x[:128].contiguous()

# 73.5 ns ± 1.35 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
_ = x.is_contiguous()

# 892 ns ± 16.6 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
if not x[:128].is_contiguous():
    _ = x[:128].contiguous()
```

Based on your code, you are also timing the indexing, not just the `contiguous()` call. Remove it and you will see the same results:

```python
y = x[:128]
%timeit _ = y.contiguous()
```
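For reference, a minimal sketch (using IPython's `%timeit`, as above) that times the slicing by itself; this per-call view construction is the overhead the slower measurements included:

```python
# Time only the indexing: x[:128] builds a new view tensor (a fresh Tensor
# object with its own sizes/strides, sharing the same storage), which is
# the overhead the original measurements attributed to contiguous().
%timeit _ = x[:128]
```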