I have experimented with some example tensors of reasonably large size, and fetching the min and max values from these tensors seems suspiciously fast. Are these operations O(n), or does PyTorch perhaps keep track of them behind the scenes for quick retrieval? Of course the time taken is system dependent, as well as GPU dependent; the example below runs with tensors on the CPU.

import torch
import time

# 100 * 500 * 10000 = 500 million elements, roughly 2 GB of float32 data
large_tensor = torch.randn(100, 500, 10000)
start = time.time()
# note: the timed region also includes the two print calls
print(large_tensor.min())
print(large_tensor.max())
end = time.time() - start
print(f'Time taken to find both min and max: {end} seconds')

Output: Time taken to find both min and max: 0.6569840908050537 seconds
Thank you in advance for considering my question!

It probably depends on the problem size: min and max are each a single O(n) pass over the data, while sorting is O(n log n) and a much more expensive operation, and you will feel that difference with large inputs.
On the other hand, for small problems you would probably not overthink optimizing this anyway.
You might also check whether the advanced JIT fusers optimize this; I think they are capable of it, but I don't know if they do.
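To make the complexity difference concrete, here is a small timing sketch (the tensor size and timing approach are my own choices, not from the thread): it times min/max against a full sort on the same data, and checks that the sort's endpoints agree with min/max.

```python
import time
import torch

x = torch.randn(10_000_000)  # smaller than the question's tensor, but large enough to time

start = time.time()
mn, mx = x.min(), x.max()    # each reduction is a single O(n) pass over the data
minmax_time = time.time() - start

start = time.time()
s, _ = torch.sort(x)         # O(n log n); the first/last elements are the min/max
sort_time = time.time() - start

print(f'min/max: {minmax_time:.4f}s, sort: {sort_time:.4f}s')
assert s[0] == mn and s[-1] == mx  # sorting recovers the same extremes, just slower
```

On most machines the sort takes noticeably longer than the two reductions, which is the point of the answer above.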

If you don’t need gradients, there is also the relatively new, undocumented torch._aminmax, which returns both the min and the max from a single call.
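A minimal sketch of using it, with the caveat that torch._aminmax is undocumented and may change or disappear between versions (the hasattr fallback to the public torch.aminmax is my own defensive addition, not from the thread):

```python
import torch

x = torch.randn(1000)

# torch._aminmax is private API; newer releases expose a public torch.aminmax
# that serves the same purpose, so fall back to it if the private name is gone.
if hasattr(torch, '_aminmax'):
    mn, mx = torch._aminmax(x)
else:
    mn, mx = torch.aminmax(x)

# both values come back from one call instead of two separate reductions
assert mn == x.min() and mx == x.max()
```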