Does autocast create copies of tensors on the fly?

From autocast docs, it appears that:

  • Ops will autocast within an autocast block - if the Ops support it.

But the tensors don’t change type (see the example below), so I assume a copy must be created.

So, for example here:

import torch

l = torch.nn.Linear(10, 12)
t = torch.randn((10, 1))

with torch.autocast(device_type="cpu"):
    print(l(t.transpose(1, 0)).type())  # torch.BFloat16Tensor: the op's output is cast
    print(next(l.parameters()).type())  # torch.FloatTensor: the parameters are unchanged

The result of applying the linear layer l to the tensor t is in half or bfloat16 precision, but the parameters are all in float32 (the default) unless one explicitly sets them to another dtype.

  • So my simple question is: does this mean the tensors (input, weight, and bias) are cast for the operations within the layer, without changing the original tensors, and hence a new copy is created on the fly?

  • I’m assuming also that tensors using int64 by default won’t be autocast? (Only ops are mentioned in the docs.)

  • Would creating tensors directly in the desired type run much faster?

  • So my simple question is: does this mean the tensors (input, weight, and bias) are cast for the operations within the layer, without changing the original tensors, and hence a new copy is created on the fly?

Yes. Autocast does have a cache, enabled by default, to avoid creating extra copies, e.g. for parameters.
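To illustrate (a minimal sketch on CPU, where autocast defaults to bfloat16): the casts only affect the op outputs inside the region, the original tensors keep their dtype, and torch.autocast exposes a cache_enabled flag controlling the cast cache mentioned above.

import torch

l = torch.nn.Linear(10, 12)
t = torch.randn(1, 10)

with torch.autocast(device_type="cpu"):
    out = l(t)
    print(out.dtype)                # torch.bfloat16: result of the autocast-eligible op
    print(t.dtype, l.weight.dtype)  # torch.float32 twice: the originals are untouched

# Casts of reused tensors (e.g. the weight) are cached inside the region by
# default; disabling the cache forces a fresh cast at every op.
with torch.autocast(device_type="cpu", cache_enabled=False):
    out = l(t)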

  • I’m assuming also that tensors using int64 by default won’t be autocast? (Only ops are mentioned in the docs.)

That is correct.
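A quick check (assuming CPU, as in the example above): autocast only targets floating-point ops on its eligibility lists, so integer tensors pass through a region unchanged.

import torch

idx = torch.arange(5)          # int64 by default
with torch.autocast(device_type="cpu"):
    print(idx.dtype)           # torch.int64: integer tensors are left alone
    print((idx * idx).dtype)   # torch.int64: integer ops are not autocast-eligible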

  • Would creating tensors directly in the desired type run much faster?

It depends on what proportion of the runtime is taken up by those copies.
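One rough way to measure this on your own workload (a sketch; the layer size and iteration count are arbitrary) is to compare an autocast region against a model and input already converted to bfloat16, where no per-op casts are needed:

import copy
import time
import torch

l = torch.nn.Linear(1024, 1024)
x = torch.randn(256, 1024)

# Variant A: float32 model and input; autocast inserts casts on the fly.
def run_autocast():
    with torch.autocast(device_type="cpu"):
        return l(x)

# Variant B: everything already in bfloat16, so no casts happen per op.
l_bf16 = copy.deepcopy(l).to(torch.bfloat16)  # Module.to is in-place, so copy first
x_bf16 = x.to(torch.bfloat16)
def run_native():
    return l_bf16(x_bf16)

for fn in (run_autocast, run_native):
    fn()  # warm-up
    t0 = time.perf_counter()
    for _ in range(100):
        fn()
    print(fn.__name__, f"{time.perf_counter() - t0:.4f}s")

If the casts dominate (many small ops), the native-bfloat16 variant will win noticeably; for large matmuls the cast overhead is usually a small fraction of the total.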


Thanks; I realised that after reading your comment in a different post.