Does autocast create copies of tensors on the fly?

From autocast docs, it appears that:

  • Ops will autocast within an autocast block - if the Ops support it.

But the tensors don’t change type (see the example below), so I assume a copy must be created.

So, for example here:

import torch

l = torch.nn.Linear(10, 12)
t = torch.randn((10, 1))

with torch.autocast(device_type="cpu"):
    print(l(t.transpose(1, 0)).type())  # torch.BFloat16Tensor: the op's output is cast
    print(next(l.parameters()).type())  # torch.FloatTensor: the parameters are unchanged

The result of applying the linear layer l to the tensor t is in half or bfloat16 precision, but the parameters are all in float32 (the default) unless one explicitly sets them to another dtype.

  • So my simple question is: does this mean the tensors (input, weight, and bias) are cast for the operations within the layer, without changing the original tensors, and hence a new copy is created on the fly?

  • I’m assuming also that tensors using int64 by default won’t be autocast? (Only ops are mentioned in the docs.)

  • Would creating tensors directly in the desired type run much faster?

  • So my simple question is: does this mean the tensors (input, weight, and bias) are cast for the operations within the layer, without changing the original tensors, and hence a new copy is created on the fly?

Yes. Autocast does have a cache, enabled by default, to avoid creating extra copies, e.g. for parameters.
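To illustrate (a minimal sketch on CPU, where autocast defaults to bfloat16): the casts only affect the op outputs inside the region, the original tensors keep their dtype, and torch.autocast exposes a cache_enabled flag controlling the cast cache mentioned above.

import torch

l = torch.nn.Linear(10, 12)
t = torch.randn(1, 10)

with torch.autocast(device_type="cpu"):
    out = l(t)
    print(out.dtype)                # torch.bfloat16: result of the autocast-eligible op
    print(t.dtype, l.weight.dtype)  # torch.float32 twice: the originals are untouched

# Casts of reused tensors (e.g. the weight) are cached inside the region by
# default; disabling the cache forces a fresh cast at every op.
with torch.autocast(device_type="cpu", cache_enabled=False):
    out = l(t)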

  • I’m assuming also that tensors using int64 by default won’t be autocast? (Only ops are mentioned in the docs.)

That is correct.
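A quick check (assuming CPU, as in the example above): autocast only targets floating-point ops on its eligibility lists, so integer tensors pass through a region unchanged.

import torch

idx = torch.arange(5)          # int64 by default
with torch.autocast(device_type="cpu"):
    print(idx.dtype)           # torch.int64: integer tensors are left alone
    print((idx * idx).dtype)   # torch.int64: integer ops are not autocast-eligible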

  • Would creating tensors directly in the desired type run much faster?

It depends on what proportion of the runtime is taken up by those copies.
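One rough way to measure this on your own workload (a sketch; the layer size and iteration count are arbitrary) is to compare an autocast region against a model and input already converted to bfloat16, where no per-op casts are needed:

import copy
import time
import torch

l = torch.nn.Linear(1024, 1024)
x = torch.randn(256, 1024)

# Variant A: float32 model and input; autocast inserts casts on the fly.
def run_autocast():
    with torch.autocast(device_type="cpu"):
        return l(x)

# Variant B: everything already in bfloat16, so no casts happen per op.
l_bf16 = copy.deepcopy(l).to(torch.bfloat16)  # Module.to is in-place, so copy first
x_bf16 = x.to(torch.bfloat16)
def run_native():
    return l_bf16(x_bf16)

for fn in (run_autocast, run_native):
    fn()  # warm-up
    t0 = time.perf_counter()
    for _ in range(100):
        fn()
    print(fn.__name__, f"{time.perf_counter() - t0:.4f}s")

If the casts dominate (many small ops), the native-bfloat16 variant will win noticeably; for large matmuls the cast overhead is usually a small fraction of the total.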


Thanks; I realised that after reading your comment in a different post.