Hi,
torch version = 2.5.0
I am wondering whether torch.autocast can handle neural networks whose layers have different dtypes.
The following code suggests it cannot:
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(2, 10, dtype=torch.float16),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 10, dtype=torch.float32),
    torch.nn.ReLU(),
)
with torch.autocast("cuda"):
    net(torch.as_tensor([[1., 2.]], dtype=torch.float16))
=> it raises RuntimeError: mat1 and mat2 must have the same dtype, but got Half and Float
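(Side note: a minimal sketch of the GPU variant of the same net, in case the missing .cuda() matters. The .cuda() and device="cuda" parts are my own addition; my understanding is that autocast("cuda") only intercepts CUDA ops, so the run above stays on the CPU and may bypass autocast entirely, and on the GPU I would expect the Linear inputs and weights to be cast to float16 so the mismatch does not occur.)

import torch

net = torch.nn.Sequential(
    torch.nn.Linear(2, 10, dtype=torch.float16),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 10, dtype=torch.float32),
    torch.nn.ReLU(),
).cuda()  # same mixed-dtype net, but now actually on the GPU

x = torch.as_tensor([[1., 2.]], dtype=torch.float16, device="cuda")

with torch.autocast("cuda"):
    # under autocast, each Linear should cast its input and weight to the
    # autocast dtype (float16 by default), so no dtype mismatch is expected
    out = net(x)
print(out.dtype)  # I would expect torch.float16 here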
But I had recently started to believe that autocast could handle such cases:
I loaded a pretrained model from Hugging Face and changed the dtype of just one layer.
Without autocast, the same RuntimeError was raised, but with autocast the error disappeared.
Here is a minimal reproducible example:
import torch
from transformers import pipeline

pipe = pipeline(
    "text-classification",
    model="Qwen/Qwen2.5-0.5B",
    torch_dtype=torch.float16,
    device_map="cuda",
)
pipe_model_named_parameters = dict(pipe.model.named_parameters())
for name, param in pipe_model_named_parameters.items():
    if "score" in name:  # convert last trainable layer to float32 for stability during training
        param.data = param.data.to(dtype=torch.float32)
with torch.autocast("cuda"):  # context manager needed in case not all layers have the same dtype
    print(pipe(["a", "b", "z"]))
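(For what it's worth, a quick sanity check along these lines should confirm the model really ends up with mixed dtypes before the pipeline call; the expected values are my assumption based on the conversion loop above.)

# which dtypes are present among the parameters after the conversion?
print({p.dtype for p in pipe.model.parameters()})
# I would expect {torch.float16, torch.float32} here

# and the "score" parameters specifically should now be float32
for name, param in pipe.model.named_parameters():
    if "score" in name:
        print(name, param.dtype)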
Any insight into what autocast does and does not allow?