I was trying to use torch.nn.functional.linear on my model. However, I got an error message saying that “mat1 and mat2 must have the same dtype”. It is just a linear function, so I don’t get why the matrices have to have the same dtype. Thank you for any reply; it will help me gain a better understanding.
The internal calls expect to get data in the same dtype. I guess you might be manually casting the input to this linear layer to another dtype than its parameters, so cast it back. Otherwise, could you explain your use case and why you expect the dtype mismatch to work?
Thanks for your reply. So what is the expected dtype? I only have one input x (a tensor of floats) to the linear function.
Could you tell me what mat1 and mat2 are? I suppose they are x and the transpose of A in the linear equation?
mat1 and mat2 most likely refer to the input tensor and the weight matrix of the linear layer.
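To make the naming concrete, here is a minimal sketch, assuming mat1 and mat2 are the input and the transposed weight as guessed above. The shapes and variable names are made up for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(4, 10)   # input, plays the role of mat1
w = torch.randn(1, 10)   # weight in (out_features, in_features) layout

# F.linear computes x @ w.T (plus an optional bias, omitted here)
out_linear = F.linear(x, w)
out_matmul = x @ w.T     # w.T plays the role of mat2

same = torch.allclose(out_linear, out_matmul, atol=1e-6)
print(same, out_linear.shape)
```

Since both operands end up in one matrix multiplication, they need a common dtype.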
Here is a small example showing one way to run into this error, which is caused by the dtype mismatch between the input tensor and the layer’s parameters:
import torch
import torch.nn as nn

# initialize linear layer
linear = nn.Linear(10, 1, bias=False)
# by default float32 is used as the dtype
print(linear.weight.dtype)
# torch.float32
# create input tensor
x = torch.randn(10, 10)
# by default float32 is also used
print(x.dtype)
# torch.float32
# linear layer works and output dtype is also float32
out = linear(x)
print(out.dtype)
# torch.float32
# transform to float64
x = x.to(torch.float64)
print(x.dtype)
# torch.float64
# create dtype mismatch
out = linear(x)
# RuntimeError: expected scalar type Double but found Float
# same for an explicit matmul
out = torch.matmul(x, linear.weight.T)
# RuntimeError: expected scalar type Double but found Float
I think I got it now. Appreciate your help!
Also, in case you are trying to use mixed-precision training, use the utility functions from torch.amp, as e.g. the autocast context will cast the tensors to the appropriate types for you.
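A minimal sketch of the autocast context, assuming a CPU run with bfloat16 as the lower-precision dtype (on a GPU you would typically pass device_type="cuda"). The layer and input sizes are made up for illustration.

```python
import torch
import torch.nn as nn

linear = nn.Linear(10, 1, bias=False)   # float32 parameters
x = torch.randn(4, 10)                  # float32 input

# Inside the context, eligible ops such as linear run in the lower-precision dtype
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = linear(x)

print(out.dtype)            # dtype chosen by autocast for this op
print(linear.weight.dtype)  # the stored parameters are not modified
```

Note that, as far as I know, float64 tensors are not eligible for autocasting, so a float64 input would still need an explicit cast.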
I also faced the same problem and don’t know what the issue could be. It gives me this error:
RuntimeError: mat1 and mat2 must have the same dtype
I guess your input might be using float64 while the model’s parameters use float32? Could you check and fix it if that’s the case?
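One common way to end up with a float64 input is data coming from NumPy, which defaults to float64. The sketch below assumes that scenario; the model and array are stand-ins, not the poster's actual code.

```python
import numpy as np
import torch
import torch.nn as nn

model = nn.Linear(3, 2)                # parameters default to float32

data = np.random.rand(4, 3)            # NumPy defaults to float64
x64 = torch.from_numpy(data)           # from_numpy keeps the float64 dtype
print(x64.dtype, model.weight.dtype)

x = x64.to(model.weight.dtype)         # cast the input to match the parameters
out = model(x)
print(out.dtype)
```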
It was indeed the problem, thanks.
I am trying this but am getting an error:
RuntimeError: mat1 and mat2 must have the same dtype
Any possible solution?
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClassNa(nn.Module):
    # create the methods
    def __init__(self, in_size, hidden1_size, num_class):
        super(ClassNa, self).__init__()
        self.fc1 = nn.Linear(in_size, hidden1_size)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(hidden1_size, num_class)

    def forward(self, x):
        """this is for propagation"""
        out = F.relu(self.fc1(x))
        out = self.relu1(out)
        out = self.fc2(out)
        return out

# device and train_dataload are defined elsewhere in the script
model = ClassNa(3, 6, 2)
model.to(device)
x, y = next(iter(train_dataload))
x = x[:4].to(device)
score = model(x)
print(score)
RuntimeError: mat1 and mat2 must have the same dtype
Check the dtypes via model.fc1.weight.dtype and x.dtype, and make sure both are equal. If not, transform either the input to the model dtype or vice versa.
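Both directions can be sketched as follows, assuming a float64 batch and a float32 model. The nn.Sequential model here is a hypothetical stand-in with the same layer sizes as the class above.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(3, 6), nn.ReLU(), nn.Linear(6, 2))
x = torch.randn(4, 3, dtype=torch.float64)   # stand-in for a float64 batch

param_dtype = next(model.parameters()).dtype
print(param_dtype, x.dtype)                  # the two dtypes differ

# Option 1: cast the input to the parameter dtype
out_a = model(x.to(param_dtype))

# Option 2: cast the model to the input dtype instead.
# nn.Module.to modifies the module in place, so model is float64 from here on.
model.to(x.dtype)
out_b = model(x)

print(out_a.dtype, out_b.dtype)
```

Casting the input is usually the cheaper choice, since float64 parameters double the memory used by the model.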
Thanks
I used
out = x.view(x.size(0), -1)
in the forward function.
The view operation does not change the dtype of the tensor and will thus also not solve the issue. Check the .dtype attribute and make sure the parameters as well as the input are using the same dtype.
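A quick check of this, assuming a float64 tensor with made-up dimensions:

```python
import torch

x = torch.randn(4, 1, 3, dtype=torch.float64)

flat = x.view(x.size(0), -1)   # only the shape changes, here to (4, 3)
print(flat.shape, flat.dtype)

flat32 = flat.float()          # an explicit cast is what changes the dtype
print(flat32.dtype)
```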
Great, man. You are very helpful. Thanks!