RuntimeError: mat1 and mat2 shapes cannot be multiplied in regression neural network

I am trying to use a neural network for regression on a large 5D dataset, but to get a working network first I have set up a small set of trial data points to test with. However, I am running into this error and I am not sure why.

Error traceback:
predicted = model(data.to(device))
File "C:\Users\Owner\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\Owner\anaconda3\lib\site-packages\torch\nn\modules\container.py", line 139, in forward
input = module(input)
File "C:\Users\Owner\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\Owner\anaconda3\lib\site-packages\torch\nn\modules\linear.py", line 96, in forward
return F.linear(input, self.weight, self.bias)
File "C:\Users\Owner\anaconda3\lib\site-packages\torch\nn\functional.py", line 1847, in linear
return torch._C._nn.linear(input, weight, bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x64 and 1x200)

Process finished with exit code 1

I was roughly following a tutorial I found online: Regression with Neural Networks in PyTorch | by Ben Phillips | Medium

My Code:

import numpy as np
import torch
import torch.utils.data as Data

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy 1D dataset: y = x^2 + 5
trial_x_data = np.arange(100)
trial_y_data = trial_x_data ** 2 + 5

model = torch.nn.Sequential(
    torch.nn.Linear(1, 200),
    torch.nn.LeakyReLU(),
    torch.nn.Linear(200, 100),
    torch.nn.LeakyReLU(),
    torch.nn.Linear(100, 1),
).to(device)

optimiser = torch.optim.Adam(model.parameters(), lr=0.01)
loss_func = torch.nn.MSELoss()

Batch_size = 64
Max_Epoch = 500

trial_x_data, trial_y_data = torch.tensor(trial_x_data), torch.tensor(trial_y_data)
torch_dataset = Data.TensorDataset(trial_x_data, trial_y_data)

loader = Data.DataLoader(dataset=torch_dataset, batch_size=Batch_size, shuffle=True)

loss_logger = []

for epoch in range(Max_Epoch):
    for idx, (data, expected) in enumerate(loader):
        predicted = model(data.to(device))
        loss = loss_func(predicted, expected.to(device))

        optimiser.zero_grad()
        loss.backward()
        optimiser.step()

        loss_logger.append(loss.item())

    print("Epoch: [%d/%d]" % (epoch + 1, Max_Epoch))

Any help would be greatly appreciated.

Your input data should be transposed.

Basic matrix multiplication requires the inner dimensions to match: (A, B) x (B, C) = (A, C).

for idx, (data, expected) in enumerate(loader):
    data = data.transpose(1, 0)
    predicted = model(data.to(device))
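
For context, nn.Linear(in_features, out_features) multiplies the input by its transposed weight, so the input's last dimension must equal in_features. A minimal sketch with made-up shapes:

import torch

layer = torch.nn.Linear(1, 200)   # weight shape: (200, 1)

x = torch.randn(64, 1)            # (batch, in_features=1)
print(layer(x).shape)             # torch.Size([64, 200]) -- works

x_bad = torch.randn(1, 64)        # last dim is 64, not in_features=1
# layer(x_bad)                    # RuntimeError: mat1 and mat2 shapes
#                                 # cannot be multiplied (1x64 and 1x200)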

Thanks for the reply. I'm not sure that is the root cause, because when I do transpose data, the error remains unchanged.

I have updated the error message above so it includes more info.

Could you print the shape of data with print(data.shape)?
If it has 3 dimensions including the batch, then use data = data.transpose(2, 1):

if len(data.shape) == 3:
    # (A, B, C) --> (A, C, B)
    data = data.transpose(2, 1)
elif len(data.shape) == 2:
    # (A, B) --> (B, A)
    data = data.transpose(1, 0)
else:
    pass

The shape of the data is apparently: torch.Size([64])

Then it should work with data = data.unsqueeze(1).
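
unsqueeze(1) adds the missing feature dimension, turning 64 scalar samples into a (64, 1) batch that Linear(1, 200) accepts. For illustration:

import torch

data = torch.arange(64)      # torch.Size([64])   -- 64 scalars, no feature dim
data = data.unsqueeze(1)     # torch.Size([64, 1]) -- 64 samples, 1 feature each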


I have done that, and I got this error afterwards: RuntimeError: expected scalar type Float but found Int.
So I then did data = data.unsqueeze(1).float(), but with that I get RuntimeError: Found dtype Int but expected Float in the loss.backward() part.

Thanks for all the help so far

MSELoss expects float inputs, for both the prediction and the target:

data = data.unsqueeze(1).type(torch.float)
expected = expected.unsqueeze(1).type(torch.float)

Then move them to the GPU with .to(device).
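
Putting it together, the inner loop of your training code would look something like this (a sketch, keeping the rest of your setup unchanged):

for idx, (data, expected) in enumerate(loader):
    # add the feature dimension, cast to float, and move to the device
    data = data.unsqueeze(1).type(torch.float).to(device)
    expected = expected.unsqueeze(1).type(torch.float).to(device)

    predicted = model(data)
    loss = loss_func(predicted, expected)

    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

    loss_logger.append(loss.item())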


Hi,
That works, thanks for all the help.

On a quick side question: if I want to test my model with another set of x points, is this the right code: model(torch.tensor(x_points).to(device)).cpu().detach().clone().numpy()?

Also, how would you recommend improving the model?

Thanks

I think clone() is not necessary.
Otherwise your approach is fine.
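
As a side note, wrapping inference in torch.no_grad() avoids tracking gradients at all, and the input needs the same unsqueeze and float cast as in training. A sketch, assuming x_points is a 1D NumPy array like your training data:

model.eval()
with torch.no_grad():
    x = torch.tensor(x_points, dtype=torch.float).unsqueeze(1).to(device)
    predictions = model(x).cpu().numpy()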