# The size of tensor a (10) must match the size of tensor b (100) at non-singleton dimension 1

Hi there, I'm following a tutorial for a simple neural network with two hidden layers, and I wanted to add MSE and CE charts at the end of the file. However, I have no idea what I should do. Any ideas?

``````
import torch
import torchvision
import torch.nn as nn
import torch.nn.functional as f
from torchvision import datasets, transforms
import sklearn
import numpy as np
import pandas as pd
import sklearn.metrics
import matplotlib.pyplot as plt

torch.manual_seed(101)

Transform = transforms.ToTensor()

class Model(nn.Module):
    def __init__(self, input_size=784, output_size=10, layers=[120,84]):
        super().__init__()
        self.d1 = nn.Linear(input_size, layers[0])
        self.d2 = nn.Linear(layers[0], layers[1])
        self.d3 = nn.Linear(layers[1], output_size)

    def forward(self, X):
        X = f.relu(self.d1(X))
        X = f.relu(self.d2(X))
        X = self.d3(X)
        return f.log_softmax(X, dim=1)

model = Model()

ce = nn.CrossEntropyLoss()
mse = nn.MSELoss()

train_losses = []
train_mse_losses = []
test_losses = []
test_mse_losses = []

train_correct = []
test_correct = []

for i in range(10):
    trn_cor = 0
    tst_cor = 0
    for b, (X_train, y_train) in enumerate(trainset):
        b+=1
        y_pred = model(X_train.view(100, -1))
        loss = ce(y_pred, y_train)
        loss_mse = mse(y_pred, y_train)

        predicted = torch.max(y_pred.data, 1)[1]
        batch_cor = (predicted==y_train).sum()
        trn_cor += batch_cor

        loss.backward()
        loss_mse.backward()
        optimizer.step()

        if b%600 == 0:
            print(f'epoch: {i:2} train loss: {loss.item():10.6f}')

    train_losses.append(loss)
    train_mse_losses.append(loss_mse)
    train_correct.append(trn_cor)

    for b, (X_test, y_test) in enumerate(testset):
        y_val = model(X_test.view(500, -1))

        predicted = torch.max(y_val.data, 1)[1]
        tst_cor += (predicted == y_test).sum()

    loss = ce(y_val, y_test)
    loss_mse = mse(y_val, y_test)
    test_losses.append(loss)
    test_mse_losses.append(loss_mse)
    test_correct.append(tst_cor)

print(f'test acc: {test_correct[-1].item()*100/10000:.3f}%')

plt.subplot(3,1,1)
plt.plot(train_losses, label='training loss')
plt.plot(test_losses, label='validation loss')
plt.title('Loss at the end of each epoch')

plt.subplot(3,1,3)
plt.plot([t/600 for t in train_correct], label='training acc')
plt.plot([t/100 for t in train_correct], label='validation acc')
plt.title('Accuracy at the end of each epoch')

plt.legend()
``````

Everything was working until I added

``````
mse = nn.MSELoss()
``````

Hi Igor!

Your problem is that `CrossEntropyLoss` and `MSELoss` work rather
differently (both conceptually and mechanically).

`CrossEntropyLoss` expects its `input` (your `y_pred`) to be a set of
logits (see below) for the classes your model outputs, typically of
shape `[nBatch, nClass]` (in your case `[100, 10]`), and its `target`
(your `y_train`) to be integer class labels, typically of shape `[nBatch]`
(no `nClass` dimension) with values that range from `0` to `nClass - 1`.
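
To make those shapes concrete, here is a minimal sketch that uses random stand-in tensors rather than your MNIST batches. The names `logits` and `labels` are made up for illustration and are not from your code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

n_batch, n_class = 100, 10

# Stand-in for the model output: one row of raw scores per sample.
logits = torch.randn(n_batch, n_class)            # shape [100, 10], float32
# Stand-in for y_train: one integer class index per sample.
labels = torch.randint(0, n_class, (n_batch,))    # shape [100], int64

# CrossEntropyLoss reduces over the batch to a single scalar by default.
ce_value = nn.CrossEntropyLoss()(logits, labels)
print(logits.shape, labels.shape, ce_value.shape)
```

The two arguments deliberately have different shapes and dtypes: floating-point scores per class on one side, a single integer label per sample on the other.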

In contrast, `MSELoss` expects its `input` and `target` to have the same
shape as one another and uses mean-squared-error to measure, on
average, how close the individual elements of `input` and `target` are
to one another. You are passing an `input` of shape `[100, 10]` and a
`target` of shape `[100]` to `MSELoss`, hence the error.
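
The mismatch can be reproduced in isolation. This is only a sketch with random tensors standing in for your `y_pred` and `y_train`; it is not your actual data.

```python
import warnings

import torch
import torch.nn as nn

y_pred = torch.randn(100, 10)           # stand-in for the model output
y_train = torch.randint(0, 10, (100,))  # stand-in for integer class labels

error_message = None
with warnings.catch_warnings():
    # MSELoss also warns about differing sizes; silence that to keep output short.
    warnings.simplefilter("ignore")
    try:
        nn.MSELoss()(y_pred, y_train)
    except RuntimeError as err:
        # [100, 10] and [100] cannot be broadcast: the trailing dims are 10 vs 100.
        error_message = str(err)

print(error_message)
```

`MSELoss` tries to broadcast the two tensors against each other before comparing them element by element, and that broadcast is what fails.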

But furthermore, conceptually, a network that is trained to perform
classification outputs logits (or other probability-like values) for each
of the classes. In the typical use case, these values are not naturally
comparable to other numbers, which is to say, using them with
`MSELoss` doesn’t really make sense.

You need to think about the meaning of your `y_train` values (and
similarly `y_pred`) and whether it even makes sense to use them
with `MSELoss`.
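
If you still want an MSE-style number to plot, one possible reading (my assumption, not something your tutorial prescribes) is to compare the predicted class probabilities with a one-hot encoding of the labels, so that both tensors have shape `[nBatch, nClass]`. A rough sketch with stand-in tensors:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

logits = torch.randn(100, 10)            # stand-in for raw model output
labels = torch.randint(0, 10, (100,))    # stand-in for y_train

# Turn raw scores into probabilities; each row sums to 1.
probs = F.softmax(logits, dim=1)                       # shape [100, 10]
# Turn integer labels into 0/1 target vectors of the same shape.
one_hot = F.one_hot(labels, num_classes=10).float()    # shape [100, 10]

# Shapes now match, so the element-wise comparison is well defined.
mse_value = nn.MSELoss()(probs, one_hot)
print(mse_value.item())
```

If your model keeps returning `log_softmax` output, `y_pred.exp()` would give the probabilities in place of the `F.softmax` call.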

As an aside, it’s a little more efficient to go `backward()` through your
network just once (per optimizer step), so you could replace the separate
`loss.backward()` and `loss_mse.backward()` calls with

``````
loss_total = loss + loss_mse
loss_total.backward()
# or just  (loss + loss_mse).backward()
``````
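
As a sanity check, the sketch below compares the gradients from one combined `backward()` with those from two separate calls. It uses a tiny made-up `Linear` layer and two arbitrary losses (`MSELoss` and `L1Loss`) as stand-ins; `grads_after` is a hypothetical helper, not part of your code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

x = torch.randn(8, 4)
target = torch.randn(8, 1)

def grads_after(combine):
    torch.manual_seed(1)              # same initial weights on every call
    layer = nn.Linear(4, 1)
    out = layer(x)
    loss_a = nn.MSELoss()(out, target)
    loss_b = nn.L1Loss()(out, target)
    if combine:
        (loss_a + loss_b).backward()          # one pass through the graph
    else:
        # Two passes; retain_graph=True keeps the shared graph alive
        # for the second call, and the gradients accumulate in .grad.
        loss_a.backward(retain_graph=True)
        loss_b.backward()
    return layer.weight.grad.clone()

same = torch.allclose(grads_after(True), grads_after(False), atol=1e-6)
print(same)
```

Note that the two-call version needs `retain_graph=True` on the first call because both losses share the same forward graph.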

Note that `CrossEntropyLoss` has `log_softmax()` built into it, so you
don’t want the `f.log_softmax(X, dim=1)` at the end of your `forward()`.
`CrossEntropyLoss` takes raw-score logits that run from `-inf` to `inf`
and are typically just the output of your final `Linear` layer.

If you want that final `log_softmax()` you should use `NLLLoss` instead
of `CrossEntropyLoss`. (`CrossEntropyLoss` is just `log_softmax()`
and `NLLLoss` put together for you.)
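
That equivalence is easy to check numerically. A small sketch with random stand-in tensors (the names are illustrative only):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

logits = torch.randn(100, 10)            # raw scores from a final Linear layer
labels = torch.randint(0, 10, (100,))    # integer class labels

# Option 1: raw logits straight into CrossEntropyLoss.
ce_value = nn.CrossEntropyLoss()(logits, labels)
# Option 2: apply log_softmax yourself, then use NLLLoss.
nll_value = nn.NLLLoss()(F.log_softmax(logits, dim=1), labels)

match = torch.allclose(ce_value, nll_value)
print(match)
```

Either pairing works; the point is to apply `log_softmax()` exactly once, either inside the loss or inside the model.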

Best.

K. Frank