modeler
(Charles)
#1
When you define the loss function as MSE,

i.e.,

```
cost_func = nn.MSELoss()
```

and then you want to compute the loss using

```
loss = cost_func(predicted_labels, training_labels)
```

Would it be an issue if my predicted_labels tensor size is torch.Size([1500, 1, 1]) and the training_labels size is torch.Size([1500])?

The target tensor should be automatically broadcasted to the necessary shape:

```
criterion = nn.MSELoss()
criterion(torch.randn(10, 1, 1), torch.randn(10))
```
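As a quick sanity check (a minimal sketch, assuming a recent PyTorch build), the mismatched call does run and reduce to a scalar. Note that the `(10,)` target broadcasts against the `(10, 1, 1)` input:

```python
import torch
import torch.nn as nn

criterion = nn.MSELoss()

# The (10,) target broadcasts against the (10, 1, 1) input,
# so the elementwise difference has shape (10, 1, 10) before
# the default 'mean' reduction collapses it to a scalar.
loss = criterion(torch.randn(10, 1, 1), torch.randn(10))
print(loss.shape)  # torch.Size([])
```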

modeler
(Charles)
#3
Hm, it seems that I am getting different answers with different shapes vs. the same shape.

modeler
(Charles)
#4
Try this code:

```
import torch
import torch.nn as nn

torch.manual_seed(7)

a = torch.Tensor((1, 2, 3))
b = torch.Tensor((4, 5, 6))
print(a.shape)
print(b.shape)
criterion = nn.MSELoss()
print(criterion(a, b))

a = torch.Tensor((1, 2, 3))
a = a.reshape(3, 1)
b = torch.Tensor((4, 5, 6))
print(a.shape)
print(b.shape)
criterion = nn.MSELoss()
print(criterion(a, b))
```

The output is

```
torch.Size([3])
torch.Size([3])
tensor(9.)
torch.Size([3, 1])
torch.Size([3])
tensor(10.3333)
```

Yes, the results will differ, and I guess you would like to get the first result.

In your second example, `a` will be broadcasted if you call `(a - b)` to:

```
tensor([[-3., -4., -5.],
        [-2., -3., -4.],
        [-1., -2., -3.]])
```

i.e. `b` will be subtracted from each row of `a`.
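For the original question, a common fix (a sketch, not something stated upthread) is to flatten the prediction so both tensors share the same 1-D shape before calling the criterion:

```python
import torch
import torch.nn as nn

criterion = nn.MSELoss()

predicted_labels = torch.randn(1500, 1, 1)  # shape from the question
training_labels = torch.randn(1500)

# view(-1) flattens (1500, 1, 1) to (1500,), so no broadcasting occurs
# and each prediction is compared against its own label.
loss = criterion(predicted_labels.view(-1), training_labels)
```

Equivalently, you could reshape the target instead; the key point is that both tensors end up with identical shapes so the loss averages over exactly 1500 element pairs.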