I recently moved some code from `pytorch 0.4.1` to `pytorch 1.0.0` and found something odd with `torch.nn.MSELoss()`.

For example, I have two tensors named `a` and `b`, and I process them under both `pytorch 0.4.1` and `pytorch 1.0.0`. Here is the code. Notice the difference when I use `reduction="elementwise_mean"`.

```
# code under pytorch 0.4.1
>>> a.shape
torch.Size([48, 3, 128, 128])
>>> b.shape
torch.Size([48, 3, 128, 128])
>>> loss1 = nn.MSELoss(reduction="sum")
>>> loss1(a,b)
tensor(1416916.3750, device='cuda:0', grad_fn=<SumBackward0>)
>>> loss2 = nn.MSELoss(reduction="elementwise_mean")
>>> loss2(a,b)
tensor(1416916.3750, device='cuda:0', grad_fn=<SumBackward0>)
```

```
# code under pytorch 1.0.0
>>> a.shape
torch.Size([48, 3, 128, 128])
>>> b.shape
torch.Size([48, 3, 128, 128])
>>> loss1 = nn.MSELoss(reduction="sum")
>>> loss1(a,b)
tensor(1416916.3750, device='cuda:0', grad_fn=<SumBackward0>)
>>> loss2 = nn.MSELoss(reduction="mean")
>>> loss2(a,b)
tensor(0.6006, device='cuda:0', grad_fn=<MeanBackward1>)
```
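As a sanity check (plain arithmetic, not the original CUDA tensors): with a mean-style reduction the loss should equal the `reduction="sum"` result divided by the number of elements, which is what `pytorch 1.0.0` reports above, while `pytorch 0.4.1` with `reduction="elementwise_mean"` returned the unreduced sum.

```python
# Number of elements in a tensor of shape [48, 3, 128, 128]
numel = 48 * 3 * 128 * 128  # 2,359,296

sum_loss = 1416916.3750        # value reported with reduction="sum"
mean_loss = sum_loss / numel   # expected value for a mean reduction
print(round(mean_loss, 4))     # 0.6006, matching the pytorch 1.0.0 output
```

So the `0.6006` from `reduction="mean"` is consistent with the `1416916.3750` from `reduction="sum"`; it is the `elementwise_mean` result under `0.4.1` that does not divide by the element count.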

Is it really a bug in `pytorch 0.4.1`?