New to PyTorch here; I'm trying to figure out `BCELoss`.

Let's say I have a batch size of `6`, and I get a single 1-or-0 value as the output of my network per sample (it's actually a sigmoid output; I'm using hard 1s and 0s for simplicity).
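Concretely, the network I have in mind looks something like this (the layer sizes are made up for illustration; only the final sigmoid matters here):

```
import torch
import torch.nn as nn

# Hypothetical network: 10 input features -> 1 sigmoid output per sample.
net = nn.Sequential(nn.Linear(10, 1), nn.Sigmoid())

batch = torch.randn(6, 10)  # batch size 6
out = net(batch)            # shape [6, 1], each value in (0, 1)
print(out.shape)
```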

If I run:

```
import torch
import torch.nn as nn

criterion = nn.BCELoss()
prediction = torch.tensor([[1.], [0.], [0.], [1.], [0.], [0.]])
label = torch.tensor([[1.], [0.], [0.], [1.], [0.], [0.]])
loss = criterion(input=prediction, target=label)
print(loss.item())
```

```
0.0
```

I get a loss value of `0`, which is perfectly fine; the same happens if the batch size is 5 or some other value.

Now let's consider a case where only one prediction in the batch is off.

For 6 elements:

```
criterion = nn.BCELoss()
prediction = torch.tensor([[1.], [0.], [0.], [1.], [0.], [0.]])
label = torch.tensor([[0.], [0.], [0.], [1.], [0.], [0.]])
loss = criterion(input=prediction, target=label)
print(loss.item())
```

```
4.605170249938965
```

For 5 elements:

```
criterion = nn.BCELoss()
prediction = torch.tensor([[1.], [0.], [0.], [1.], [0.]])
label = torch.tensor([[0.], [0.], [0.], [1.], [0.]])
loss = criterion(input=prediction, target=label)
print(loss.item())
```

```
5.5262041091918945
```
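One thing I noticed while sanity-checking in plain Python: the two loss values above imply the same total, just divided by different batch sizes (`BCELoss` defaults to `reduction='mean'`, per the docs):

```
# Both reported losses correspond to the same summed per-element loss,
# divided by the batch size.
loss_6 = 4.605170249938965   # mean loss over 6 elements
loss_5 = 5.5262041091918945  # mean loss over 5 elements

print(loss_6 * 6)  # total over the batch of 6
print(loss_5 * 5)  # total over the batch of 5 -- nearly identical
```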

What am I doing wrong here to get different loss values for different batch sizes?

Also, how would one use weights to handle class imbalance here?

Let's say for every `n` negative samples I have `p` positive samples.
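For what it's worth, from reading the docs it looks like `BCEWithLogitsLoss` takes a `pos_weight` argument (note it expects raw logits rather than sigmoid outputs), so I'd guess something like the following, with made-up values `n = 4`, `p = 1`. Is this the right approach?

```
import torch
import torch.nn as nn

# Hypothetical imbalance: n = 4 negatives for every p = 1 positive.
n, p = 4.0, 1.0

# pos_weight scales the positive-class term of the loss,
# so n / p up-weights the rarer positive samples.
criterion = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([n / p]))

logits = torch.tensor([[2.0], [-1.0], [-1.0], [-1.0], [-1.0]])  # raw scores, no sigmoid
labels = torch.tensor([[1.0], [0.0], [0.0], [0.0], [0.0]])
loss = criterion(input=logits, target=labels)
print(loss.item())
```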

Thanks.