Is it known that if you do `torch.sum(y_hat == y)` and the sum is larger than 255, the result wraps around (you get the true sum minus 256)? I am seeing this behavior with the conda version of PyTorch. If you do `torch.sum((y_hat == y).float())`, it is fine.

What is your version of PyTorch?

```
print(torch.__version__)
```

I tried it in PyTorch 0.1.12_2 and found both are fine.

```
>>> x = torch.autograd.Variable(torch.LongTensor([1]*300))
>>> y = torch.autograd.Variable(torch.LongTensor([1]*300))
>>> torch.sum(x == y)
Variable containing:
 44
[torch.ByteTensor of size 1]

>>> print(torch.__version__)
0.1.12_2
>>> torch.sum((x == y).int())
Variable containing:
 300
[torch.IntTensor of size 1]
```
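
The value 44 in the output above is what you would expect if the sum is accumulated in an unsigned 8-bit integer, since 300 - 256 = 44. The following is a plain-Python sketch of that arithmetic only; `sum_uint8` is a hypothetical helper written for illustration and is not how PyTorch implements `torch.sum`.

```python
def sum_uint8(values):
    """Add values in an accumulator that keeps only the low 8 bits."""
    total = 0
    for v in values:
        # Keep only the low 8 bits, i.e. wrap modulo 256 like an unsigned byte.
        total = (total + v) & 0xFF
    return total


matches = [1] * 300  # 300 positions where the comparison is true

wrapped = sum_uint8(matches)  # 8-bit accumulator wraps around
exact = sum(matches)          # Python ints do not overflow

print(wrapped, exact)
```

Converting the mask to a wider type before summing, as with `.int()` above, avoids the wraparound because the accumulator can then hold the full count.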

As you observed, the comparison operators return a `ByteTensor`. I would even recommend using `.long()` to convert to a `LongTensor`. That keeps you safe from overflow in later calculations, but more importantly it is the type PyTorch most often uses where integers are expected (e.g. indexing, `.gather`, …).
If you need to combine the result with other tensors, `.float()` is often needed (e.g. for batch averaging with a mask), as you generally can't use operators with different tensor types.
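
As a rough sketch of the two conversions suggested above, assuming PyTorch 0.4 or newer (where `Variable` has been merged into `Tensor` and `.item()` is available); the tensors `x` and `y` are made-up example data:

```python
import torch

# Example data (an assumption for illustration): 300 matching labels.
x = torch.ones(300, dtype=torch.long)
y = torch.ones(300, dtype=torch.long)

mask = x == y  # elementwise comparison result

# Convert to a 64-bit integer tensor before summing, so the count cannot wrap at 255.
count = mask.long().sum()

# Convert to float when the result feeds into floating-point math such as averaging.
accuracy = mask.float().mean()

# .item() extracts a Python number from a one-element tensor.
n_correct = count.item()
print(n_correct, accuracy.item())
```

The explicit `.long()` is mainly a safeguard here: it makes the accumulator type obvious regardless of what type the comparison returns in a given PyTorch version.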

The question is whether the `Variable` wrapping is causing the different behavior. For example, without it:

```
>>> x = torch.LongTensor([1]*300)
>>> y = torch.LongTensor([1]*300)
>>> x = x.cuda()
>>> y = y.cuda()
>>> torch.sum(x == y)
300
```

I am not entirely sure whether they should behave the same with and without the `Variable` wrapper, but I just wanted to put it out there.

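The difference is actually whether the result becomes a Python int or a tensor again.
With `(x==y).sum(1)` you get the overflow with plain tensors too, because the result is again a `ByteTensor`. `Variable`s are never converted to Python numbers (because that would lose autograd), so their sum stays a `ByteTensor` and wraps around.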

