Unable to sum the result of an equality test

I am trying to count the element-wise equal entries of two tensors. I have narrowed my issue down to the following short example. The last line produces an “Illegal instruction” message and crashes Python.

import torch

torch.manual_seed(1)
x = torch.randint(0, 5, (1000, ))

x.eq(x).sum()

I am using Python 3.6.4 in IPython 6.5.0 with torch 0.4.1 on Windows 10.

I can’t repro this on Linux but I will open an issue on GitHub for you: https://github.com/pytorch/pytorch/issues/10483

This sounds like the CPU capability dispatch code might not be working properly on Windows. Do you know what model CPU you have?

Thanks all.

I am using a VMware virtual machine. According to the system information inside the VM, the CPU is an Intel Xeon E5-2680.

I’m curious whether the following works (in a new IPython process):

import os
# Set this before importing torch so it is already in place when the library loads.
os.environ['ATEN_DISABLE_AVX2'] = '1'

import torch

torch.manual_seed(1)
x = torch.randint(0, 5, (1000, ))

x.eq(x).sum()

Still got the same error.

By the way, it doesn’t seem to matter what x is. I used a random number generator, but you could replace x with x = torch.ones(5) or x = torch.ones(5, 5) and still get the same error.
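
Since the crash takes the whole interpreter down, one way to try each variant is to run it in a throwaway child process and look at the exit code. This is only a sketch: `run_isolated` and `REPRO` are hypothetical names made up for this example, and it assumes `torch` is importable by the same Python executable.

```python
import os
import subprocess
import sys


def run_isolated(code, extra_env=None):
    """Run `code` in a fresh Python process and return its exit code."""
    env = dict(os.environ)
    env.update(extra_env or {})  # e.g. {"ATEN_DISABLE_AVX2": "1"}
    proc = subprocess.run([sys.executable, "-c", code], env=env)
    return proc.returncode


# The failing snippet from this thread, written as a one-liner.
REPRO = "import torch; x = torch.ones(5); print(x.eq(x).sum())"
```

Calling `run_isolated(REPRO)` and then `run_isolated(REPRO, {"ATEN_DISABLE_AVX2": "1"})` lets you compare the two cases without losing your session. A nonzero return value means the child did not exit cleanly.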

I think the issue is that the sum() call is running a kernel that uses AVX2 instructions, but the CPU doesn’t support AVX2 instructions (only AVX).

There are two likely causes:

  1. The CPU capability detection code isn’t working on Windows (or maybe in the VM?) and incorrectly thinks the CPU supports AVX2 instructions.
  2. The library linking behaves differently on Windows and causes the AVX2 kernel to be run when the AVX kernel is called.
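
To help tell these two causes apart, it is worth checking which instruction sets the OS (or hypervisor) actually reports. Below is a rough sketch that reads the feature flags from `/proc/cpuinfo`, so it only works on a Linux machine or guest; on Windows you would need a different source, not shown here. The helper names are made up for this example, and this is not how PyTorch does its own detection.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # The first "flags" entry is enough for this check.
            return set(value.split())
    return set()


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the flags from `path`; return an empty set if it is unreadable."""
    try:
        with open(path) as f:
            return parse_cpu_flags(f.read())
    except OSError:
        return set()


if __name__ == "__main__":
    flags = read_cpu_flags()
    print("AVX: ", "avx" in flags)
    print("AVX2:", "avx2" in flags)
```

If `avx` is reported but `avx2` is not, an “Illegal instruction” in `sum()` would fit the theory that an AVX2 kernel is being selected when it shouldn’t be.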

I get the same crash in caffe2.dll when calling tensor.sum().

My environment is Windows 7 + Python 3.6 + PyTorch 0.4.1.
The CPU is an Intel Pentium, which does not support the AVX or AVX2 instruction sets.

We compile caffe2.dll with the AVX and AVX2 instruction sets enabled. So if your CPU doesn’t support them, you may have to build it yourself.