"Illegal instruction (core dumped) in first PyTorch tutorial

I am brand-new to PyTorch and trying to get up to speed. I installed PyTorch on a 64-bit virtual machine running CentOS 7.5. The host processor is a 4-core AMD Phenom II X4 945.
Python 3.6.8
Gcc 4.8.5 20150603
No GPU
Installation command (from the website):
pip3 install torch==1.3.1+cpu torchvision==0.4.2+cpu -f https://download.pytorch.org/whl/torch_stable.html

In the third section of the “60 minute blitz tutorial”, following all the instructions, I entered:

out.backward(torch.randn(1, 10))

And got a core dump.

out.backward(torch.randn(1,10))
Illegal instruction (core dumped)

Having the tutorial crash looks really bad ;^) Also interfering with my goal to learn PyTorch!

I’d appreciate help with this. Thanks.

That’s indeed a bad first experience :confused:

Could you run the code in a terminal and try to get the stack trace via:

$ gdb --args python my_script.py
...
Reading symbols from python...done.
(gdb) run
...
(gdb) backtrace
...
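
If gdb is not available, a rough alternative is to launch the script from a small Python wrapper and look at how the child process ended. This is only a sketch under some assumptions: it targets a POSIX system, `run_and_report` is a name made up for this example, and the script path is whatever you pass on the command line. Running the child with `-X faulthandler` asks Python to dump the Python-level traceback when a fatal signal such as SIGILL arrives; it does not replace the native backtrace from gdb.

```python
import signal
import subprocess
import sys


def run_and_report(argv):
    """Run a command and describe how it ended (hypothetical helper)."""
    proc = subprocess.run(argv)
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        signum = -proc.returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with code {proc.returncode}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python wrapper.py my_script.py
    # -X faulthandler makes the child print its Python traceback on a fatal signal.
    print(run_and_report([sys.executable, "-X", "faulthandler", *sys.argv[1:]]))
```

A result of `killed by SIGILL` confirms the crash is an illegal instruction rather than, say, a segmentation fault.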

In fact I was running the tutorial in a terminal on my local VM.

Here’s the beginning of the gdb backtrace. Let me know if you want the whole thing.

(I apologize in advance if this should be in a code-block. I don’t see how to do this.)

Thanks!

[New Thread 0x7fffd736d700 (LWP 14775)]
[New Thread 0x7fffd6b6c700 (LWP 14776)]

Program received signal SIGILL, Illegal instruction.
[Switching to Thread 0x7fffd736d700 (LWP 14775)]
0x00007fffe2ceffc1 in void mkldnn::impl::cpu::(anonymous namespace)::kernel_mxn<float, true, false>(int, float const*, long, float const*, long, float*, long, float, float) ()
from /home/goldin/.pyVirtual/lib/python3.6/site-packages/torch/lib/libtorch.so
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 glibc-2.17-222.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-18.el7.x86_64 libcom_err-1.42.9-11.el7.x86_64 libffi-3.0.13-18.el7.x86_64 libgcc-4.8.5-28.el7.x86_64 libselinux-2.5-12.el7.x86_64 libstdc++-4.8.5-28.el7.x86_64 openssl-libs-1.0.2k-12.el7.x86_64 pcre-8.32-17.el7.x86_64 python3-libs-3.6.8-10.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) backtrace
#0 0x00007fffe2ceffc1 in void mkldnn::impl::cpu::(anonymous namespace)::kernel_mxn<float, true, false>(int, float const*, long, float const*, long, float*, long, float, float) ()
from /home/goldin/.pyVirtual/lib/python3.6/site-packages/torch/lib/libtorch.so
#1 0x00007fffe2cf02d9 in void mkldnn::impl::cpu::(anonymous namespace)::block_ker<float, true, false>(int, int, int, float const*, long, float const*, long, float*, long, float, float, float*, bool) () from /home/goldin/.pyVirtual/lib/python3.6/site-packages/torch/lib/libtorch.so
#2 0x00007fffe2cf4287 in mkldnn_status_t mkldnn::impl::cpu::ref_gemm(char const*, char const*, int const*, int const*, int const*, float const*, float const*, int const*, float const*, int const*, float const*, float*, int const*, float const*)::{lambda(int)#2}::operator()(int) const ()
from /home/goldin/.pyVirtual/lib/python3.6/site-packages/torch/lib/libtorch.so

Thanks for the stack trace!
Based on the stack trace, it seems mkldnn hit an illegal instruction.
If I recall correctly, I’ve seen this issue on older CPUs that didn’t support SSE2.
However, after quickly googling the specs of your CPU, it seems SSE is supported.
I’m not that experienced with AMD CPUs, so let’s wait for other opinions on this error and on whether it’s in fact related to the chip.
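
Rather than relying on spec sheets, the instruction-set extensions the (virtual) CPU actually reports can be read from `/proc/cpuinfo` on Linux. The sketch below is only illustrative: the helper names and the list of flags to check are my own choices, and nothing in this thread establishes which extension the mkldnn kernel needs.

```python
from pathlib import Path

# Illustrative selection of extensions to look for; not a list of known requirements.
FLAGS_OF_INTEREST = ("sse2", "ssse3", "sse4_1", "sse4_2", "avx", "avx2")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def report(flags):
    """Map each flag of interest to whether the CPU reports it."""
    return {name: name in flags for name in FLAGS_OF_INTEREST}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        for name, present in report(parse_cpu_flags(cpuinfo.read_text())).items():
            print(f"{name:8s} {'yes' if present else 'MISSING'}")
```

Inside a VM this shows what the hypervisor exposes to the guest, which may differ from what the physical host supports.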

PS: you can add code snippets by wrapping them into three backticks ``` :wink:

Thank you Patrick, still hoping for some clarity.

An update:
– Created an Ubuntu-based VM on the same physical host (AMD-based) and installed PyTorch via Anaconda --> still crashes
– Moved this VM to a different host machine with a four core i3 Intel CPU --> problem goes away.

This suggests the problem is related to the underlying hardware. Not a pretty conclusion. Anyway, thanks for your help.
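
One way to narrow down a hardware difference like this is to copy the `flags` line from `/proc/cpuinfo` on each host and compare them. A minimal sketch, assuming you paste the two lines in yourself; the function name and the sample flag strings below are made up for illustration and are not the real output of either machine:

```python
def flags_only_on_working(working_line, failing_line):
    """Return flags reported by the working host but not by the failing one."""

    def tokens(line):
        # Accept either a bare list of flags or a full "flags : ..." line.
        _, sep, rest = line.partition(":")
        return set((rest if sep else line).split())

    return sorted(tokens(working_line) - tokens(failing_line))


# Hypothetical inputs, only to show the call shape.
working = "flags : fpu sse sse2 ssse3 sse4_1 avx"
failing = "flags : fpu sse sse2 sse4a"
print(flags_only_on_working(working, failing))
```

Any flag in the result is a candidate for the instruction the failing CPU lacks, though the comparison alone does not prove which one the crashing kernel uses.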

I have the same issue but when I typed backtrace it gave me:
no stack

Illegal instruction (core dumped)

This is the last PyTorch version I have tested that works without AVX; my CPU has no AVX support.

pip3 install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html