How long does it take to train the network in the CIFAR-10 tutorial?

benlundell · April 17, 2017, 5:17pm

Hello- I’m working through the CIFAR-10 tutorial and I am wondering how long it usually takes to run the training. In CPU mode, it is consistently taking my machine 1h10m to complete the two epochs in the example. Are other people seeing similar times?

smth · April 17, 2017, 5:22pm

1h10m, oh my that’s a lot. It takes me about a minute.

What’s your OS and version?
How did you install pytorch?
Are you running this in a virtual machine, or on bare metal?

I hope that with this information, I can reproduce it.

benlundell · April 17, 2017, 5:27pm

Thanks for the speedy reply! Here’s my info:

OS: Ubuntu 14.04.1
Python: 3.6.1
Install: Pulled the repo and built it using "python setup.py install"
VM: No, but I am doing this through Xfce from a Windows box, if that matters.

smth · April 17, 2017, 5:47pm

Your install method is the reason why it’s so slow.
You have two options:

1. Install the binary from pytorch.org (we provide wheel files and conda binaries).
2. Install a proper BLAS library on your machine (MKL or OpenBLAS; MKL is available via Conda) and add it to the environment variable CMAKE_PREFIX_PATH. Otherwise pytorch will use a super-unoptimized BLAS, and that’s the reason for the slowness. You can read one way of doing this here: https://github.com/pytorch/pytorch#from-source

benlundell · April 17, 2017, 8:27pm

I tried both options, and neither seems to make a difference. However, I found an issue with torchvision when installing via conda. Where should I report that?

smth · April 17, 2017, 8:55pm

Report torchvision issues at https://github.com/pytorch/vision

benlundell · April 17, 2017, 9:06pm

Thanks. I’ve opened an issue here: https://github.com/pytorch/vision/issues/151

smth · April 17, 2017, 9:08pm

Another reason for your slowdown might be a large number of CPU cores.
Try setting these environment variables:

export OMP_NUM_THREADS=4
export MKL_NUM_THREADS=4

See if that fixes it?
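
If you would rather set the same limits from Python than from the shell, here is a minimal sketch using only the standard library. The helper names thread_limited_env and run_with_thread_limit are made up for illustration and are not part of PyTorch, and the child command is a stand-in that only echoes the variable it inherited; swap in your own training command. Both variables cap how many threads OpenMP and MKL will start, and they are generally read when those libraries initialize, so the sketch puts them in place before the child process launches.

```python
import os
import subprocess
import sys


def thread_limited_env(num_threads, base_env=None):
    """Return a copy of the environment with both thread caps set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(num_threads)
    env["MKL_NUM_THREADS"] = str(num_threads)
    return env


def run_with_thread_limit(command, num_threads=4):
    """Run `command` in a child process that sees the capped thread counts."""
    return subprocess.run(
        command,
        env=thread_limited_env(num_threads),
        capture_output=True,
        text=True,
        check=True,
    )


# Stand-in child process that only echoes the variable it inherited.
# Replace this command with your own training command.
result = run_with_thread_limit(
    [sys.executable, "-c", "import os; print(os.environ['OMP_NUM_THREADS'])"]
)
print(result.stdout.strip())
```

The value 4 is just the number suggested above; it may be worth trying a few values on your own machine.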

benlundell · April 17, 2017, 9:28pm

That did it for me. The training just completed in less than 2 minutes. Thanks for all the help!

Brando_Miranda · March 1, 2018, 4:40am

Why did that fix it? Just curious.

RylanSchaeffer · December 13, 2021, 5:40am

What effect does setting those two environment variables have on the speed of training?

export OMP_NUM_THREADS=4
export MKL_NUM_THREADS=4