PyTorch crashing while trying to iterate through a loaded training set

noobdog_170 · October 13, 2017, 9:33am

I am using the code:

import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)

trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

dataiterator = iter(trainloader)

However, once the program reaches the line:
dataiterator = iter(trainloader)
it crashes and asks me twice whether I want to kill the Python program (as if I had pressed the close-window button). The program then hangs and cannot continue. The problem persists after changing the:
iter()
to a for loop:
for image, label in trainloader:
It always crashes the program. Additionally, every time it asks to close, it leaves a python3 process running in the background, and each one uses about 350MB. I am using Linux Mint 18.1 on a Lenovo ThinkPad X230. Has anybody come across this error before?

albanD · October 13, 2017, 9:48am

What do you mean by “the program crashes” exactly?

You may want to set num_workers=1; this may solve your issue.

noobdog_170 · October 13, 2017, 12:17pm

The error shown below keeps popping up whenever I run the program.

Setting num_workers=1 didn’t work.

noobdog_170 · October 13, 2017, 4:46pm

I tried it on another computer running Ubuntu 16.04, and the same issue occurs. It crashes when reaching the trainloader.

noobdog_170 · October 14, 2017, 1:53pm

After some trial and error, I found that removing the num_workers argument entirely fixes the issue. Does anybody know what the significance of removing it is? From looking online, it seems to be related to memory, but I have 16GB of RAM and Python doesn’t consume very much when running the program. It seems to be similar to this issue: https://github.com/pytorch/pytorch/issues/1355
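
For reference, a minimal sketch of what removing the argument does: when num_workers is omitted, DataLoader falls back to its default of 0, so batches are loaded in the main process and no worker subprocesses are started. The synthetic TensorDataset below is only an assumed stand-in for CIFAR10 so that the snippet runs without a download; the tensor shapes and variable names are illustrative, not from the original code.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Assumption: 16 random 3x32x32 "images" with 10 classes, standing in for CIFAR10.
images = torch.randn(16, 3, 32, 32)
labels = torch.randint(0, 10, (16,))
dataset = TensorDataset(images, labels)

# num_workers is omitted, so it defaults to 0:
# data is loaded in the main process, with no worker subprocesses.
loader = DataLoader(dataset, batch_size=4, shuffle=True)

# Same access pattern as in the original post.
batch_images, batch_labels = next(iter(loader))
print(loader.num_workers, batch_images.shape, batch_labels.shape)
```

If this runs but the same loader with num_workers=2 hangs, that points at the worker subprocesses rather than at the dataset or the transforms.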

being_saurabh · December 22, 2018, 3:40pm

In my case, this problem occurred with PyTorch v1.0 (the latest release), but downgrading to v0.4.0 solved it for me.
Thanks!

PorkPy · February 14, 2020, 11:31am

Thank you. This solved my similar issue where the execution would just hang when calling iter().
Changing ‘num_workers’ to 1 had no effect.
I’m using PyTorch 1.4.0 on Ubuntu 16.04.