num_workers and CUDA speed confusion

cpiscos · June 28, 2018, 9:54pm

Hi, I’m a relatively new programmer trying to get a feel for PyTorch using the iris dataset, but I’ve run into an issue. I ran a few tests of training speed, varying CPU/GPU and num_workers, and got some surprising results.

CUDA, num_workers=0: 2.430s
CUDA, num_workers=1: 18.741s
CPU, num_workers=0: 1.619s
CPU, num_workers=1: 11.038s

As you can see, the fastest run is on the CPU with no worker subprocesses, which I didn’t expect. Can someone explain why this happens, or whether my implementation is wrong?
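To put numbers on the gap, the timings above can be compared as ratios. This is only a small sketch: the dictionary layout and the names `timings`, `slowdown`, `cuda_slowdown`, `cpu_slowdown` and `gpu_vs_cpu` are my own labels for illustration, not part of the script below.

```python
# Timings in seconds, copied from the four runs listed above.
# Keys are (device, num_workers); this layout is just an illustrative choice.
timings = {
    ("cuda", 0): 2.430,
    ("cuda", 1): 18.741,
    ("cpu", 0): 1.619,
    ("cpu", 1): 11.038,
}


def slowdown(device):
    """Return how many times slower num_workers=1 is than num_workers=0."""
    return timings[(device, 1)] / timings[(device, 0)]


cuda_slowdown = slowdown("cuda")
cpu_slowdown = slowdown("cpu")
# Ratio of the GPU run to the CPU run, both with num_workers=0.
gpu_vs_cpu = timings[("cuda", 0)] / timings[("cpu", 0)]

print(f"CUDA: num_workers=1 is {cuda_slowdown:.1f}x slower than num_workers=0")
print(f"CPU:  num_workers=1 is {cpu_slowdown:.1f}x slower than num_workers=0")
print(f"num_workers=0: CUDA takes {gpu_vs_cpu:.1f}x as long as CPU")
```

So adding one worker costs roughly 7.7x on CUDA and 6.8x on CPU, and with num_workers=0 the GPU run takes about 1.5x as long as the CPU run.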

Here’s my code/model:

github.com

cpiscos/pytorch_projects/blob/master/iris.py

import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader
import numpy as np
import matplotlib.pyplot as plt
import time

device = torch.device('cuda')


class IrisData(Dataset):
    def __init__(self):
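        # Load the raw CSV: columns 0-3 are the features, column 4 is the class label
        xy = pd.read_csv('data/iris.data', header=None)
        xy[4] = xy[4].astype('category')
        self.x = xy.iloc[:, :4].values
        # Standardize each feature column to zero mean and unit variance
        self.x_mu = np.mean(self.x, axis=0)
        self.x_std = np.std(self.x, axis=0)
        self.x = torch.from_numpy((self.x - self.x_mu) / self.x_std)
        # Encode the class labels as int64 category codes
        self.y = torch.from_numpy(np.array(xy[4].cat.codes, dtype='int64'))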

This file has been truncated.