Hello.
I’m training a regression task (output values between 0 and 100) where the inputs are images of plants, and I’m using `resnet18` from `torchvision`.
I noticed the GPU peaks at ~40% utilization but usually sits at 0%. So I suspected a bottleneck in the data loading/preprocessing steps and ran `python3 -m torch.utils.bottleneck src/train.py` to check.
Here are the results:
```
 ncalls   tottime  percall  cumtime  percall  filename:lineno(function)
 121039   570.056    0.005  570.056    0.005  {method 'decode' of 'ImagingDecoder' objects}
  26068   180.834    0.007  180.834    0.007  {method 'resize' of 'ImagingCore' objects}
1102194   104.741    0.000  104.741    0.000  {method 'read' of '_io.BufferedReader' objects}
  26068    14.408    0.001   14.408    0.001  {built-in method PIL._imaging.new}
  28838     7.336    0.000    7.336    0.000  {method 'to' of 'torch._C._TensorBase' objects}
   8160     5.703    0.001    5.703    0.001  {built-in method torch.conv2d}
  26071     4.836    0.000    4.836    0.000  {built-in method io.open}
  66454     4.492    0.000    4.492    0.000  {method 'item' of 'torch._C._TensorBase' objects}
    816     4.220    0.005    4.220    0.005  {built-in method torch.stack}
  26068     2.800    0.000    2.800    0.000  {method 'contiguous' of 'torch._C._TensorBase' objects}
  26068     2.656    0.000    2.656    0.000  {method 'close' of '_io.BufferedReader' objects}
    326     2.234    0.007    2.234    0.007  {method 'run_backward' of 'torch._C._EngineBase' objects}
  26068     2.161    0.000    2.161    0.000  {method 'div' of 'torch._C._TensorBase' objects}
  52136     1.247    0.000  590.483    0.011  /home/igorf/.conda/envs/my-env/lib/python3.8/site-packages/PIL/ImageFile.py:155(load)
  52136     0.999    0.000    4.623    0.000  /home/igorf/.conda/envs/my-env/lib/python3.8/site-packages/pandas/core/internals/managers.py:1027(fast_xs)
```
The `tottime`/`cumtime` columns show high values for `decode`, `resize`, and `read`, but I don’t know where this `decode` happens in my code. For `resize` I suppose it comes from torchvision's `Resize()`, and for `read` it must be from `PIL`.
I would like to understand, based on this output, how I can make my training run faster.
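My current guess about `decode`: `Image.open` is lazy and only parses the header, so the actual decoding runs the first time pixel data is accessed (e.g. inside `ToTensor`). A minimal, self-contained sketch of that behavior (the in-memory JPEG is just for illustration, not one of my dataset files):

```python
import io
from PIL import Image

# Build a small JPEG in memory so the sketch doesn't touch my dataset
buf = io.BytesIO()
Image.new("RGB", (64, 64)).save(buf, format="JPEG")
buf.seek(0)

img = Image.open(buf)  # lazy: only the header is parsed here
img.load()             # pixel data is decoded here (ImagingDecoder.decode)
print(img.size)        # -> (64, 64)
```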
My dataset class is as follows:
```python
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms


class CGIARDataset(Dataset):
    def __init__(self, df, transform=None):
        self.df = df
        self.transform = transform

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        y = self.df.iloc[idx]['extent']
        img = Image.open(self.df.iloc[idx]['filename'])
        x = transforms.ToTensor()(img)
        if self.transform is not None:
            x = self.transform(x)
        return x, y
```
As you can see, I’m reading the image with `PIL` and converting it to a tensor with torchvision's `ToTensor`.
The resize step is in my transform object:
```python
transform = transforms.Compose([
    transforms.Resize(IMG_SIZE, antialias=True)
])
```
Could anyone give me some tips? For example, should I switch from `PIL` to another library for reading images? And where does the `decode` come from?
Thanks!