I have an image of size 2000 x 2000 pixels and need the average over every 200 x 200 window. In MATLAB I get the result quickly, but in PyTorch the computation takes a very long time or hangs. How can I quickly compute the average of all 200 x 200 windows?
The approach I tried in PyTorch is:
import torch
import torch.nn as nn
I = torch.rand(2000, 2000)
pool = nn.AvgPool2d(kernel_size=200, stride=1, padding=100)
re = pool(I.unsqueeze(0))  # add a channel dimension: (1, 2000, 2000)
You could push the data and pooling layer to the GPU for a potential speedup.
Note, however, that you might fall back to the native im2col implementation for this particular use case if, e.g., cuDNN cannot find a fast kernel for this workload.
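As a minimal sketch of that suggestion (not a benchmarked solution), the image and the pooling layer can both be moved to the GPU when one is available. The helper name `windowed_mean` and the CPU fallback are assumptions added for illustration; the kernel size, stride, and padding mirror the original post.

```python
import torch
import torch.nn as nn


def windowed_mean(image, kernel_size=200, device=None):
    """Average every kernel_size x kernel_size window of a 2D image.

    `windowed_mean` is a hypothetical helper, not part of PyTorch.
    """
    if device is None:
        # Assumption: use the GPU if present, otherwise stay on the CPU.
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Same settings as the original post: stride 1, padding of half the kernel.
    pool = nn.AvgPool2d(kernel_size=kernel_size, stride=1,
                        padding=kernel_size // 2).to(device)

    with torch.no_grad():
        # unsqueeze(0) adds a channel dimension: (H, W) -> (1, H, W)
        out = pool(image.to(device).unsqueeze(0))

    # Drop the channel dimension again.
    return out.squeeze(0)
```

Calling `windowed_mean(torch.rand(2000, 2000))` would reproduce the original setup; whether it is actually faster depends on the hardware and on which kernel gets selected, so it is worth timing on your own machine.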
I am getting the error “Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the ‘spawn’ start method”. Could you suggest the CUDA code for the image and the layer?
You don’t need to use multiprocessing manually; you can simply set num_workers in your DataLoader to a value greater than 0.
This will use multiprocessing under the hood and each worker will load a complete batch in the background.
In that case, the error might be raised if you create CUDA tensors in your Dataset.
The vanilla use case would be to create CPU tensors in the dataset, process them, and push them to the GPU inside your DataLoader loop.
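The pattern described above could look roughly like the following sketch. `RandomImageDataset` and `iterate_batches` are hypothetical names, and the random images stand in for real data; the only point is that `__getitem__` returns CPU tensors and the transfer to the GPU happens in the loop over the DataLoader.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class RandomImageDataset(Dataset):
    """Hypothetical dataset that only ever creates CPU tensors."""

    def __init__(self, num_samples=8, size=64):
        self.num_samples = num_samples
        self.size = size

    def __len__(self):
        return self.num_samples

    def __getitem__(self, index):
        # Stand-in for loading and transforming a real image; stays on the CPU.
        image = torch.rand(3, self.size, self.size)
        target = index % 2
        return image, target


def iterate_batches(dataset, batch_size=4, num_workers=2):
    """Load batches with worker processes and move each batch to the device."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    loader = DataLoader(dataset, batch_size=batch_size, num_workers=num_workers)

    shapes = []
    for images, targets in loader:
        # The device transfer happens here, in the main process.
        images = images.to(device, non_blocking=True)
        targets = targets.to(device, non_blocking=True)
        shapes.append(tuple(images.shape))
    return shapes
```

Because no CUDA call is made inside the worker processes, the default start method can stay as it is.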
def __getitem__(self, index):
    target = self.lbls[index]
    # read image and convert to PIL image
    I = skimage.io.MultiImage(self.fnArr[index])[-1]
    I = TF.to_pil_image(I, mode='RGB')
    # apply transformations, including totensor()
    I = self.transform(I)
    # here I need to transfer I to GPU so that I can apply pooling
    I = I.cuda() # I am getting error here
    re = self.pool(I)
I need the pooling as a preprocessing step for the image I.
Don’t push the I tensor to the device in your __getitem__ method.
Since each worker in your DataLoader uses its own copy of the dataset, you will run into the multiprocessing issues mentioned before.
Remove I = I.cuda() from __getitem__ and move it to the training loop:
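A minimal sketch of that change, assuming `__getitem__` now returns the transformed CPU tensor and its label, and that `loader` yields `(images, targets)` batches. The generator name `pool_batches` is hypothetical, and the pooling settings are copied from the first post.

```python
import torch
import torch.nn as nn


def pool_batches(loader, kernel_size=200, device=None):
    """Move each batch to the device and apply average pooling there.

    `loader` is any iterable of (images, targets) batches, e.g. a DataLoader
    whose dataset returns CPU tensors.
    """
    if device is None:
        # Assumption: use the GPU if present, otherwise stay on the CPU.
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # AvgPool2d has no parameters; the input's device decides where it runs.
    pool = nn.AvgPool2d(kernel_size=kernel_size, stride=1,
                        padding=kernel_size // 2).to(device)

    for images, targets in loader:
        # This replaces the I = I.cuda() call that used to be in __getitem__.
        images = images.to(device)
        targets = targets.to(device)

        with torch.no_grad():
            pooled = pool(images)  # preprocessing step, batched on the device

        yield pooled, targets
```

In a training loop you would iterate over `pool_batches(loader)` and feed `pooled` to the model; pooling a whole batch at once on the device also avoids doing the expensive step once per sample in the workers.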