PIL GaussianBlur vs tfv.gaussian_blur

Hi, I want to use torchvision’s gaussian_blur instead of PIL’s GaussianBlur. In PIL you only pass a single sigma (radius) input; how can I translate that sigma into the kernel_size and sigma arguments of the torchvision version? Also, are the paddings the same?

It seems like an easy question, but so far I couldn’t figure out the exact parameters even with visualization. (By the way, I only care about sigma values between 0.1 and 2.0, as used in self-supervised data augmentation.)
I also found this question on StackOverflow.

The reason I’m asking is that I used PIL to train a MoCo v2 on ImageNet for 35 epochs, and it’s performing almost 2% (absolute) better than the one trained with torchvision.

And yes, I’m using different blurs for different images of the minibatch, so the only difference is the blur function.
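
To make it concrete, here is roughly what I’m comparing (just a sketch; passing my sigma as PIL’s radius is how I currently do it, and the kernel_size=23 on the torchvision side is only a placeholder, since that mapping is exactly what I’m unsure about):

import random
import torch
import PIL.ImageFilter
import torchvision.transforms.functional as TF

img = torch.rand(3, 224, 224)      # stand-in for an ImageNet crop
img_pil = TF.to_pil_image(img)

sigma = random.uniform(0.1, 2.0)   # the sigma range I care about

# PIL: a single parameter (radius), which I feed my sigma to
out_pil = img_pil.filter(PIL.ImageFilter.GaussianBlur(radius=sigma))

# torchvision: needs kernel_size and sigma -- what kernel_size matches PIL?
out_tv = TF.gaussian_blur(img, kernel_size=23, sigma=sigma)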

The question is really: what is the (effective) kernel size of PIL’s GaussianBlur?
For any given radius, you can check experimentally:

import torch
import torchvision
import PIL.ImageFilter

# a single bright pixel in the middle of a black image
img = torch.zeros(3, 20, 20)
img[:, 10, 10] = 1

img_pil = torchvision.transforms.functional.to_pil_image(img)

# blur with PIL and count how many rows end up with nonzero values,
# i.e. the effective vertical extent of the kernel
img_blur_pil = img_pil.filter(PIL.ImageFilter.GaussianBlur(radius=2))
img_blur_pil_t = torchvision.transforms.functional.to_tensor(img_blur_pil)
print((img_blur_pil_t[0] > 0).any(dim=1).sum())

# blur with torchvision (kernel_size=11, sigma=2) and compare the extent
img_blur = torchvision.transforms.functional.gaussian_blur(img, 11, 2)
print((img_blur[0] > 0).any(dim=1).sum())

In general, this is a complex topic; the PIL documentation references a 12-page article (Gwosdek et al.: Theoretical Foundations of Gaussian Convolution by Extended Box Filtering) regarding the accuracy of their implementation.
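
If you just need something that behaves similarly in practice, one common heuristic (my assumption, not PIL’s actual algorithm) is to pick an odd kernel size that covers roughly three sigma on each side and pass your sigma through unchanged:

import math
import torch
import torchvision.transforms.functional as TF

def tv_blur_for_sigma(img, sigma):
    # odd kernel covering about +/- 3 sigma; beyond that the Gaussian
    # weights are negligible (a heuristic, not PIL's exact rule)
    kernel_size = 2 * math.ceil(3 * sigma) + 1
    return TF.gaussian_blur(img, kernel_size, sigma)

img = torch.rand(3, 20, 20)
out = tv_blur_for_sigma(img, sigma=2.0)   # kernel_size works out to 13 here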

Best regards

Thomas
