How can I dilate a mask (make it bigger by n pixels)? I am currently using `Tensor.unfold`, but it runs out of memory easily

The following is my method using unfold.

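    boundary_width = 9
    # Sliding windows over H (dim 2) and W (dim 3): (N, C, H-k+1, W-k+1, k, k)
    mask = mask.unfold(2, boundary_width, 1).unfold(3, boundary_width, 1)
    # Flatten each k x k window into a single trailing dimension
    mask = mask.contiguous().view(*mask.size()[:-2], -1)
    # An output pixel is set if any pixel in its window is set
    mask = (mask.max(4)[0] > 0).float()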

However, this runs out of memory very quickly when `boundary_width` is large (say, 49 pixels). Is there another way to do this, or a way to reduce the memory usage of `unfold`?
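A rough estimate of where the memory goes, as far as I understand it: `unfold` itself only returns a view, but `.contiguous()` copies every window, so the tensor grows by a factor of about `boundary_width ** 2`. The sketch below just does that arithmetic. The `(1, 1, 512, 512)` shape, the float32 element size, and the helper name `unfolded_bytes` are assumptions for illustration, not values from my actual setup.

```python
def unfolded_bytes(n, c, h, w, k, bytes_per_element=4):
    """Bytes needed once both unfolds are made contiguous (stride 1, no padding)."""
    out_h = h - k + 1  # number of window positions along H
    out_w = w - k + 1  # number of window positions along W
    return n * c * out_h * out_w * k * k * bytes_per_element


# Hypothetical float32 mask of shape (N, C, H, W) = (1, 1, 512, 512)
shape = (1, 1, 512, 512)
for k in (9, 49):
    mib = unfolded_bytes(*shape, k) / 2**20
    print(f"boundary_width={k}: about {mib:.0f} MiB for the unfolded copy")
```

Under these assumptions the input mask is only 1 MiB, so nearly all of the memory is the window copies rather than the mask itself.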

I am hoping to be able to use it with ~49 pixels.
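One direction that might work, sketched under assumptions rather than tested on my real data: for a binary mask and a square window, taking the maximum over each window is what `torch.nn.functional.max_pool2d` with `stride=1` computes, and it does not materialize the windows. The function name `dilate_mask` is hypothetical, and the sketch assumes an `(N, C, H, W)` mask with non-negative values and an odd `boundary_width`. Unlike the unfold version above, it pads so that the output keeps the input's height and width; passing `padding=0` instead should reproduce the shrunken output of the unfold version.

```python
import torch
import torch.nn.functional as F


def dilate_mask(mask: torch.Tensor, boundary_width: int) -> torch.Tensor:
    """Dilate a binary (N, C, H, W) mask with a square boundary_width window."""
    if boundary_width % 2 != 1:
        raise ValueError("this sketch assumes an odd boundary_width")
    pad = boundary_width // 2
    # stride=1 with padding=k//2 keeps H and W unchanged; max pooling
    # ignores the padded border when taking the maximum.
    pooled = F.max_pool2d(
        mask.float(), kernel_size=boundary_width, stride=1, padding=pad
    )
    return (pooled > 0).float()


# Usage: a single set pixel grows into a 3 x 3 block.
mask = torch.zeros(1, 1, 9, 9)
mask[0, 0, 4, 4] = 1.0
dilated = dilate_mask(mask, boundary_width=3)
print(dilated[0, 0])
```

With `boundary_width = 49`, each side of the mask grows by 24 pixels, and the only large tensors are the input and the output.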