(more) Dilation support for MaxPool / AvgPool?

Currently, AvgPool1d doesn't support dilation at all. MaxPool1d does support dilation, but it is limited by a padding check that ignores the dilation:

In [6]: MaxPool1d(2,stride=1,padding=2,dilation=2)(torch.rand(2,4,8))
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[6], line 1
----> 1 MaxPool1d(2,stride=1,padding=2,dilation=2)(torch.rand(2,4,8))

File ~/p10/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/p10/lib/python3.10/site-packages/torch/nn/modules/pooling.py:92, in MaxPool1d.forward(self, input)
     91 def forward(self, input: Tensor):
---> 92     return F.max_pool1d(input, self.kernel_size, self.stride,
     93                         self.padding, self.dilation, ceil_mode=self.ceil_mode,
     94                         return_indices=self.return_indices)

File ~/p10/lib/python3.10/site-packages/torch/_jit_internal.py:484, in boolean_dispatch.<locals>.fn(*args, **kwargs)
    482     return if_true(*args, **kwargs)
    483 else:
--> 484     return if_false(*args, **kwargs)

File ~/p10/lib/python3.10/site-packages/torch/nn/functional.py:696, in _max_pool1d(input, kernel_size, stride, padding, dilation, ceil_mode, return_indices)
    694 if stride is None:
    695     stride = torch.jit.annotate(List[int], [])
--> 696 return torch.max_pool1d(input, kernel_size, stride, padding, dilation, ceil_mode)

RuntimeError: max_pool1d() padding should be at most half of kernel size, but got padding=2 and kernel_size=2

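I think it would make sense to relax this check when dilation > 1: dilation widens the span the kernel covers to dilation * (kernel_size - 1) + 1, which "eats" more of the output length, so more padding should be allowed as compensation.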
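For anyone hitting the same error, here is a rough workaround sketch: pad manually with -inf using F.pad, then call max_pool1d with padding=0 so the padding check never triggers. The helper name dilated_max_pool1d is made up for this post and is not part of torch. As far as I understand, max_pool1d's own implicit padding is also -inf, so the result should match what the call above would return if the check were relaxed.

```python
import torch
import torch.nn.functional as F

def dilated_max_pool1d(x, kernel_size, stride=1, padding=0, dilation=1):
    # Hypothetical helper (not a torch API): pad both sides with -inf so that
    # padded positions can never win the max, then pool without built-in padding.
    x = F.pad(x, (padding, padding), value=float("-inf"))
    return F.max_pool1d(x, kernel_size, stride=stride, padding=0, dilation=dilation)

x = torch.rand(2, 4, 8)
# Same arguments as the failing call: kernel_size=2, stride=1, padding=2, dilation=2
y = dilated_max_pool1d(x, 2, stride=1, padding=2, dilation=2)
print(y.shape)
```

With these arguments the output length follows the usual formula, floor((L + 2*padding - dilation*(kernel_size-1) - 1) / stride) + 1 = (8 + 4 - 2 - 1) / 1 + 1 = 10. Note that with very large padding a window could consist only of padded positions and would then return -inf.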

BTW, it would be even better to allow one-sided padding (instead of always padding both sides equally).
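In the meantime, one-sided padding can be emulated the same way, since F.pad takes separate (left, right) amounts for the last dimension. A minimal sketch, assuming stride 1 and left-only ("causal") padding; the name causal_max_pool1d is just for illustration and is not a torch function:

```python
import torch
import torch.nn.functional as F

def causal_max_pool1d(x, kernel_size, dilation=1):
    # Hypothetical helper: pad only on the left by the kernel's dilated reach,
    # so with stride 1 the output has the same length as the input.
    left = dilation * (kernel_size - 1)
    x = F.pad(x, (left, 0), value=float("-inf"))
    return F.max_pool1d(x, kernel_size, stride=1, padding=0, dilation=dilation)

x = torch.rand(2, 4, 8)
y = causal_max_pool1d(x, kernel_size=2, dilation=2)
print(y.shape)  # same length as the input: torch.Size([2, 4, 8])
```

Here each output position i is the max over x[i - 2] and x[i], with the missing left neighbours treated as -inf.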