What is the dilation in the convolutional layer output size formula?

jwillette · January 16, 2020, 8:52am

I have been searching around and I cannot find any easy answers to how to dynamically calculate the output size of a set of convolutional layers. I see the formula here and most of the terms are obvious except for the dilation term.

With stride=1, kernel=3

I am using omniglot

inputs: (batch, 1, 28, 28)
outputs: (batch, 64, 1, 1)
dilation: 14.5???

and imagenet

inputs: (batch, 3, 84, 84)
outputs: (batch, 64, 5, 5)
dilation: 40.5???

These don't seem to make sense.

albanD · January 16, 2020, 3:24pm

Hi,

This is usually dilation=1 for most models. Where did you get these numbers from?

Also this blogpost has a nice visualization of what this parameter is doing.
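For reference, here is a minimal sketch of the per-dimension output size formula from the Conv2d documentation, written in plain Python. The function name `conv_output_size` is just an illustrative helper, not part of PyTorch. With dilation=1 the dilation term reduces to `kernel_size - 1`.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a convolution.

    Implements floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1).
    """
    effective_kernel = dilation * (kernel_size - 1) + 1  # span covered by the dilated kernel
    return (size + 2 * padding - effective_kernel) // stride + 1


# kernel=3, stride=1, dilation=1, no padding: each conv trims 2 pixels
print(conv_output_size(28, kernel_size=3))              # 26
# padding=1 keeps the size unchanged for a 3x3 kernel
print(conv_output_size(28, kernel_size=3, padding=1))   # 28
# dilation=2 makes the 3x3 kernel span 5 pixels
print(conv_output_size(28, kernel_size=3, dilation=2))  # 24
```

Dilation is always a positive integer, so a fractional value such as 14.5 usually means some other layer is changing the size.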

jwillette · January 18, 2020, 9:55am

I wasn’t used to working with convolutions. I forgot that the max pooling layer was what was causing most of the sizing issues and not the conv2d itself.
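For anyone landing here later, a rough sketch of how the sizes work out once pooling is included. The architecture is an assumption, since the thread does not state it: four blocks, each a 3x3 conv (stride=1, padding=1, dilation=1) followed by a 2x2 max pool with stride 2. The helper names are hypothetical. Max pooling uses the same size formula as the convolution, and this assumed stack reproduces the 1x1 and 5x5 outputs above.

```python
def out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    # Same formula for conv and max-pool layers (one spatial dimension).
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def stacked_size(size, num_blocks=4):
    # Assumed block: conv 3x3 (padding=1) keeps the size, max pool 2x2 halves it.
    for _ in range(num_blocks):
        size = out_size(size, kernel_size=3, stride=1, padding=1)  # conv
        size = out_size(size, kernel_size=2, stride=2)             # max pool
    return size


def implied_dilation(in_size, out_size_, kernel_size=3, padding=1):
    # Solve the stride=1 conv formula for dilation, treating the whole
    # network as a single conv layer (which is what gives odd values).
    return (in_size + 2 * padding - out_size_) / (kernel_size - 1)


print(stacked_size(28))         # 1   (28 -> 14 -> 7 -> 3 -> 1)
print(stacked_size(84))         # 5   (84 -> 42 -> 21 -> 10 -> 5)
print(implied_dilation(28, 1))  # 14.5
print(implied_dilation(84, 5))  # 40.5
```

The last two lines show where numbers like 14.5 and 40.5 can come from: attributing the entire size reduction to the dilation of one conv layer with padding=1, when the pooling layers are actually doing the downsampling.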