I have been searching around and cannot find an easy answer on how to dynamically calculate the output size of a stack of convolutional layers. I see the formula here, and most of the terms are obvious except for the dilation term.
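For reference, here is a minimal sketch of that calculation for one spatial dimension, assuming the formula in question is the one given in the PyTorch `Conv2d` documentation: `floor((in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)`. The function name `conv_output_size` is made up for illustration. The dilation term just stretches the kernel: a kernel of size `k` with dilation `d` covers `d*(k - 1) + 1` input positions.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension (hypothetical helper)."""
    # A dilated kernel spans this many input positions.
    effective_kernel = dilation * (kernel_size - 1) + 1
    # Floor division matches the floor() in the documented formula.
    return (size + 2 * padding - effective_kernel) // stride + 1


print(conv_output_size(32, kernel_size=3))              # 30
print(conv_output_size(32, kernel_size=3, padding=1))   # 32
print(conv_output_size(32, kernel_size=3, dilation=2))  # 28
```

With `dilation=1` the term reduces to the plain kernel size, so it can be ignored unless dilated convolutions are actually in use.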
I wasn’t used to working with convolutions, and I forgot that the max pooling layer, not the conv2d itself, was causing most of the sizing issues.
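To see where the size actually changes, something like the sketch below can trace one spatial dimension through a stack of layers. It assumes max pooling follows the same size formula as convolution (as in the PyTorch `MaxPool2d` documentation); the helper names and the example layer list are hypothetical, not taken from my actual model.

```python
def output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Size formula shared by conv and max-pool layers (one dimension)."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def trace_sizes(size, layers):
    """Return the size before the first layer and after each layer."""
    sizes = [size]
    for params in layers:
        size = output_size(size, **params)
        sizes.append(size)
    return sizes


# Hypothetical stack: conv, pool, conv, pool.
layers = [
    {"kernel_size": 3, "padding": 1},  # conv: keeps the size
    {"kernel_size": 2, "stride": 2},   # max pool: halves the size
    {"kernel_size": 3, "padding": 1},  # conv: keeps the size
    {"kernel_size": 2, "stride": 2},   # max pool: halves the size
]

print(trace_sizes(28, layers))  # [28, 28, 14, 14, 7]
```

In this example the padded 3x3 convolutions leave the size untouched, and every reduction comes from the pooling layers, which matches what I ran into.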