Does PyTorch replicate data for processing convolutional layers?

I wonder how PyTorch implements convolutional layers to ensure reasonable performance.

  1. How are input feature maps stored in memory?
  2. Is data replicated when applying filters to different patches of the input?

You can have a look at Convolution.cpp, which uses some switches to decide which backend to call, or whether to fall back to the native ATen implementation.
E.g. if you are using the GPU, cuDNN will be used in some cases, and cuDNN itself picks different algorithms depending on the data type and shape. If your input shapes are static, you can set torch.backends.cudnn.benchmark = True to let cuDNN choose the fastest algorithm for your workload.
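Regarding question 2: whether data is replicated depends on the algorithm the backend picks. One classic lowering, im2col, does replicate the input, since overlapping patches are copied into the rows of a matrix so the convolution becomes a single matmul. A minimal numpy sketch of the idea (the function name `im2col` and the 2D single-channel setup are just for illustration, not PyTorch's actual implementation):

```python
import numpy as np

def im2col(x, k):
    """Gather every k x k patch of a 2D input into the rows of a matrix.

    Overlapping patches share input elements, so those elements get
    replicated in the output -- the memory cost of the im2col approach.
    """
    h, w = x.shape
    cols = np.empty(((h - k + 1) * (w - k + 1), k * k), dtype=x.dtype)
    idx = 0
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            cols[idx] = x[i:i + k, j:j + k].ravel()
            idx += 1
    return cols

x = np.arange(16, dtype=np.float32).reshape(4, 4)
w = np.ones((3, 3), dtype=np.float32)

cols = im2col(x, 3)     # shape (4, 9): four overlapping 3x3 patches
out = cols @ w.ravel()  # convolution (valid padding) as one matmul
```

Other algorithms (e.g. FFT-based or Winograd convolutions, which cuDNN may select) avoid this explicit replication, which is one reason the algorithm choice matters.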
