Unfold + Linear + Fold != Conv2d

This post, as well as the docs, shows an example code snippet in which a manual unfold approach gives the same results as the corresponding conv layer. Did you check each layer separately to make sure the outputs are equal up to floating point precision?
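For reference, here is a minimal sketch of such a step-by-step check. It is not your model: the tensor sizes, the seed, and the variable names are made up for illustration, and it assumes stride 1, no padding, and no dilation. The linear layer reuses the conv parameters, so any mismatch would come from the reshaping steps rather than from different weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
N, C_in, H, W = 2, 3, 10, 12
C_out, kH, kW = 4, 3, 3

x = torch.randn(N, C_in, H, W)
conv = nn.Conv2d(C_in, C_out, kernel_size=(kH, kW))  # stride=1, padding=0

# Linear layer sharing the conv parameters.
# The conv weight (C_out, C_in, kH, kW) is flattened to (C_out, C_in*kH*kW),
# which matches the channel-major ordering of the unfolded patches.
linear = nn.Linear(C_in * kH * kW, C_out)
with torch.no_grad():
    linear.weight.copy_(conv.weight.reshape(C_out, -1))
    linear.bias.copy_(conv.bias)

# Step 1: unfold -> (N, C_in*kH*kW, L) with L = H_out * W_out
patches = F.unfold(x, kernel_size=(kH, kW))
H_out, W_out = H - kH + 1, W - kW + 1

# Step 2: nn.Linear acts on the last dim, so move the patch dim there.
out_lin = linear(patches.transpose(1, 2))  # (N, L, C_out)

# Step 3: fold back with a 1x1 kernel -> (N, C_out, H_out, W_out)
out_manual = F.fold(
    out_lin.transpose(1, 2), output_size=(H_out, W_out), kernel_size=(1, 1)
)

out_conv = conv(x)
shapes_match = out_conv.shape == out_manual.shape
max_abs_diff = (out_conv - out_manual).abs().max().item()
print(shapes_match, max_abs_diff)
```

If the shapes match and the max abs difference is tiny for float32, the manual path is equivalent. If it is large in your code, comparing the intermediate tensors after each of the three steps should show where the two paths diverge; a missing `transpose` or a weight flattened in a different order are common causes.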