Changing the order of operations from "convolution, batch normalization, and activation" to "batch normalization, activation, and convolution" on a ResNet architecture makes the model perform very poorly

ptrblck · August 21, 2021, 3:55am

Is this post different from this post or are both tackling the same issue?