I am using the nn.BatchNorm1d layer in an autoencoder:
```python
self.enc1 = nn.Linear(28 * 28, 1000)
self.enc1bn = nn.BatchNorm1d(1000)
self.enc2 = nn.Linear(1000, 1000)
self.enc2bn = nn.BatchNorm1d(1000)
self.enc3 = nn.Linear(1000, 1000)
self.enc3bn = nn.BatchNorm1d(1000)
self.bottleneck = nn.Linear(1000, 1000)
self.dec1 = nn.Linear(1000, 1000)
self.dec1bn = nn.BatchNorm1d(1000)
self.dec2 = nn.Linear(1000, 1000)
self.dec2bn = nn.BatchNorm1d(1000)
self.dec3 = nn.Linear(1000, 1000)
self.dec3bn = nn.BatchNorm1d(1000)
self.dae_out = nn.Linear(1000, 28 * 28)
```
This seems to hurt training and gives a poorer reconstruction of the input than when I am not using batchnorm. My layer ordering is: Linear -> BatchNorm -> ReLU.
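For reference, here is a minimal runnable sketch of that ordering. The class name and the forward pass are my own assumptions, and I trimmed the repeated middle layers for brevity:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BNAutoencoder(nn.Module):
    """Sketch of the autoencoder above with the Linear -> BatchNorm -> ReLU
    ordering. Class name, forward pass, and the reduced depth are assumptions."""

    def __init__(self):
        super().__init__()
        self.enc1 = nn.Linear(28 * 28, 1000)
        self.enc1bn = nn.BatchNorm1d(1000)
        self.bottleneck = nn.Linear(1000, 1000)
        self.dec1 = nn.Linear(1000, 1000)
        self.dec1bn = nn.BatchNorm1d(1000)
        self.dae_out = nn.Linear(1000, 28 * 28)

    def forward(self, x):
        # Linear -> BatchNorm -> ReLU at each hidden layer
        x = F.relu(self.enc1bn(self.enc1(x)))
        x = F.relu(self.bottleneck(x))
        x = F.relu(self.dec1bn(self.dec1(x)))
        # No batchnorm or activation on the output layer
        return self.dae_out(x)
```

Note that BatchNorm1d in training mode needs a batch size greater than 1 to compute batch statistics.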
Any ideas why this is?