Hello! I am having some issues using batch norm. I am at the beginning of building my NN, so for now I am training on just 100 samples and trying to overfit them, just to make sure the network can learn. The input and output each have about 1500 features. Here is my network:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class GW_NN(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(n_inp, 2000)
        self.linear2 = nn.Linear(2000, 2000)
        self.linear3 = nn.Linear(2000, 2000)
        self.linear4 = nn.Linear(2000, 2000)
        self.linear5 = nn.Linear(2000, n_out)
        # the same BatchNorm1d instance is reused after every hidden layer
        self.bn = nn.BatchNorm1d(2000)

    def forward(self, x):
        x = F.softplus(self.bn(self.linear1(x)))
        x = F.softplus(self.bn(self.linear2(x)))
        x = F.softplus(self.bn(self.linear3(x)))
        x = F.softplus(self.bn(self.linear4(x)))
        x = self.linear5(x)
        return x
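For completeness, my_dataloader is built roughly like this (a sketch: the TensorDataset wrapping and the batch size are assumptions on my part; input_data and output_data_phi are the tensors used further below):

from torch.utils.data import TensorDataset, DataLoader

# 100 training samples, roughly 1500 features in and out
my_dataset = TensorDataset(input_data, output_data_phi)
my_dataloader = DataLoader(my_dataset, batch_size=20, shuffle=True)  # batch size is just an example

The training loop is: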
model_gw = GW_NN().cuda()
lrs = 1e-2
optimizer_gw = optim.Adam(model_gw.parameters(), lr=lrs)

for epoch in range(10001):
    model_gw.train()
    for i, dtt in enumerate(my_dataloader):
        optimizer_gw.zero_grad()
        inp = dtt[0].float().cuda()
        output = dtt[1].float().cuda()
        loss = F.mse_loss(model_gw(inp), output)
        loss.backward()
        optimizer_gw.step()
    if epoch % 100 == 0:
        print(loss.data.cpu().numpy())
The loss goes down okay-ish:
29418.57
20.279129
11.549426
8.563468
8.235117
8.161551
9.561671
7.5749683
7.60303
7.609553
7.265949
7.824227
10.810941
7.803124
7.6215243
7.977992
7.9355087
7.574047
7.326716
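To see earlier whether train and eval mode disagree, I could also print an eval-mode loss every 100 epochs (a sketch, not part of my current code; inp and output still hold the last batch from the inner loop):

if epoch % 100 == 0:
    model_gw.eval()                    # BN now uses its running statistics
    with torch.no_grad():
        eval_loss = F.mse_loss(model_gw(inp), output)
    model_gw.train()                   # back to batch statistics for training
    print(loss.item(), eval_loss.item())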
But when I try the trained NN (on the same data used for training), it fails:
idx = 10
y_real = output_data_phi[idx].data.cpu().numpy()
model_gw.eval()
y_pred = model_gw(input_data)[idx].data.cpu().numpy()
print(((y_pred-y_real)**2).mean())
I am getting 169114.33. I assume the problem is with using batch norm in eval mode, but I am not sure what to do. Can someone help me? Thank you!
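In case it is useful, here is a minimal check (a sketch reusing the objects above; torch.no_grad() and the running_mean / running_var buffers are standard PyTorch) that should show whether the train/eval switch, i.e. the batch norm running statistics, is what changes the predictions:

with torch.no_grad():
    model_gw.train()   # BN normalizes with the statistics of this batch
    # note: a forward pass in train mode still updates the running stats
    y_train_mode = model_gw(input_data)

    model_gw.eval()    # BN normalizes with the accumulated running statistics
    y_eval_mode = model_gw(input_data)

print((y_train_mode - y_eval_mode).abs().max())

# the running statistics accumulated by the shared BN layer
print(model_gw.bn.running_mean[:5], model_gw.bn.running_var[:5])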