My training code looks overly complicated.
I searched for a good training example, but legacy code from versions before 0.4 adds a lot of noise to my search results.
I created this example to explain what I mean:
import torch
import torch.nn as nn
import torch.optim as optim

# set the mini-batch size, optimizer and the loss function
bs = 512
opt = optim.Adam(m.parameters(), lr=0.0001)
loss_fn = nn.NLLLoss()
# set the number of epochs
num_epochs = 1000
# take the iterator from the data loader
it = iter(dl)
# grab the first mini-batch
mb, yt = next(it)
# train the model for num_epochs
for epoch in range(num_epochs):
    # all good while the dataloader still has full batches, but at some
    # point the batch will have fewer than 512 examples, and calling
    # next() on an exhausted iterator raises StopIteration
    if mb.shape[0] == bs:  # bs = 512
        tup = torch.unbind(mb, dim=1)
        # forward pass to calculate the prediction
        y_hat = m(*tup)
        # loss evaluation
        loss = loss_fn(y_hat, yt)
        # backward and optimize
        opt.zero_grad()
        loss.backward()
        # update params
        opt.step()
        # next mini-batch
        mb, yt = next(it)
    else:
        # restart the iterator from the beginning
        it = iter(dl)
        mb, yt = next(it)
    if (epoch + 1) % 50 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch + 1, num_epochs, loss.item()))
This looks complicated, since I need to pay attention to how many examples remain in the dataloader, because at some point next() will return fewer than bs examples.
OK, I could set shuffle=True on the DataLoader; in that case I could just use mb, yt = next(iter(dl)) every time.
Any feedback on the training approach would be helpful at this point.
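For comparison, here is a minimal sketch of the usual pattern: iterate the DataLoader directly with a nested for-loop (a fresh shuffled iterator is created implicitly each epoch), and pass drop_last=True so the final partial batch is discarded without any size checks. The toy model and dataset are stand-ins, since `m` and `dl` are not shown in the post.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

# toy stand-ins for the model and data (assumptions, not from the post)
X = torch.randn(1300, 10)
y = torch.randint(0, 3, (1300,))
ds = TensorDataset(X, y)
# drop_last=True discards the final partial batch, so no size check is needed
dl = DataLoader(ds, batch_size=512, shuffle=True, drop_last=True)

m = nn.Sequential(nn.Linear(10, 3), nn.LogSoftmax(dim=1))
opt = optim.Adam(m.parameters(), lr=0.0001)
loss_fn = nn.NLLLoss()

num_epochs = 5
for epoch in range(num_epochs):
    # the for-loop handles iter()/next()/StopIteration internally
    for mb, yt in dl:
        y_hat = m(mb)          # forward pass
        loss = loss_fn(y_hat, yt)
        opt.zero_grad()
        loss.backward()
        opt.step()
    print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch + 1, num_epochs, loss.item()))
```

With this shape there is no manual iterator bookkeeping at all; the only remaining decision is whether partial batches should be dropped (drop_last=True) or trained on anyway, which most models handle fine since the batch dimension is dynamic.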