Record number of samples already traversed within training loop

MrRobot · November 15, 2019, 5:46pm

Question

I am new to deep learning, and I am not sure if I understand the key concepts correctly.

I have a huge dataset of more than 1.8 million samples, so training could take hours. I therefore want to track the number of samples that the training loop has already processed.

Specifically, I have the following variables for this purpose:

print_freq = 10
batch_size = 32
max_epoch = 5

So if I write my training loop in the following way

for epoch in range(max_epoch):
    for i, (X_train, y_train) in enumerate(dataloader):
        # do something
        if (i + 1) % print_freq == 0:
            num_samples_traversed = ...

then is it correct that num_samples_traversed = print_freq * batch_size?

beaupreda · November 15, 2019, 6:15pm

Hi,

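There is a small mistake in the way you compute num_samples_traversed. Right now, it will always be equal to 320 (10 * 32). You need to accumulate print_freq * batch_size, starting from a counter initialized to 0 before the loops, like this:
num_samples_traversed += print_freq * batch_size
Or you could use the variable i to compute it:
num_samples_traversed = (i + 1) * batch_size
Note that i restarts at 0 at the beginning of every epoch, so this second form counts the samples seen in the current epoch only.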
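
To make the accumulating variant concrete, here is a minimal sketch in plain Python. It is only an illustration under some assumptions: the batches helper is a hypothetical stand-in for the dataloader (it just yields lists of sample indices), the train function and its log are made up for the example, and the counter is incremented by the actual size of each batch rather than by print_freq * batch_size, so a smaller final batch is still counted correctly.

```python
def batches(num_samples, batch_size):
    """Hypothetical stand-in for a dataloader: yields lists of sample indices."""
    for start in range(0, num_samples, batch_size):
        yield list(range(start, min(start + batch_size, num_samples)))


def train(num_samples, batch_size=32, max_epoch=5, print_freq=10):
    num_samples_traversed = 0  # initialized once, before the epoch loop
    log = []  # (epoch, batches seen in this epoch, samples seen overall)
    for epoch in range(max_epoch):
        for i, batch in enumerate(batches(num_samples, batch_size)):
            # do something with the batch, then count its actual size
            num_samples_traversed += len(batch)
            if (i + 1) % print_freq == 0:
                log.append((epoch, i + 1, num_samples_traversed))
    return num_samples_traversed, log


total, log = train(num_samples=1000, batch_size=32, max_epoch=2, print_freq=10)
for epoch, n_batches, n_samples in log:
    print(f"epoch {epoch}, batch {n_batches}: {n_samples} samples traversed")
```

Because the counter lives outside both loops, it keeps growing across epochs instead of resetting. With a real dataloader the same idea should apply by replacing len(batch) with the size of the first dimension of X_train, assuming X_train is a batched tensor.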