Let’s say I want to experiment with a large batch size that does not fit in memory.
Would gradient accumulation be equivalent to running the model with that large batch size?
For example, say each batch is of size 10 and I have the following pseudo-code:
for i, batch in enumerate(train_loader):
    loss = compute_loss(batch)
    loss.backward()  # gradients accumulate across iterations
    if (i + 1) % 10 == 0:  # step once every 10 batches of size 10
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()  # reset accumulated gradients before the next round
Would this be equivalent to running just
for i, batch in enumerate(train_loader):
    loss = compute_loss(batch)
    loss.backward()
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
if the batch size were 100?
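As a side note, one detail matters for the comparison: with a mean-reduced loss, accumulated gradients sum over the 10 small batches, so each small-batch loss has to be scaled by 1/10 to match the gradient of a single batch of 100. A quick NumPy sanity check of that identity, using a hypothetical linear model with an MSE loss (no PyTorch needed, the math is the same):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))  # 100 samples, 3 features
y = rng.normal(size=100)
w = rng.normal(size=3)  # fixed weights; we only compare gradients

def grad_mse(Xb, yb, w):
    # gradient of mean((Xb @ w - yb)**2) with respect to w
    err = Xb @ w - yb
    return 2.0 * Xb.T @ err / len(yb)

# one big batch of 100
g_full = grad_mse(X, y, w)

# 10 accumulation steps of batch size 10, each scaled by 1/10
g_accum = np.zeros_like(w)
for i in range(10):
    Xb, yb = X[i * 10:(i + 1) * 10], y[i * 10:(i + 1) * 10]
    g_accum += grad_mse(Xb, yb, w) / 10

print(np.allclose(g_full, g_accum))  # the two gradients match
```

Without the 1/10 factor the accumulated gradient would be 10 times larger, which is one reason accumulation is often described as equivalent only "up to loss scaling" (batch-statistics layers like BatchNorm are another caveat, since they still see batches of 10).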