How to optimize a model once after several small-batch training steps?

I am training a deep network, and because the training data is very large, I can't feed the model a bigger batch size. I wonder if I can feed it a small batch each time and optimize the model once after a specified number of loss backward calls?
If so, how can I average the gradient values before updating the parameters?
Thank you!!!

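optimizer.zero_grad()  # start from clean gradients
num_batches = 0
for sample, target in dataset:
    out = model(sample)
    loss = loss_fn(out, target)
    loss.backward()  # gradients accumulate in .grad across calls
    num_batches += 1
    if num_batches == 10:  # optimize every 10 mini-batches
        optimizer.step()  # update using the accumulated gradients
        model.zero_grad()  # or optimizer.zero_grad()
        num_batches = 0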

Hi, smth
It seems that backward() only accumulates the gradients and does not average them. Is that right?
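
Each `backward()` call adds to `.grad` rather than averaging. One common way to get an average is to divide each mini-batch loss by the number of accumulated batches before calling `backward()`, so the summed gradients equal the mean. Below is a minimal sketch of that idea, not a definitive recipe: the toy `nn.Linear` model, the random data, and the names `accum_steps` and `batches` are all made up for illustration, and it assumes equally sized mini-batches and a loss with mean reduction.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy setup; model, sizes and learning rate are illustrative only.
model = nn.Linear(4, 1)
loss_fn = nn.MSELoss()  # default reduction="mean"
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

accum_steps = 10   # number of mini-batches per optimizer step
batch_size = 8
data = torch.randn(accum_steps * batch_size, 4)
targets = torch.randn(accum_steps * batch_size, 1)
batches = list(zip(data.split(batch_size), targets.split(batch_size)))

optimizer.zero_grad()
for step, (sample, target) in enumerate(batches, start=1):
    out = model(sample)
    # Scale the loss so the accumulated gradient is the average
    # over accum_steps mini-batches instead of their sum.
    loss = loss_fn(out, target) / accum_steps
    loss.backward()  # adds this mini-batch's scaled gradient to .grad
    if step % accum_steps == 0:
        optimizer.step()       # one update from the averaged gradient
        optimizer.zero_grad()  # reset before the next accumulation window
```

Under those assumptions, the single update here should match one step on the full batch of 80 samples, up to floating-point error. If the mini-batches differ in size, dividing by `accum_steps` gives only an approximate average.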