I am writing this message because you have always helped me with very good answers.
I am doing Kaggle competitions, but I always run into the problem that I can't use a bigger batch size: small batches give bad results, and larger ones run out of memory.
I have two 2080 Tis with 11 GB of memory each. Training on 300x300 images with batch size 8 gives me very bad results, and with 16 it always tells me that CUDA ran out of memory…
Can you help me, please?
You can accumulate the gradients over multiple mini-batches (calling backward on each one) and then do a single optimizer step, to simulate a larger batch size.
@carloalbertobarbano how does this work? What do you mean by accumulating gradients over mini-batches and doing a single optimizer step? Can you elaborate a little? If you have code, I would appreciate it.
You want batch_size=16, but you can only fit 8 images in your memory: then you accumulate the gradients for two mini-batches of size 8 and perform the optimization step every two iterations (2*8 = 16). Your code would look something like this:

dataloader = DataLoader(.., batch_size=8, ..)

for i, (minibatch, labels) in enumerate(dataloader):
    output = model(minibatch)
    loss = criterion(output, labels)
    loss.backward()               # gradients from each mini-batch add up in .grad
    if (i + 1) % 2 == 0:
        optimizer.step()          # update the weights once every two mini-batches
        optimizer.zero_grad()     # reset the accumulated gradients
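To see why this is equivalent: loss.backward() keeps adding gradients into each parameter's .grad until you zero them, so two backward passes over 8 images accumulate the same gradient information as one pass over 16. If your loss is averaged over the batch (the default for most PyTorch criteria), also divide each mini-batch loss by the number of accumulation steps so the scale matches a true batch of 16. Here is a minimal self-contained check of that claim; the linear model and random data are made up just for illustration:

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
inputs = torch.randn(16, 10)     # one "big" batch of 16 samples
targets = torch.randn(16, 1)

model = nn.Linear(10, 1)

# Gradient from a single batch of 16
model.zero_grad()
F.mse_loss(model(inputs), targets).backward()
big_batch_grad = model.weight.grad.clone()

# Same gradient from two accumulated mini-batches of 8
model.zero_grad()
for x, y in zip(inputs.chunk(2), targets.chunk(2)):
    loss = F.mse_loss(model(x), y) / 2   # normalize: the loss is batch-averaged
    loss.backward()                      # gradients accumulate in .grad

print(torch.allclose(big_batch_grad, model.weight.grad))  # True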
@carloalbertobarbano thanks! Let me try it! So I can do multiples of 8, like

if (i + 1) % 4 == 0:

and this will be like a batch of 32, right?
@carloalbertobarbano, by any chance would you know how to implement this in fastai?
model.zero_grad()                                  # Reset gradients tensors
for i, (inputs, labels) in enumerate(training_set):
    predictions = model(inputs)                    # Forward pass
    loss = loss_function(predictions, labels)      # Compute loss function
    loss = loss / accumulation_steps               # Normalize our loss (if averaged)
    loss.backward()                                # Backward pass
    if (i + 1) % accumulation_steps == 0:          # Wait for several backward steps
        optimizer.step()                           # Now we can do an optimizer step
        model.zero_grad()                          # Reset gradients tensors
        if (i + 1) % evaluation_steps == 0:        # Evaluate the model when we...
            evaluate_model()                       # ...have no gradients accumulated
Nope sorry, I don’t know about fast.ai. But that code looks right
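For anyone landing on this thread later: newer versions of fastai (v2) ship a built-in GradientAccumulation callback that does this for you. A minimal sketch, assuming fastai v2's API; the dataset path, image folder layout, and architecture below are placeholders:

from fastai.vision.all import *

# Placeholder data: an image-folder dataset, resized to 300x300, mini-batches of 8
dls = ImageDataLoaders.from_folder(path, item_tfms=Resize(300), bs=8)
learn = vision_learner(dls, resnet34, metrics=accuracy)

# n_acc counts samples, not iterations: with bs=8, n_acc=32 steps the
# optimizer once every 4 mini-batches, i.e. an effective batch size of 32
learn.fit_one_cycle(5, cbs=GradientAccumulation(n_acc=32))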