Following on from the PyTorch tutorials for AMP here:
Here is how I apply AMP:
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()
for data, label in data_iter:
    optimizer.zero_grad()
    # Casts operations to mixed precision
    with autocast():
        output = model(data)
        loss = loss_fn(output, label)  # loss_fn is my criterion
    # Scale the loss, then call backward() on the scaled loss
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
for data, label in data_iter_valid:
    # ??
I was wondering if/when to apply mixed precision to the validation data. Do you need to use the scaler on it, or is that even necessary? Wouldn't autocast at least be needed if you wanted to keep the peak memory requirements the same during training and validation?
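For what it's worth, here is a minimal sketch of what I'm guessing the validation loop should look like (assuming autocast alone under no_grad is enough, with no GradScaler, since there is no backward() call to scale; loss_fn is the same criterion as above):

import torch

model.eval()
with torch.no_grad():
    for data, label in data_iter_valid:
        # Run the forward pass in mixed precision too, so validation
        # uses the same numerics and memory profile as training
        with autocast():
            output = model(data)
            val_loss = loss_fn(output, label)
        # No scaler here: loss scaling only exists to keep fp16
        # gradients from underflowing during backward()

Is that the right way to think about it, or should validation just run in full precision?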