Mixed precision for validation?

Following on from the PyTorch tutorials for AMP here:

Here is how I apply AMP:

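from torch.cuda.amp import GradScaler, autocast

scaler = GradScaler()
for data, label in data_iter:
    optimizer.zero_grad()
    # Casts operations to mixed precision
    with autocast():
        loss = model(data)

    # Scale the loss so small gradients do not underflow in float16
    scaler.scale(loss).backward()
    # Unscale the gradients and step; the step is skipped if they contain inf/NaN
    scaler.step(optimizer)
    # Adjust the scale factor for the next iteration
    scaler.update()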


for data, label in data_iter_valid: 
    # ??

I was wondering if/when to apply mixed precision to the validation data. Do you need to use the scaler on it?
Or is that even necessary? Wouldn’t it be needed if you wanted to keep the peak memory requirements the same during training and validation?

The scaler is not necessary during evaluation, as you won’t call backward in this step and thus there are no gradients to scale.
You can still use autocast during validation, and also wrap the code in a with torch.no_grad() block to save more memory.
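For illustration, a validation loop along those lines might look like the sketch below. The names `evaluate`, `loss_fn` and `device_type`, as well as the toy model and data, are placeholders and not from the original post. The sketch uses `torch.autocast`, the device-generic form of `autocast`, and assumes the model and batches are already on the matching device.

```python
import torch
from torch import nn


def evaluate(model, data_iter_valid, loss_fn, device_type="cpu"):
    """Return the mean validation loss. No GradScaler is involved."""
    model.eval()
    total, batches = 0.0, 0
    # no_grad skips building the autograd graph (saves memory);
    # autocast runs eligible ops in lower precision, as in training.
    with torch.no_grad(), torch.autocast(device_type=device_type):
        for data, label in data_iter_valid:
            loss = loss_fn(model(data), label)
            total += loss.float().item()
            batches += 1
    return total / max(batches, 1)


# Toy usage with a hypothetical model and random data (CPU by default).
model = nn.Linear(4, 2)
valid = [(torch.randn(8, 4), torch.randint(0, 2, (8,))) for _ in range(3)]
val_loss = evaluate(model, valid, nn.CrossEntropyLoss())
```

On a GPU you would pass `device_type="cuda"` and move the model and batches to the device first. Remember to call `model.train()` again before resuming training.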
