Hi,
I used torch.cuda.amp.GradScaler to train my model in mixed-precision mode. Part of my code looks like this:
```python
eps = 1e-8
mag = 0.1

im.requires_grad_(True)
optim.zero_grad()
with amp.autocast(enabled=args.use_fp16):
    logits, *logits_aux = net(im)
    loss_pre = criteria_pre(logits, lb)
    loss_aux = [crit(lgt_aux, lb)
                for crit, lgt_aux in zip(criteria_aux, logits_aux)]
    loss = loss_pre + sum(loss_aux) * 0.5
scaler.scale(loss).backward()
# print(im.grad)
# scaler.unscale_(im.grad)
with torch.no_grad():
    grad = im.grad
    # normalize the gradient (eps in the denominator to avoid division by zero),
    # then rescale it to magnitude mag
    grad = grad / (grad.norm() + eps) * mag
    im.requires_grad_(False)
    im += grad
```
I need to compute the gradient with respect to the input image, but I do not know how to unscale it. Since I am not stepping an optimizer on the image here, I cannot call scaler.unscale_(optim). How can I compute the unscaled gradient?
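What I had in mind is dividing im.grad by scaler.get_scale() manually, though I am not sure this is the intended way. Here is a toy CPU sketch of the idea (the hard-coded S stands in for whatever scaler.get_scale() would return): scaling the loss by S scales the input gradient by S too, so dividing by S recovers the unscaled gradient.

```python
import torch

S = 2.0 ** 16  # stand-in for scaler.get_scale()

# backward through a scaled loss, then unscale the input gradient
im = torch.randn(4, requires_grad=True)
loss = (im ** 2).sum()
(loss * S).backward()
unscaled = im.grad / S

# reference: backward through the unscaled loss
im_ref = im.detach().clone().requires_grad_(True)
(im_ref ** 2).sum().backward()

print(torch.allclose(unscaled, im_ref.grad))  # True
```

Would dividing im.grad by scaler.get_scale() like this be equivalent to what scaler.unscale_ does internally, or does unscale_ do something more (e.g. inf/nan checks) that I would be missing?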