Basic autocast usage


I’m a relative beginner compared to many on these fora, so kid gloves please…

I’ve created a Conda env with the latest PyTorch nightly build to try using autocast to get a GAN working that currently causes memory errors on my setup (RTX 2070).

I’ve added the relevant code as per the docs: @autocast() on both forward passes, scaler = GradScaler() before the training loop, and scaling of the loss and backward pass inside the training loop. I suspect I’m just missing something basic like an import, or maybe I’ve downloaded the wrong nightly build (I used conda install pytorch torchvision cudatoolkit=10.1 -c pytorch-nightly).

Help please!

Thanks, and apologies for the luddite question.


What kind of error are you seeing and could you please post the PyTorch version you are using via print(torch.__version__)?

They’re very basic errors such as

NameError: name 'GradScaler' is not defined


NameError: name 'autocast' is not defined

The current version is 1.6.0.dev20200407.

@Mark_Hanslip Glad you’re trying the native API! The full import paths are torch.cuda.amp.autocast and torch.cuda.amp.GradScaler. Often, for brevity, usage snippets don’t show full import paths, silently assuming the names were imported earlier and that you skimmed the class or function declaration/header to obtain each path. For example, a snippet that shows

@autocast()
def forward...

silently assumes you wrote from torch.cuda.amp import autocast earlier in the script.

Try from torch.cuda.amp import autocast at the top of your script, or alternatively

@torch.cuda.amp.autocast()
def forward...

and treat GradScaler the same way.

Leaving out imports for brevity in code snippets is common practice throughout the PyTorch docs, but it may not be obvious if you’re relatively new to them.
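To make the two import styles concrete, here is a minimal sketch. The TinyGenerator classes and their layer sizes are made up for illustration, and the enabled=... argument is only there so the snippet also runs on a machine without a CUDA GPU.

```python
import torch
import torch.nn as nn
from torch.cuda.amp import autocast, GradScaler  # the imports that docs snippets assume

# torch.cuda.amp autocasting only has an effect on CUDA, so switch it off elsewhere.
USE_AMP = torch.cuda.is_available()


class TinyGenerator(nn.Module):  # hypothetical model, for illustration only
    def __init__(self):
        super().__init__()
        self.l1 = nn.Linear(8, 4)

    @autocast(enabled=USE_AMP)  # style 1: short name, relies on the import above
    def forward(self, z):
        return self.l1(z)


class TinyGeneratorFullPath(nn.Module):  # same model, decorated via the full path
    def __init__(self):
        super().__init__()
        self.l1 = nn.Linear(8, 4)

    @torch.cuda.amp.autocast(enabled=USE_AMP)  # style 2: only needs "import torch"
    def forward(self, z):
        return self.l1(z)


scaler = GradScaler(enabled=USE_AMP)  # GradScaler is imported the same way

device = "cuda" if USE_AMP else "cpu"
z = torch.randn(2, 8, device=device)
out_a = TinyGenerator().to(device)(z)
out_b = TinyGeneratorFullPath().to(device)(z)
print(out_a.shape, out_b.shape)
```

Both styles resolve to the same class, so which one to use is a matter of taste.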

A separate concern is that the loss computation(s), in addition to the forward() methods, should run under autocast (for which you could use the context-manager option with autocast()).

The multi-model example is likely relevant as well. (retain_graph in the example has nothing to do with Amp; it’s present so that the non-Amp parts shown are functionally correct, so you can ignore it.)
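For a GAN specifically, a two-model training step might look roughly like the sketch below. This is not the docs example itself: the tiny generator and discriminator, the tensor sizes and the learning rates are placeholders, and the names gen_imgs, valid and adversarial_loss are borrowed from this thread. BCEWithLogitsLoss is used on purpose, since BCELoss raises an error inside autocast-enabled regions on CUDA.

```python
import torch
import torch.nn as nn
from torch.cuda.amp import autocast, GradScaler

use_amp = torch.cuda.is_available()  # fall back to plain FP32 without a GPU
device = "cuda" if use_amp else "cpu"

# Hypothetical stand-ins for a real generator and discriminator.
generator = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 64)).to(device)
discriminator = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1)).to(device)

adversarial_loss = nn.BCEWithLogitsLoss()  # autocast-safe, unlike BCELoss
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
scaler = GradScaler(enabled=use_amp)  # one scaler shared by both optimizers

for _ in range(2):  # two dummy iterations on random data
    real_imgs = torch.randn(8, 64, device=device)
    z = torch.randn(8, 16, device=device)
    valid = torch.ones(8, 1, device=device)
    fake = torch.zeros(8, 1, device=device)

    # Generator step: forward pass and loss inside autocast, backward outside.
    opt_g.zero_grad()
    with autocast(enabled=use_amp):
        gen_imgs = generator(z)
        g_loss = adversarial_loss(discriminator(gen_imgs), valid)
    scaler.scale(g_loss).backward()
    scaler.step(opt_g)

    # Discriminator step: detach the generated images so only D is updated.
    opt_d.zero_grad()
    with autocast(enabled=use_amp):
        real_loss = adversarial_loss(discriminator(real_imgs), valid)
        fake_loss = adversarial_loss(discriminator(gen_imgs.detach()), fake)
        d_loss = (real_loss + fake_loss) / 2
    scaler.scale(d_loss).backward()
    scaler.step(opt_d)

    scaler.update()  # update the scale once per iteration, after all steps

print(g_loss.item(), d_loss.item())
```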


@mcarilli thanks so much. I’m more of a musician than programmer so sometimes basic things need clarifying.

I’m keeping an eye on GPU memory consumption with watch nvidia-smi, and there’s barely any difference with and without calling autocast on the model’s forward pass / scaling the loss computation etc.
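One caveat when comparing memory this way: nvidia-smi reports everything the process holds, including memory cached by PyTorch's allocator, so it can stay flat even when the tensors themselves shrink. A rough alternative is to read PyTorch's own peak counter, as in this sketch. The helper name peak_allocated_mib and the small Linear workload are invented for illustration.

```python
import torch


def peak_allocated_mib(step_fn):
    """Run step_fn once and return the peak tensor memory in MiB (0.0 without CUDA)."""
    if not torch.cuda.is_available():
        step_fn()
        return 0.0
    torch.cuda.synchronize()
    torch.cuda.reset_peak_memory_stats()  # start a fresh peak measurement
    step_fn()
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / 2**20


use_amp = torch.cuda.is_available()
device = "cuda" if use_amp else "cpu"
layer = torch.nn.Linear(256, 256).to(device)  # hypothetical workload
x = torch.randn(64, 256, device=device)


def fp32_step():
    layer.zero_grad()
    layer(x).sum().backward()


def amp_step():
    layer.zero_grad()
    with torch.cuda.amp.autocast(enabled=use_amp):
        loss = layer(x).sum()
    loss.backward()


fp32_peak = peak_allocated_mib(fp32_step)
amp_peak = peak_allocated_mib(amp_step)
print(f"peak FP32: {fp32_peak:.1f} MiB, peak autocast: {amp_peak:.1f} MiB")
```

With a workload this small the difference will be negligible; the point is only the measurement pattern.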

I’ve tried to implement autocasting as follows. Currently I’m hitting memory limits before even reaching the loss calculation in the training loop:

def forward(self, z):
    out = self.l1(z)
    out = out.view(out.shape[0], 128, self.init_size, self.init_size)
    img = self.conv_blocks(out)
    return img

g_loss = adversarial_loss(discriminator(gen_imgs), valid)
with autocast:

It feels as though I need to recast my inputs at the beginning of the training loop to FP16 (and possibly at the transforms/dataloader stage too?), is that right?
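As far as the autocast documentation describes it, manual recasting should not be needed: the model and its inputs stay in FP32, and autocast picks the precision per operation inside the enabled region. A small sketch to check this on your own machine follows. The layer and tensor sizes are arbitrary, and on a machine without CUDA autocast is disabled, so everything stays float32.

```python
import torch
import torch.nn as nn
from torch.cuda.amp import autocast

use_amp = torch.cuda.is_available()
device = "cuda" if use_amp else "cpu"

layer = nn.Linear(16, 4).to(device)    # parameters are created in float32
z = torch.randn(2, 16, device=device)  # input is float32, no .half() call

with autocast(enabled=use_amp):
    out = layer(z)  # on CUDA, the matmul runs in float16 under autocast

# Inputs and weights are untouched; only the op's output dtype changes on CUDA.
print(z.dtype, layer.weight.dtype, out.dtype)
```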


Could you please post the GPU device name?

import torch
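The snippet above is cut off after the import. One possible way to query the device name, offered as an assumption rather than the original code, is sketched here. The helper name describe_gpu is invented.

```python
import torch


def describe_gpu():
    """Return a short description of the current CUDA device, if there is one."""
    if not torch.cuda.is_available():
        return "no CUDA device visible"
    index = torch.cuda.current_device()
    name = torch.cuda.get_device_name(index)
    major, minor = torch.cuda.get_device_capability(index)
    return f"{name} (compute capability {major}.{minor})"


print(describe_gpu())
```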

Yes, it’s ‘GeForce RTX 2070’.

Your forward pass and loss calculation should be inside autocast, and the backward pass should be outside it.

g_loss = adversarial_loss(discriminator(gen_imgs), valid)
should be inside autocast

# Creates model and optimizer in default precision
model = Net().cuda()
optimizer = optim.SGD(model.parameters(), ...)

for input, target in data:
    optimizer.zero_grad()

    # Enables autocasting for the forward pass (model + loss)
    with autocast():
        output = model(input)
        loss = loss_fn(output, target)

    # Exits the context manager before backward()
    loss.backward()
    optimizer.step()

Hi ptrblck, I am facing an error while importing autocast.

My code is as follows:
from tqdm import tqdm_notebook, tnrange
import torch.nn.functional as F
import torch
from torch.cuda.amp import autocast 
class Train:
  def __init__(self, model, dataloader, optimizer, stats, scheduler=None, L1lambda = 0,criterion=None,use_amp=True):
    self.model = model
    self.dataloader = dataloader
    self.optimizer = optimizer
    self.scheduler = scheduler
    self.stats = stats
    self.L1lambda = L1lambda
  def run(self):
    pbar = tqdm_notebook(self.dataloader)
    for data1,data2,target1,target2 in pbar:
      # get samples
      data1,data2 =,
      target1, target2 =,
      with autocast():
        output1,output2 = self.model(data1,data2)
        self.loss1=self.criterion[0](output1.float(), target1.half())
        self.loss2=self.criterion[1](output2.float(), target2.half())
        #print("loss1 {}".format(self.loss1))
        #print("loss2 {}".format(self.loss2))
      # In PyTorch, we need to set the gradients to zero before starting to do backpropagation because PyTorch accumulates the gradients on subsequent backward passes.
      # Because of this, when you start your training loop, ideally you should zero out the gradients so that you do the parameter update correctly.

      # Predict
      #Implementing L1 regularization
      if self.L1lambda > 0:
        reg_loss = 0.
        for param in self.model.parameters():
          reg_loss += torch.sum(param.abs())
        self.loss += self.L1lambda * reg_loss

      # Backpropagation
The error is as follows:
ImportError: cannot import name 'autocast'


torch.cuda.amp is available in the nightly binaries and the current master.
If you are using an older version, you might need to update.
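A quick way to check whether the installed build actually ships these utilities is sketched below. The helper name amp_status is invented; it only uses importlib and hasattr, so it should not raise on older builds.

```python
import importlib

import torch


def amp_status():
    """Report whether torch.cuda.amp exposes autocast and GradScaler in this build."""
    try:
        amp = importlib.import_module("torch.cuda.amp")
    except ImportError:
        # The module itself is missing, so neither utility is available.
        return {"autocast": False, "GradScaler": False}
    return {name: hasattr(amp, name) for name in ("autocast", "GradScaler")}


print(torch.__version__, amp_status())
```

If either entry is False, the ImportError comes from the installed version rather than from the script.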


@ptrblck Hello sir, I have also posted another question. Could you please take a look at it?

Why are you not using the gradient scaler? Shouldn’t those commands be used as well?
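For reference, a single-model loop that combines autocast with GradScaler could look like this minimal sketch. The model, data and hyperparameters are dummies, and enabled=use_amp lets the same code run in plain FP32 on a machine without CUDA.

```python
import math

import torch
import torch.nn as nn
from torch.cuda.amp import autocast, GradScaler

use_amp = torch.cuda.is_available()
device = "cuda" if use_amp else "cpu"

model = nn.Linear(10, 1).to(device)  # dummy model in default precision
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
scaler = GradScaler(enabled=use_amp)  # created once, before the training loop

data = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(3)]  # dummy batches
losses = []
for inputs, target in data:
    inputs, target = inputs.to(device), target.to(device)
    optimizer.zero_grad()

    # Forward pass and loss run under autocast.
    with autocast(enabled=use_amp):
        output = model(inputs)
        loss = loss_fn(output, target)

    # Backward runs outside autocast, on the scaled loss.
    scaler.scale(loss).backward()
    scaler.step(optimizer)  # unscales gradients, skips the step if they contain inf/NaN
    scaler.update()         # adjusts the scale factor for the next iteration
    losses.append(loss.item())

print(losses)
```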


Hi guys,

I just upgraded my PyTorch version to use AMP, and while GradScaler now imports from torch.cuda.amp, I get the following error for autocast:

ImportError: cannot import name 'autocast' from 'torch.cuda.amp' (/home/benjamin/anaconda3/lib/python3.7/site-packages/torch/cuda/amp/

Any idea why?

autocast is currently only available in the master branch and in the nightly binaries.
If you want to try out mixed-precision training, you would thus need to build from source or install the nightlies.

@ptrblck Oh, my bad. I thought it was part of the library since it had been announced at the beginning of May.

Just as a tip, to avoid bothering you in the future: where should I look to see what’s currently in the library and what is only in the nightly? Is there anything on this page or elsewhere that would have given me this info without my being annoying on the forums?

Thanks for your answer!

The warning in the amp docs points towards the master/nightly builds for complete mixed-precision training, but I see the confusion.
Unfortunately, we couldn’t land it in 1.5, so both utilities will be available in the next stable release.

Also, don’t worry about asking here, as it’s not annoying at all and we are here for these questions.

OK, thanks a lot for your answers!

Today on Kaggle I installed the nightly build using the command given on the page: conda install -y pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch-nightly -c conda-forge
However, I am again getting the error:

No module named ‘torch.cuda.amp.autocast’

On second thought, it might also be an installation issue on Kaggle.

That might be the case.
Did you get any install logs in your Kaggle environment and if so, did you see which version was installed?
Also, what does this return?

import torch