Volatile has no effect. Use `with torch.no_grad():` instead

I was running a GitHub repository and got this warning: "volatile was removed and now has no effect. Use `with torch.no_grad():` instead".
Here the author has written `volatile=False`, i.e. he is not using gradients at training time, but as shown in the code, he is using gradients in the `evaluate_model` function.

def to_var(x, volatile=False):
    if torch.cuda.is_available():
        x = x.cuda()
    return Variable(x, volatile=volatile)

a = time.time()
num_epochs = 100
losses = []
for epoch in range(num_epochs):
    for i, (inputs, targets) in enumerate(train_dl):     
        inputs = to_var(inputs)
#         inputs2 = to_var(inputs2)
#         inputs3 = to_var(inputs3)
        targets = to_var(targets)
        
        inputs1=inputs[:,0,:,:]
        inputs1=inputs1.resize(inputs1.shape[0],1,64,32)
        inputs2=inputs[:,1,:,:]
        inputs2=inputs2.resize(inputs2.shape[0],1,64,32)
        inputs3=inputs[:,2,:,:]
        inputs3=inputs3.resize(inputs3.shape[0],1,64,32)
        
        # forward pass
        optimizer.zero_grad()
        outputs = model(inputs1,inputs2,inputs3)

        # loss
        loss = criterion(outputs, targets)
        losses += [loss.data[0]]


        # backward pass
        loss.backward()

        
        # update parameters
        optimizer.step()

        
        # report
        if (i + 1) % 50 == 0:
            print('Epoch [%2d/%2d], Step [%3d/%3d], Loss: %.4f'
                  % (epoch + 1, num_epochs, i + 1, len(train_ds) // batch_size, loss.data[0]))
            
b = time.time()
print('Total Time of Training {:.1f}s'.format(b - a))

def evaluate_model(model, dataloader):
    model.eval()  # for batch normalization layers
    corrects = 0
    for inputs, targets in dataloader:
        inputs, targets = to_var(inputs, True), to_var(targets, True)
#         targets = to_var(targets)
        
        inputs1=inputs[:,0,:,:]
        inputs1=inputs1.resize(inputs1.shape[0],1,64,32)
        inputs2=inputs[:,1,:,:]
        inputs2=inputs2.resize(inputs2.shape[0],1,64,32)
        inputs3=inputs[:,2,:,:]
        inputs3=inputs3.resize(inputs3.shape[0],1,64,32)
        
        outputs = model(inputs1,inputs2,inputs3)
        _, preds = torch.max(outputs.data, 1)
        corrects += (preds == targets.data).sum()
        
    zz=len(dataloader.dataset)
    
    print('accuracy: {:.2f}'.format(100. * corrects / len(dataloader.dataset)))
    print('corrects: {:.2f}'.format(corrects))
    print('Total: {:.2f}'.format(zz))

evaluate_model(model, train_dl)
evaluate_model(model, test_dl)

Now I have changed it like this:

def to_var(x, volatile=False):
    with torch.no_grad():
        if torch.cuda.is_available():
            x = x.cuda()
    return x

Although it is not giving an error, is this the right way? Because using `torch.no_grad()` means gradients will not be computed.

Does anyone know?

`volatile=True` disabled gradient calculation, and since the author passes `volatile=False` (the default) during training, they want gradients to be computed in that code snippet. You should therefore remove the `no_grad()` context from `to_var`.
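
For example, a minimal sketch of your `to_var` with the `no_grad()` context removed (the `volatile` argument is only kept here so existing calls like `to_var(inputs, True)` don't break; it is ignored):

def to_var(x, volatile=False):
    # 'volatile' is ignored; gradients stay enabled, so loss.backward()
    # in the training loop keeps working
    if torch.cuda.is_available():
        x = x.cuda()
    return x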

Oh! I misunderstood this concept.
`volatile=True` means gradients are not computed.
`volatile=False` means gradients are computed.
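
So in current PyTorch the same distinction would look roughly like this (a small check with made-up tensor names, just to confirm my understanding):

import torch

w = torch.randn(3, requires_grad=True)

y = w * 2
print(y.requires_grad)   # True  -> like volatile=False, gradients are tracked

with torch.no_grad():
    z = w * 2
print(z.requires_grad)   # False -> like volatile=True, gradients are not tracked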

Sir, if I remove `torch.no_grad()` and keep the original `Variable(x, volatile=volatile)` call, then it throws the warning message "volatile was removed and now has no effect. Use `with torch.no_grad():` instead" again.

  • During the training phase, `to_var(inputs)` and `to_var(targets)` are used without specifying the `volatile` argument, so the default `volatile=False` applies, which means gradients are enabled for these tensors.
  • During the evaluation phase, `to_var(inputs, True)` and `to_var(targets, True)` are used, which means gradients are not calculated for them.

As gradients need to be computed during the training phase and disabled during the validation/testing phase, Sir, how should I tackle this?

Use plain tensors during training and wrap your validation loop in a `with torch.no_grad():` context. Remove the `Variable` usage, as it has been deprecated since PyTorch 0.4.
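
A rough sketch of what the evaluation function could look like under that approach (same `model`, `dataloader` and 64x32 shapes as in your snippet; this is an illustration, not the author's exact code):

def evaluate_model(model, dataloader):
    model.eval()  # switch batch norm / dropout layers to eval behaviour
    corrects = 0
    with torch.no_grad():  # replaces volatile=True: no graph is built
        for inputs, targets in dataloader:
            if torch.cuda.is_available():
                inputs, targets = inputs.cuda(), targets.cuda()

            inputs1 = inputs[:, 0, :, :].reshape(-1, 1, 64, 32)
            inputs2 = inputs[:, 1, :, :].reshape(-1, 1, 64, 32)
            inputs3 = inputs[:, 2, :, :].reshape(-1, 1, 64, 32)

            outputs = model(inputs1, inputs2, inputs3)
            _, preds = torch.max(outputs, 1)
            corrects += (preds == targets).sum().item()

    total = len(dataloader.dataset)
    print('accuracy: {:.2f}'.format(100. * corrects / total))

In the training loop the only other changes are to drop the `Variable` wrapper (just move the tensors with `.cuda()`) and to read the loss with `loss.item()` instead of `loss.data[0]`.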

Okay, Sir. Thank you, I will try it.