Hello All,
I am new to PyTorch and trying to compute the gradient over the whole dataset with respect to the given model parameters. Can someone please tell me if the function below is correct?
I have seen a few related posts, but they did not solve the problem I am facing.
Thank You in advance
def computeFullDataGradient(loader, model, criterion, cuda=True):
    """Compute the gradient of the mean loss over the entire dataset.

    Assumes ``criterion`` averages over the batch (``reduction='mean'``,
    the PyTorch default). Each batch's mean gradient is re-weighted by
    the batch's *actual* size so the result equals the gradient of the
    mean loss over all ``len(loader.dataset)`` samples, even when the
    final batch is smaller than ``loader.batch_size``.

    Args:
        loader: ``DataLoader`` yielding ``(input, target)`` pairs.
        model: module whose ``parameters()`` the gradient is taken w.r.t.
        criterion: loss function with mean reduction over the batch.
        cuda: move each batch to the GPU when True.

    Returns:
        List of tensors (one per ``model.parameters()``) holding the
        full-dataset gradient.
    """
    data_size = len(loader.dataset)
    # Accumulator initialized up front — no need for the i == 0 special case.
    full_gradVec = [torch.zeros_like(p) for p in model.parameters()]
    for inputs, targets in loader:
        if cuda:
            inputs = inputs.cuda(non_blocking=True)
            targets = targets.cuda(non_blocking=True)
        # Variable() is deprecated since PyTorch 0.4: tensors are autograd-aware.
        loss = criterion(model(inputs), targets)
        batch_gradVec = torch.autograd.grad(loss, model.parameters())
        # BUG FIX: weight by the ACTUAL batch size. The original used the
        # fixed loader.batch_size, which over-weights the (smaller) last
        # batch whenever data_size % batch_size != 0.
        weight = inputs.size(0) / data_size
        for acc, grad in zip(full_gradVec, batch_gradVec):
            acc += weight * grad
    return full_gradVec