RuntimeError: invalid gradient when truncating parameters during early stopping

Hi,

I have an ensemble of neural networks that uses early stopping: when a network is done learning, it is removed from the active set of networks being trained. A working example of how the weights are removed in PyTorch 0.3 can be seen below. Essentially, `p.data` and `p.grad.data` are truncated by removing the appropriate weights and gradients.

However, in PyTorch 0.4, Autograd now checks whether the expected gradient shape matches the returned one, which makes the script fail. The error given is

    RuntimeError: Function <function_name> returned an invalid gradient at index 0 - expected shape [x, y] but got [z, w]
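
If it helps, here is a minimal sketch (with made-up shapes and indices) that seems to trigger the same check. As far as I can tell, the problem is that an older graph keeps the parameter's cached gradient accumulator alive, and that accumulator still expects the old shape:

    import torch

    w = torch.nn.Parameter(torch.randn(4, 3))

    loss1 = (w * 2).sum()    # this graph holds on to w's gradient accumulator,
                             # which remembers the shape [4, 3]
    w.data = w.data[[0, 2]]  # truncate w to shape [2, 3]

    loss2 = (w * 3).sum()    # the new graph reuses the stale accumulator
    loss2.backward()         # RuntimeError: ... expected shape [4, 3] but got [2, 3]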

What is the proper way of “truncating” or modifying the parameters such that Autograd expects the correct shape?

Thanks

The truncation code:

    import warnings

    import torch

    for group in optimizer.param_groups:
        for p in group['params']:
            # Truncate the per-parameter optimizer state (e.g. Adam's
            # exp_avg / exp_avg_sq buffers) to the surviving indices.
            state = optimizer.state[p]
            for k, v in state.items():
                if isinstance(v, (torch.FloatTensor, torch.cuda.FloatTensor)):
                    state[k] = v[indices]
            # Truncate the weights themselves, and their gradients.
            p.data = p.data[indices]
            if p.grad is None:
                warnings.warn("Trying to remove weights on a parameter with a None grad")
            else:
                p.grad.data = p.grad.data[indices]
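
For what it's worth, the only workaround I have come up with so far is to stop mutating `p.data` altogether and instead build a brand-new, smaller Parameter, move the truncated optimizer state over to it, and swap it into both the module and the optimizer's param groups. A rough sketch is below (`shrink_parameter` is just a name I made up; it would be called once per parameter of a finished network, e.g. `shrink_parameter(net.fc, 'weight', optimizer, indices)`). It feels heavy-handed, so I would be happy to learn the intended approach:

    import torch

    def shrink_parameter(module, name, optimizer, indices):
        """Replace module.<name> with a freshly created, smaller Parameter so
        that autograd rebuilds its bookkeeping with the new shape."""
        old = getattr(module, name)
        new = torch.nn.Parameter(old.data[indices].clone())

        # Move the optimizer state over, truncating tensors shaped like the
        # parameter (e.g. Adam's exp_avg / exp_avg_sq) and copying the rest.
        state = optimizer.state.pop(old, {})
        optimizer.state[new] = {
            k: v[indices].clone() if torch.is_tensor(v) and v.shape == old.shape else v
            for k, v in state.items()
        }

        # Swap the new parameter into the module and the optimizer's param groups.
        setattr(module, name, new)
        for group in optimizer.param_groups:
            group['params'] = [new if q is old else q for q in group['params']]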