I'm training a DAE (denoising autoencoder) with an adaptive learning rate using a time-decay factor of 0.99, plus Nesterov's accelerated gradient. Is there a direct PyTorch optimizer option, or a workaround, that lets me do this?
What do you mean, torch.optim.SGD supports Nesterov momentum already?
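A minimal sketch of how the two pieces combine: `torch.optim.SGD` does take `nesterov=True` (it requires a nonzero `momentum` and zero `dampening`), and the 0.99 time decay can be layered on with `torch.optim.lr_scheduler.ExponentialLR`. The tiny autoencoder, layer sizes, noise level, and learning rate below are illustrative assumptions, not anything from the original question.

```python
import torch
import torch.nn as nn

# Stand-in for a DAE: sizes here are arbitrary placeholders.
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 8))

# nesterov=True needs momentum > 0 and dampening == 0.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, nesterov=True)

# Multiplies the learning rate by gamma after every scheduler.step().
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.99)

x = torch.randn(16, 8)
noisy = x + 0.1 * torch.randn_like(x)   # denoising-style corrupted input
loss_fn = nn.MSELoss()

for epoch in range(3):
    optimizer.zero_grad()
    loss = loss_fn(model(noisy), x)     # reconstruct clean x from noisy input
    loss.backward()
    optimizer.step()
    scheduler.step()                    # apply the 0.99 time decay once per epoch

print(optimizer.param_groups[0]["lr"])  # lr after three decay steps
```

Calling `scheduler.step()` once per epoch gives the per-epoch decay `lr = 0.1 * 0.99**epoch`; call it per batch instead if the decay is meant to be per iteration.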