Support for implementing Meta-Learning Algorithm

When I implemented a meta-learning algorithm called “Reptile”, I got the following error:

```
File "/home/asichurter/anaconda3/lib/python3.7/site-packages/torch/nn/modules/", line 532, in __call__
    result = self.forward(*input, **kwargs)
File "/home/asichurter/GitHub/APISeqFewShot/models/", line 163, in forward
    grads = t.autograd.grad(loss, self.Learner.parameters())
File "/home/asichurter/anaconda3/lib/python3.7/site-packages/torch/autograd/", line 157, in grad
    inputs, allow_unused)
RuntimeError: cudnn RNN backward can only be called in training mode
```

I do know what I am doing here: I need to backpropagate through RNN models such as LSTMs and compute their gradients, because this kind of meta-learning algorithm needs these gradients to adapt even at inference time! Unfortunately, this does not seem to be supported by PyTorch so far (I use PyTorch 1.4 and CUDA 10.2 with the corresponding cuDNN on Ubuntu 18.04). One more thing: even though calling `model.train()` makes this error go away, I cannot actually do that, because my architecture contains normalization layers like Layer Normalization, and putting them in training mode causes problems (they keep tracking statistics during evaluation).
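One way to sidestep the all-or-nothing `model.train()` issue is to set only the RNN submodule to training mode while the rest of the model stays in eval mode. The sketch below illustrates this with a hypothetical toy model (the `TinyModel` class, its sizes, and the input are made up for illustration; whether this is acceptable depends on whether training-mode behavior of the RNN itself is tolerable in your setting):

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    """Hypothetical model mixing an LSTM with LayerNorm, for illustration only."""
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=8, hidden_size=8, batch_first=True)
        self.norm = nn.LayerNorm(8)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.norm(out)

model = TinyModel()
model.eval()        # put the whole model in eval mode...
model.lstm.train()  # ...but flip only the RNN back to training mode,
                    # so cudnn RNN backward is allowed on GPU while the
                    # normalization layer keeps its eval-mode behavior

x = torch.randn(2, 5, 8)
loss = model(x).sum()
grads = torch.autograd.grad(loss, model.parameters())
```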

To conclude, it seems that PyTorch cannot yet fully support some key features needed to implement meta-learning algorithms (another problem I encountered was being unable to compute higher-order derivatives of RNN models). I hope this can be improved!



This is a limitation of cuDNN, which does not implement the RNN backward pass in evaluation mode.
You can disable cuDNN during the forward computation of that RNN, which falls back to our own implementation that does support it:

```python
with torch.backends.cudnn.flags(enabled=False):
    # Your rnn forward code
```
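To make the workaround concrete, here is a minimal, self-contained sketch using `torch.backends.cudnn.flags` (note the plural `flags`) around an LSTM forward pass so that gradients can be taken while the module stays in eval mode; the LSTM sizes and input are arbitrary example values:

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=4, hidden_size=4, batch_first=True)
lstm.eval()  # inference mode, as in a meta-learning adaptation step

x = torch.randn(3, 6, 4)

# Disable cuDNN only for this forward pass; the native PyTorch
# implementation supports backward in eval mode.
with torch.backends.cudnn.flags(enabled=False):
    out, _ = lstm(x)

loss = out.sum()
grads = torch.autograd.grad(loss, lstm.parameters())
```

The context manager restores the previous cuDNN setting on exit, so only the RNN forward (and the backward recorded for it) avoids cuDNN; the rest of the model can still benefit from cuDNN kernels.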