I understand how to use `nn.LSTM` correctly, and I do have access to GPUs, so it might seem weird that I need LSTM training to go faster.
I’m trying to apply recurrent layers to reinforcement learning. On standard RL environments, training a non-recurrent agent takes 200k to 1 million gradient updates. Doing the same number of updates with a recurrent agent takes far too long, because each update backpropagates through time over a window of 200 to 1000 timesteps.
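For concreteness, here's a minimal sketch of the kind of update I'm talking about (the dimensions and the regression loss are made up for illustration, not my actual agent):

```python
import torch
import torch.nn as nn

# Made-up dimensions and loss, just to illustrate the cost of one update:
# the whole BPTT window sits in a single computation graph.
window, batch, obs_dim, hidden_dim = 500, 32, 16, 128
device = "cuda" if torch.cuda.is_available() else "cpu"

lstm = nn.LSTM(obs_dim, hidden_dim).to(device)
head = nn.Linear(hidden_dim, 1).to(device)
opt = torch.optim.Adam(list(lstm.parameters()) + list(head.parameters()))

obs = torch.randn(window, batch, obs_dim, device=device)
targets = torch.randn(window, batch, 1, device=device)

# One gradient update: both the forward pass and backward() scale with
# the window length, which is what makes 200-1000 steps so slow.
out, _ = lstm(obs)
loss = nn.functional.mse_loss(head(out), targets)
opt.zero_grad()
loss.backward()
opt.step()
```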
Are there ways to speed up LSTM training (even 2x or 3x would help tremendously), apart from (1) using `nn.LSTM` instead of `nn.LSTMCell` and (2) using a GPU instead of a CPU?
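For reference, here's a minimal sketch of what I mean by (1), with made-up dimensions; the fused `nn.LSTM` call already replaces the per-timestep `nn.LSTMCell` loop:

```python
import torch
import torch.nn as nn

# Made-up dimensions, just to spell out what I mean by (1).
seq_len, batch, in_dim, hid = 200, 32, 16, 128
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(seq_len, batch, in_dim, device=device)

# nn.LSTM: the whole sequence is handled in one (cuDNN-fused) call.
lstm = nn.LSTM(in_dim, hid).to(device)
out, (h_n, c_n) = lstm(x)

# nn.LSTMCell: a Python loop with one kernel launch per timestep,
# which is the slower pattern I'm already avoiding.
cell = nn.LSTMCell(in_dim, hid).to(device)
h_t = torch.zeros(batch, hid, device=device)
c_t = torch.zeros(batch, hid, device=device)
for t in range(seq_len):
    h_t, c_t = cell(x[t], (h_t, c_t))
```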