[RNNCell vs RNN]

j-min · June 9, 2017, 8:56am

What is the better way to implement an RNN decoder?
I used to work with TensorFlow, so I am familiar with implementing an RNN decoder by calling an RNNCell at each unrolling step.
However, many implementations, including the official seq2seq tutorial, seem to call RNN at each time step with an input whose seq_len is 1.
As I understand it, RNN corresponds to TensorFlow's tf.nn.dynamic_rnn or tf.nn.static_rnn, which internally call an RNNCell for the given number of unrolling steps.
I am curious whether calling RNN multiple times makes unrolling slower than calling RNNCell multiple times.
I found that OpenNMT-py uses RNNCell for decoding.
Do people even use RNNCells in PyTorch?

smth · June 22, 2017, 3:40am

People do use RNNCell, but if your computation fits in the regime of RNN, people largely prefer that (because RNN uses cuDNN and is ~3 to 4 times faster).
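
For readers comparing the two decoding styles discussed here, the following is a minimal sketch (not from the original posts) of stepping `nn.GRU` with a seq_len of 1 versus stepping `nn.GRUCell`. The sizes and variable names are made up for illustration, and the weights are copied from the layer into the cell only so the two can be compared. It checks that the results match on CPU; it does not measure speed.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Illustrative sizes only (assumptions, not from the thread).
batch, input_size, hidden_size, steps = 4, 8, 16, 5

gru = nn.GRU(input_size, hidden_size)       # layer: input is (seq_len, batch, input_size)
cell = nn.GRUCell(input_size, hidden_size)  # cell: input is (batch, input_size)

# Copy the single-layer GRU's parameters into the cell so both compute the same function.
with torch.no_grad():
    cell.weight_ih.copy_(gru.weight_ih_l0)
    cell.weight_hh.copy_(gru.weight_hh_l0)
    cell.bias_ih.copy_(gru.bias_ih_l0)
    cell.bias_hh.copy_(gru.bias_hh_l0)

x = torch.randn(steps, batch, input_size)
h_layer = torch.zeros(1, batch, hidden_size)  # (num_layers, batch, hidden_size)
h_cell = torch.zeros(batch, hidden_size)      # (batch, hidden_size)

with torch.no_grad():
    for t in range(steps):
        # Style 1: call the layer with a length-1 sequence at each step.
        _, h_layer = gru(x[t:t + 1], h_layer)
        # Style 2: call the cell once per step.
        h_cell = cell(x[t], h_cell)
    # Style 3: hand the whole sequence to the layer in one call.
    _, h_full = gru(x)

same_stepwise = torch.allclose(h_layer[0], h_cell, atol=1e-5)
same_full = torch.allclose(h_full[0], h_cell, atol=1e-5)
print(same_stepwise, same_full)
```

The single call over the whole sequence is only possible when every input is known up front, which is one reading of a computation that "fits in the regime of RNN". A decoder that feeds its own prediction back in has to step one token at a time with either style.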

beneyal · November 6, 2018, 1:23pm

Could you please clarify what you mean by “the regime of RNN”? I still don’t really understand the difference between the RNN/LSTM/GRUCell family and the “simple” RNN/LSTM/GRU layers…

Thanks!

Mostafa_Elhoushi · June 29, 2020, 5:33pm

Not sure about the definition of RNNCell in PyTorch, but quoting from the TF tutorial on RNNs:

In addition to the built-in RNN layers, the RNN API also provides cell-level APIs. Unlike RNN layers, which processes whole batches of input sequences, the RNN cell only processes a single timestep.
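
The same layer-versus-cell split appears to hold in PyTorch. As a rough illustration (sizes are arbitrary assumptions, and the default time-major layout is used), `nn.LSTM` takes a whole sequence while `nn.LSTMCell` takes one timestep:

```python
import torch
import torch.nn as nn

# Arbitrary sizes chosen for illustration.
seq_len, batch, input_size, hidden_size = 7, 3, 10, 20
x = torch.randn(seq_len, batch, input_size)

# Layer: consumes the whole sequence and returns one output per timestep.
lstm = nn.LSTM(input_size, hidden_size)
output, (h_n, c_n) = lstm(x)
layer_shape = tuple(output.shape)  # (seq_len, batch, hidden_size)

# Cell: consumes a single timestep and returns the next hidden and cell state.
lstm_cell = nn.LSTMCell(input_size, hidden_size)
h, c = lstm_cell(x[0])             # initial state defaults to zeros
cell_shape = tuple(h.shape)        # (batch, hidden_size)

print(layer_shape, cell_shape)
```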

RoeAnt · September 22, 2022, 1:27pm

I understand the TensorFlow approach. But how can we easily wrap a custom RNNCell in PyTorch to create an RNN?
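
One possible approach, sketched here under assumptions rather than taken from any PyTorch API: write a small module that loops the cell over the time dimension. `CellRNN` and `MyCell` are hypothetical names. The wrapper assumes the cell takes `(x_t, h)`, returns the new `h`, and exposes a `hidden_size` attribute (as `nn.RNNCell` and `nn.GRUCell` do). A cell that carries a tuple state, like `nn.LSTMCell`, would need the loop adapted.

```python
import torch
import torch.nn as nn


class MyCell(nn.Module):
    """Hypothetical custom cell: h' = tanh(W [x_t, h] + b)."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.linear = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x_t, h):
        return torch.tanh(self.linear(torch.cat([x_t, h], dim=1)))


class CellRNN(nn.Module):
    """Hypothetical wrapper that unrolls a single-state cell over time."""

    def __init__(self, cell):
        super().__init__()
        self.cell = cell

    def forward(self, x, h=None):
        # x: (seq_len, batch, input_size), the same default layout as nn.RNN.
        if h is None:
            h = x.new_zeros(x.size(1), self.cell.hidden_size)
        outputs = []
        for x_t in x.unbind(0):      # iterate over timesteps
            h = self.cell(x_t, h)
            outputs.append(h)
        # Stack per-step hidden states into (seq_len, batch, hidden_size).
        return torch.stack(outputs, dim=0), h


rnn = CellRNN(MyCell(input_size=8, hidden_size=16))
out, h_last = rnn(torch.randn(5, 4, 8))
print(tuple(out.shape), tuple(h_last.shape))
```

A plain Python loop like this does not get the cuDNN speedup mentioned earlier in the thread, so it is mainly useful when the cell's computation is something the built-in layers cannot express.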