Why are the input/output dimensions of an LSTM oriented the way they are?

RNNs run a loop that unrolls the time dimension, so with a time-major layout each step reads one contiguous block of memory. How much that matters depends on the implementation, the device, and the tensor size (i.e. whether a step fits in cache).
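A minimal NumPy sketch of the layout difference (the sizes `T`, `B`, `F` are arbitrary illustrative values): in a time-major tensor of shape `(T, B, F)`, slicing out one timestep yields a contiguous block, while in a batch-major tensor of shape `(B, T, F)` the same per-timestep read is strided.

```python
import numpy as np

T, B, F = 5, 3, 4  # illustrative sizes: time steps, batch, features

# Time-major layout: shape (T, B, F). In row-major (C) order, each
# timestep slice x[t] is one contiguous slab of B*F values.
time_major = np.zeros((T, B, F))
print(time_major[0].flags["C_CONTIGUOUS"])      # True: contiguous slab

# Batch-major layout: shape (B, T, F). Reading one timestep across the
# whole batch gathers B strided chunks of F values each.
batch_major = np.zeros((B, T, F))
print(batch_major[:, 0].flags["C_CONTIGUOUS"])  # False: strided access
```

Frameworks that default to time-major (or expose an option for it, such as a `batch_first`-style flag) are exploiting exactly this: the per-step slice the recurrence loop touches is a single cache-friendly block.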