Dear all,
I trained two neural networks using PyTorch: one with torch.nn.LSTM and the other with torch.nn.GRU. Both models have identical structures and hyperparameters (same number of layers, neurons, etc.). I wanted to test the prediction speed of these models on my laptop (Dell XPS 15, Intel Core i7-10750H CPU, NVIDIA GeForce GTX 1650 Ti GPU).
When running on the GPU via CUDA, the prediction times are similar, with the LSTM model being slightly slower than the GRU model (0.893404 ms vs. 0.78311 ms on average). On the CPU, however, the prediction times differ significantly: the LSTM model is much faster than the GRU model (0.849297 ms vs. 3.635143 ms). Interestingly, the LSTM model's prediction on the CPU is even faster than on the GPU.
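For context, here is a minimal sketch of the kind of timing loop I'm using (the input/hidden sizes, number of layers, sequence length, and batch size below are placeholders, not my actual hyperparameters; timings skip a warm-up phase and call torch.cuda.synchronize() before reading the clock on the GPU):

```python
import time
import torch
import torch.nn as nn

# Placeholder hyperparameters -- both models share the same values.
INPUT_SIZE, HIDDEN_SIZE, NUM_LAYERS = 16, 128, 2
SEQ_LEN, BATCH = 50, 1

lstm = nn.LSTM(INPUT_SIZE, HIDDEN_SIZE, NUM_LAYERS, batch_first=True)
gru = nn.GRU(INPUT_SIZE, HIDDEN_SIZE, NUM_LAYERS, batch_first=True)

def time_forward(model, device, n_runs=1000, warmup=50):
    model = model.to(device).eval()
    x = torch.randn(BATCH, SEQ_LEN, INPUT_SIZE, device=device)
    with torch.no_grad():
        for _ in range(warmup):          # warm-up runs, not timed
            model(x)
        if device.type == "cuda":
            torch.cuda.synchronize()     # wait for queued kernels to finish
        start = time.perf_counter()
        for _ in range(n_runs):
            model(x)
        if device.type == "cuda":
            torch.cuda.synchronize()
    return (time.perf_counter() - start) * 1000 / n_runs  # ms per prediction

devices = ["cpu"] + (["cuda"] if torch.cuda.is_available() else [])
for name, model in [("LSTM", lstm), ("GRU", gru)]:
    for dev_name in devices:
        dev = torch.device(dev_name)
        print(f"{name} on {dev_name}: {time_forward(model, dev):.6f} ms")
```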
I also tested the models on another computer with a GPU, and while the absolute times varied, the pattern was the same: similar performance on the GPU, and the LSTM much faster than the GRU on the CPU.
How is this possible? Is PyTorch utilizing some sort of optimization for LSTM on the CPU?
Thanks in advance!