Why is PyTorch massively faster than Torch?

Hi guys,

I have a question, as the title suggests. Previously I used Torch to train a small network with 2 LSTM layers, each with 16 memory cells, and the time needed to go through all of my training data once was about 1 to 2 hours on GPU.

Now I have switched to PyTorch, and training the same network on the same training data on the same GPU takes only about 7 minutes per epoch. So I was wondering: what kinds of changes have you made in PyTorch that make it so much faster than Torch7 in this particular case with LSTMs?
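For reference, this is roughly the kind of network I mean; a minimal sketch in PyTorch, where the input size, batch size, and sequence length are just placeholder assumptions:

```python
import torch
import torch.nn as nn

# Two stacked LSTM layers with 16 hidden units each,
# matching the architecture described above.
model = nn.LSTM(input_size=32, hidden_size=16, num_layers=2).cuda()

x = torch.randn(100, 8, 32).cuda()  # (seq_len, batch, input_size)
output, (h_n, c_n) = model(x)
print(output.shape)  # torch.Size([100, 8, 16])
```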

I have tested the model trained in PyTorch and it behaves sensibly, so I think I implemented my code correctly and there is no silly mistake inflating the speedup.

Cheers,
Shuokai

I'd say it might be because PyTorch uses the cuDNN LSTM and also hand-fused kernels for LSTM, but I wouldn't expect it to be that much faster. That's good news, then! 🙂

Hi @fmassa,

Yeah, I realized that I was not previously using NVIDIA's cuDNN with my Torch model. But PyTorch uses cuDNN by default on the GPU, right?

Cheers,
Shuokai

Yes, it uses cuDNN whenever possible.
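If you want to double-check, the cuDNN backend state is exposed under `torch.backends.cudnn`; a quick sketch:

```python
import torch

# Check whether cuDNN is available and enabled (it is on by default).
print(torch.backends.cudnn.is_available())  # True if cuDNN can be used
print(torch.backends.cudnn.enabled)         # True by default

# To force the non-cuDNN path, e.g. for a timing comparison:
torch.backends.cudnn.enabled = False
```

Disabling it like this is a handy way to measure how much of the speedup in your case actually comes from cuDNN.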