RNN time complexity issue


I’m trying to train my RNN on a dataset of 540 MB of text, with a validation set of 26 MB of text. The problem is how long this will take to train fully. For each individual epoch, the number of characters the network has to process to traverse the entire dataset would be:

26 MB x (540 MB / 10) + 540 MB = 1.4e15

With a batch size of 128 and a sequence length of 100, each forward pass takes about one second. So the total training time for one epoch would be:

1.4e15 / (128 x 100) / (3600 x 24 x 365) = 3,478 years
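
To double-check the arithmetic, here is the same estimate as a small Python script. It is only a sketch of the two formulas above: it assumes 1 MB = 1e6 characters and exactly one second per forward pass, and the variable names are made up for illustration.

```python
# Back-of-the-envelope estimate, mirroring the two formulas above.
# Assumption: 1 MB of text is 1e6 characters.
MB = 1e6
train_size = 540 * MB   # training set
val_size = 26 * MB      # validation set

# Work per epoch: 26 MB x (540 MB / 10) + 540 MB
work_per_epoch = val_size * (train_size / 10) + train_size

batch_size = 128
seq_len = 100
seconds_per_pass = 1.0  # measured roughly, treated as exact here

# Each forward pass covers batch_size * seq_len characters.
forward_passes = work_per_epoch / (batch_size * seq_len)
seconds_per_year = 3600 * 24 * 365
years = forward_passes * seconds_per_pass / seconds_per_year

print(f"work per epoch: {work_per_epoch:.3e}")
print(f"forward passes: {forward_passes:.3e}")
print(f"years per epoch: {years:,.0f}")
```

Almost all of the total comes from the first term (validation size times training size / 10); the 540 MB term is negligible next to it.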

So it seems that training this way is not very practical. I can make the forward pass faster by decreasing the batch size, but that makes each batch cover less ground, so the net result seems to be worse instead of better.

Is there some way for PyTorch to speed up training with multithreading? I have an i7 CPU, but I’ve never had to make use of its multiple cores before.

My architecture is two layers of LSTM, size 512 each.
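
For reference, a minimal sketch of that architecture in PyTorch. Only the two LSTM layers of size 512, the batch size of 128, and the sequence length of 100 come from the description above. The class name `CharLSTM`, the vocabulary size of 128, the embedding layer, and the linear output layer are assumptions added to make the example self-contained.

```python
import torch
import torch.nn as nn


class CharLSTM(nn.Module):
    """Two stacked LSTM layers of size 512 (hypothetical wrapper)."""

    def __init__(self, vocab_size=128, hidden_size=512, num_layers=2):
        super().__init__()
        # Assumed: characters are embedded before entering the LSTM.
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size,
                            num_layers=num_layers, batch_first=True)
        # Assumed: a linear layer maps hidden states to per-character logits.
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, state=None):
        hidden, state = self.lstm(self.embed(x), state)
        return self.out(hidden), state


model = CharLSTM()
# One batch shaped like the one described: 128 sequences of 100 characters.
batch = torch.randint(0, 128, (128, 100))
with torch.no_grad():
    logits, _ = model(batch)
print(tuple(logits.shape))
```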