I implemented GloVe in PyTorch: https://gist.github.com/MatthieuBizien/de26a7a2663f00ca16d8d2558815e9a6
I focused on GPU speed:
- No large data transfers between the CPU and the GPU during training
- All embeddings are stored in a matrix
- Large batch size
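
To illustrate those three points, here is a minimal sketch. It is not the code from the gist: the names `glove_loss` and `train`, the hyperparameters, and the choice of Adagrad are assumptions for the example. It expects the co-occurrence data as three 1-D tensors (row index, column index, count), moves them to the device once, and then only slices them there.

```python
import torch


def glove_loss(w, w_ctx, b, b_ctx, rows, cols, counts, x_max=100.0, alpha=0.75):
    """Weighted least-squares GloVe loss for a batch of (row, col, count) triples."""
    # Dot product of focal and context vectors, plus both bias terms.
    pred = (w[rows] * w_ctx[cols]).sum(dim=1) + b[rows] + b_ctx[cols]
    # Weighting f(x) = min(1, x / x_max) ** alpha, as in the GloVe paper.
    weight = torch.clamp(counts / x_max, max=1.0) ** alpha
    return (weight * (pred - torch.log(counts)) ** 2).mean()


def train(rows, cols, counts, vocab_size, dim=50, batch_size=4096,
          epochs=5, lr=0.05, seed=0):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    gen = torch.Generator().manual_seed(seed)

    # Copy the whole co-occurrence list to the device once, before training.
    rows, cols, counts = rows.to(device), cols.to(device), counts.to(device)

    def param(*shape):
        # Small random init, created on CPU for reproducibility, then moved.
        return (torch.randn(*shape, generator=gen) * 0.1).to(device).requires_grad_()

    # Each set of embeddings is one dense matrix indexed by word id.
    w, w_ctx = param(vocab_size, dim), param(vocab_size, dim)
    b, b_ctx = param(vocab_size), param(vocab_size)
    opt = torch.optim.Adagrad([w, w_ctx, b, b_ctx], lr=lr)

    n = counts.shape[0]
    losses = []
    for _ in range(epochs):
        # Only the shuffled index vector is sent to the device each epoch.
        perm = torch.randperm(n, generator=gen).to(device)
        total = 0.0
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            opt.zero_grad()
            loss = glove_loss(w, w_ctx, b, b_ctx, rows[idx], cols[idx], counts[idx])
            loss.backward()
            opt.step()
            # .item() copies a single scalar back to the CPU per batch.
            total += loss.item() * idx.shape[0]
        losses.append(total / n)

    # Summing focal and context vectors is the usual GloVe convention.
    return (w + w_ctx).detach(), losses
```

With real data, `counts` would hold the non-zero entries of the co-occurrence matrix as floats; a larger `batch_size` means fewer, bigger kernel launches per epoch.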
Training on the newsgroup dataset takes 15 s on a GTX 1080. The original implementation, https://github.com/2014mchidamb/TorchGlove, needs multiple hours.
Feedback welcome!