I have implemented the
pack_padded_sequence version of the SNLI classifier based on the code from examples: https://github.com/pytorch/examples/tree/master/snli.
My code is here: https://github.com/OanaCamburu/SNLI
The running time per epoch (all else being equal) is about 4 times longer, while performance has essentially remained the same.
Regarding the longer running time, I suspect it is because there is more switching between CPU and GPU, since the necessary sortings (https://github.com/OanaCamburu/SNLI/blob/master/train_packedseq.py#L20 and https://github.com/OanaCamburu/SNLI/blob/master/train_packedseq.py#L36) are executed on the CPU. Is it possible to make the code more efficient? Is there any intention to add
pack_padded_sequence support directly to the RNN modules in future releases?
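For reference, here is a minimal sketch of the pattern I have in mind: keeping the length-sort as a tensor operation via `torch.sort` (which can run on whatever device the lengths tensor lives on), rather than dropping to Python lists on the CPU, and then unsorting the outputs to restore the original batch order. The batch shapes, the pad index, and the model dimensions below are made up for illustration; this is not my actual training code.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Hypothetical padded batch: (batch, max_len) token indices, 0 = padding.
torch.manual_seed(0)
batch, max_len, emb_dim, hidden = 4, 7, 8, 16
lengths = torch.tensor([7, 3, 5, 2])
padded = torch.randint(1, 100, (batch, max_len))
for i, l in enumerate(lengths):
    padded[i, l:] = 0

# Sort by length (descending, as pack_padded_sequence requires) using
# torch.sort on the tensor itself instead of a CPU-side Python sort.
sorted_lengths, sort_idx = lengths.sort(descending=True)
sorted_seqs = padded[sort_idx]

embed = nn.Embedding(100, emb_dim, padding_idx=0)
rnn = nn.LSTM(emb_dim, hidden, batch_first=True)

packed = pack_padded_sequence(
    embed(sorted_seqs), sorted_lengths.tolist(), batch_first=True
)
packed_out, (h_n, c_n) = rnn(packed)
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)

# Invert the permutation so outputs line up with the original batch order.
_, unsort_idx = sort_idx.sort()
out = out[unsort_idx]          # (batch, max_len, hidden)
h_last = h_n[-1][unsort_idx]   # (batch, hidden) final hidden states
```

The unsort step matters because the loss is computed against labels in the original batch order; forgetting it silently misaligns examples and labels.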
As for the lack of improvement in performance, I know this wasn't guaranteed, but I still expected a bit of an increase. I'm curious whether anyone has experienced the same behaviour in other tasks.