Large performance regression on `0.4.1`

I have a fairly optimized CNN-BLSTM-CRF tagger here, with the CRF defined here.

On PyTorch 0.4.0 with CUDA 9.0 and cuDNN 7102, I can run a single epoch of the CoNLL 2003 NER task in 21.41 +/- 0.28.

When the only thing I change is the PyTorch version, to 0.4.1 (the current conda install), a single epoch takes 27.99 +/- 0.26, which is about 31% slower.
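For anyone who wants to compare numbers, here is a minimal sketch of how a "mean +/- spread" epoch time could be measured. It assumes the +/- is a standard deviation over repeated runs, which the post does not state. `time_epochs`, `run_epoch`, and `sync` are hypothetical names for illustration, not part of the tagger. On GPU, pass `torch.cuda.synchronize` as `sync`, because CUDA kernels launch asynchronously and the wall clock can otherwise stop before the work has finished.

```python
import statistics
import time


def time_epochs(run_epoch, n_runs=5, sync=None):
    """Call run_epoch() n_runs times and return (mean, stdev) in seconds.

    run_epoch: hypothetical zero-argument callable that trains one epoch.
    sync: optional zero-argument callable, e.g. torch.cuda.synchronize,
          called before starting and before stopping the clock.
    """
    if n_runs < 2:
        raise ValueError("need at least 2 runs to compute a standard deviation")
    durations = []
    for _ in range(n_runs):
        if sync is not None:
            sync()  # make sure earlier queued work is not counted
        start = time.perf_counter()
        run_epoch()
        if sync is not None:
            sync()  # wait for this epoch's work to actually finish
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)
```

Usage would look something like `mean, std = time_epochs(train_one_epoch, sync=torch.cuda.synchronize)`, where `train_one_epoch` stands in for your own training loop. Run it once under each PyTorch version.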

Has anyone else noticed anything like this?

Is this on CPU or GPU?

It is on GPU, a GeForce GTX 1080 Ti.
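In case it helps others reproduce this, here is a small sketch that prints the same environment details quoted in this thread (PyTorch, CUDA, and cuDNN versions, plus the GPU name). `describe_environment` is a hypothetical helper name. The fields are left as `None` if PyTorch is not installed or no GPU is visible.

```python
def describe_environment():
    """Collect version details worth including in a bug report."""
    info = {"torch": None, "cuda": None, "cudnn": None, "gpu": None}
    try:
        import torch
    except ImportError:
        return info  # PyTorch not installed; leave everything as None
    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # CUDA version the build was compiled against
    info["cudnn"] = torch.backends.cudnn.version()  # integer such as 7102, or None
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


print(describe_environment())
```

Running this in both conda environments makes it easy to confirm that only the PyTorch version differs between the two timings.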

Could you post an issue on the GitHub repo?


I can’t make an issue: I filled out the template, but the “submit new issue” button is greyed out.
Edit: I had forgotten to add a title. With a title, I was able to post it.
