Problem in training on multi-GPU with DataParallel

The modify_resnets function in pretrainedmodels seems to break nn.DataParallel.
If I remove this line of code from the library, the model works fine; torchvision.models.resnet34 also works fine with your code.
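
For reference, here is a rough sketch of the kind of setup I mean; the constructor call just follows the pretrainedmodels README, and the batch size, input size, and use of pretrained weights are placeholders rather than my exact script:

```python
import torch
import torch.nn as nn
import torchvision
import pretrainedmodels

x = torch.randn(8, 3, 224, 224).cuda()

# Plain torchvision ResNet-34 wrapped in DataParallel: multi-GPU works.
tv_model = nn.DataParallel(torchvision.models.resnet34(pretrained=True)).cuda()
tv_out = tv_model(x)

# pretrainedmodels ResNet-34, which goes through modify_resnets and has
# its forward reassigned: this is where the multi-GPU run breaks for me.
pm_model = pretrainedmodels.__dict__['resnet34'](num_classes=1000, pretrained='imagenet')
pm_model = nn.DataParallel(pm_model).cuda()
pm_out = pm_model(x)
```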

My best guess is that the reassignment of the forward method on the model instance is what breaks it, though I'm not sure exactly why.
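
To illustrate what I mean, here is a CPU-only sketch (no pretrainedmodels needed; Net and new_forward are made-up stand-ins): when a forward is bound onto the model instance with types.MethodType, a shallow copy of the module, which is roughly what DataParallel's replicate step produces for each GPU, still carries a forward bound to the original module rather than to the replica:

```python
import copy
import types

import torch.nn as nn


class Net(nn.Module):
    """Toy stand-in for a ResNet."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        return self.fc(x)


def new_forward(self, x):
    # Stand-in for the forward that modify_resnets installs on the instance.
    return self.fc(x)


model = Net()
# Instance-level reassignment: the bound method captures *this* object.
model.forward = types.MethodType(new_forward, model)

# DataParallel builds its per-GPU replicas by (roughly) shallow-copying the
# module; copy.copy mimics that here. The copied 'forward' attribute is
# still bound to the original model, not to the replica.
replica = copy.copy(model)
print(replica.forward.__self__ is model)    # True  -> the replica calls back into the original
print(replica.forward.__self__ is replica)  # False -> its own parameters/device are never used
```

If that is what is happening, defining the replacement forward on a subclass (so that replicas look it up on the class instead of finding a stale bound method in their __dict__) should avoid the problem.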