Slow training with nn.DataParallel

Thanks, I was saving some tensors in the loop.
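
For anyone landing here with the same symptom, here is a minimal sketch of what this kind of problem can look like. The post does not say which tensors were being saved or how, so the toy model, the list names (`kept_with_graph`, `kept_as_floats`), and the loop are all assumptions for illustration. It runs on CPU and leaves out `nn.DataParallel` to stay short.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup; stands in for the real model and data.
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()

kept_with_graph = []  # pattern to avoid: stores tensors still attached to autograd
kept_as_floats = []   # lighter alternative: stores plain Python numbers

for step in range(3):
    inputs = torch.randn(8, 4)
    targets = torch.randn(8, 1)

    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()

    # Appending the tensor itself keeps it (and its grad_fn) alive on its
    # device for as long as the list exists.
    kept_with_graph.append(loss)

    # .item() copies the scalar to the host as a Python float, so nothing
    # from the autograd graph is retained.
    kept_as_floats.append(loss.item())

print(kept_with_graph[0].requires_grad)  # tensor is still tracked by autograd
print(type(kept_as_floats[0]))
```

Note that `.item()` forces a host/device synchronization on CUDA, so calling it every iteration has its own cost. If you need the values as tensors, `loss.detach()` (optionally followed by `.cpu()`) is another way to drop the autograd reference.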