Training freezes several times

I have been training my network on a server with two GPUs using DataParallel. Today I ran into a freezing problem: some epochs stall and take a minute to finish, which has slowed down training significantly. I did not change anything in the code I had before. What could be the problem? I also tried reducing the batch size and training on a single GPU, but the same thing happens. Any ideas?
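
To help narrow down where the stalls come from, here is a minimal timing sketch. It is not my actual training code. It assumes a generic iterable `loader` and a hypothetical `train_step` callable, and it separates the time spent waiting for the next batch from the time spent inside the step itself. On a GPU, the step would also need to call `torch.cuda.synchronize()` before the clock is read, since CUDA kernels run asynchronously.

```python
import time


def profile_epoch(loader, train_step):
    """Return (data_wait_seconds, step_seconds) for one pass over loader.

    `loader` is any iterable of batches and `train_step` is any callable
    taking one batch; both names are placeholders for illustration.
    """
    data_wait = 0.0
    step_time = 0.0
    batches = iter(loader)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(batches)  # time blocked waiting for data
        except StopIteration:
            break
        t1 = time.perf_counter()
        train_step(batch)  # time spent in forward/backward/update
        t2 = time.perf_counter()
        data_wait += t1 - t0
        step_time += t2 - t1
    return data_wait, step_time


def slow_loader(num_batches=3, delay=0.02):
    """Toy loader that simulates slow disk or preprocessing."""
    for i in range(num_batches):
        time.sleep(delay)
        yield i


if __name__ == "__main__":
    wait, step = profile_epoch(slow_loader(), lambda batch: None)
    print(f"data wait: {wait:.3f}s, step: {step:.3f}s")
```

If the data wait dominates during the slow epochs, that would point toward data loading (disk, workers, shared server load) rather than the GPUs.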