Loss becoming 'nan' when training my model on another system

Hi,
I have a semantic segmentation model. It trains fine on my system (on a single GPU), but when I train the same model with the same parameters on another system (on a single GPU, and on two GPUs with DataParallel), the output and the loss are NaN, even though the PyTorch and CUDA versions are the same on both systems. Can you help me? Thanks.
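
To narrow this down, a check like the one below could show which layer first produces NaN or Inf on the failing system. It is only a rough sketch: `find_first_nonfinite_module` is a hypothetical helper name (not part of PyTorch), and the small `nn.Sequential` is a stand-in for the real segmentation model, which is not shown here.

```python
import torch
import torch.nn as nn


def find_first_nonfinite_module(model, inputs):
    """Run one forward pass and return the name of the first submodule
    whose output contains NaN or Inf, or None if every output is finite."""
    bad = []      # names of submodules with non-finite outputs, in firing order
    handles = []  # hook handles, so the hooks can be removed afterwards

    def make_hook(name):
        def hook(module, args, output):
            # Only plain tensor outputs are checked in this sketch.
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                bad.append(name)
        return hook

    for name, module in model.named_modules():
        if name:  # skip the root module (its name is the empty string)
            handles.append(module.register_forward_hook(make_hook(name)))
    try:
        with torch.no_grad():
            model(inputs)
    finally:
        for handle in handles:
            handle.remove()
    return bad[0] if bad else None


# Stand-in model (assumption): replace with the actual segmentation network.
model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))
print(find_first_nonfinite_module(model, torch.ones(1, 4)))
```

Running this on one fixed batch on both systems, and also checking the batch itself with `torch.isfinite(batch).all()`, should show whether the NaN comes from the data or from a specific layer. For NaNs that only appear in the backward pass, `torch.autograd.set_detect_anomaly(True)` may help point to the operation involved.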