Network forward output is NaN (no backward pass involved)

I'm running a Wide-ResNet with inputs normalized by subtracting the mean and dividing by the std.
I checked the input and the model parameters, and they all seem normal.
When I run on GPU:0 everything is fine, but on GPU:1 the output is NaN.
I use "CUDA_VISIBLE_DEVICES=0" to switch the GPU.
So what's wrong here?
I'm using CUDA 10.0, PyTorch 1.2.0, with cudnn enabled.
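Checking the input and parameters for NaN/Inf can be scripted with a small helper. A minimal sketch; the nn.Linear model here is just a stand-in for the actual Wide-ResNet:

```python
import torch
import torch.nn as nn

def check_finite(name, tensor):
    # Report whether a tensor contains any NaN or Inf entries.
    n_nan = torch.isnan(tensor).sum().item()
    n_inf = torch.isinf(tensor).sum().item()
    if n_nan or n_inf:
        print(f"{name}: {n_nan} NaNs, {n_inf} Infs")
    return n_nan == 0 and n_inf == 0

model = nn.Linear(10, 10)  # stand-in for the Wide-ResNet
x = torch.randn(4, 10)     # stand-in for the normalized input batch

assert check_finite("input", x)
for pname, p in model.named_parameters():
    assert check_finite(pname, p)
```

If all of these checks pass but the output is still NaN, the problem is introduced somewhere inside the forward pass itself.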

Are you getting the NaN output immediately after the first forward pass?
Could you try to run the code with anomaly detection and check which layer creates the NaN?
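Anomaly mode can be enabled with the torch.autograd.detect_anomaly context manager. A minimal sketch using torch.sqrt on a negative value to force a NaN; note this only raises during backward, so it points at the op rather than catching a pure forward-only NaN:

```python
import torch

caught = None
# In anomaly mode, autograd raises a RuntimeError at the first backward
# op that returns NaN, with a traceback pointing at the forward call.
with torch.autograd.detect_anomaly():
    x = torch.tensor([-1.0, 4.0], requires_grad=True)
    y = torch.sqrt(x)          # sqrt(-1) produces NaN in the forward pass
    try:
        y.sum().backward()     # the NaN gradient triggers the error here
    except RuntimeError as err:
        caught = err

print("anomaly detected:", caught is not None)
```

Anomaly mode slows training down noticeably, so it is meant for debugging runs only.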

Thanks for your reply. How do I run the code with anomaly detection?

I have checked the model: the problem first appears after a conv layer. After that conv layer, the output value is -inf.
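Forward hooks can pin down the first layer whose output goes non-finite without touching backward at all. A sketch with a stand-in nn.Sequential model; an Inf-filled input is used here just to show the hooks firing:

```python
import torch
import torch.nn as nn

flagged = []

def add_nan_hooks(model):
    # Register a forward hook on every submodule that records any module
    # whose output contains NaN or Inf during the forward pass.
    def hook(module, inputs, output):
        if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
            flagged.append(module.__class__.__name__)
    for m in model.modules():
        m.register_forward_hook(hook)

# Stand-in model; the real Wide-ResNet would be instrumented the same way.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
add_nan_hooks(model)
bad_input = torch.full((1, 3, 8, 8), float("inf"))
_ = model(bad_input)
print("non-finite outputs seen in:", flagged)
```

The first name printed is the earliest layer producing NaN/Inf; every later module usually gets flagged too, since the bad values propagate.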

Is this issue reproducible? Did you check the input and parameters for NaN values?
If so, could you store the activation and the layer's state_dict() so that we can have a look?
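Storing the activation and the layer state for sharing can be done with torch.save. A sketch with hypothetical names — suspect_conv stands in for the conv layer after which the -inf appears, and activation for its output on the failing input:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the real suspect layer and failing input.
suspect_conv = nn.Conv2d(3, 8, 3)
x = torch.randn(1, 3, 8, 8)
activation = suspect_conv(x).detach()

# Save both so they can be attached to the post and inspected offline.
torch.save(activation, "activation.pt")
torch.save(suspect_conv.state_dict(), "conv_state.pt")

# Reloading for inspection:
loaded_act = torch.load("activation.pt")
loaded_state = torch.load("conv_state.pt")
```

The state_dict holds the layer's weight and bias tensors, so whoever inspects the files can rerun the exact conv on the saved input and reproduce the -inf.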