When I pass well-formed input through a simple linear layer with normal weights and bias, the output contains huge values and some ‘nan’ entries. This only happens on Ubuntu 18 + PyTorch 1.4.0; on Win10 + PyTorch 1.4.0 or on Colab, the same linear layer works fine.
On Ubuntu:
import torch
import torch.nn as nn
model = nn.Linear(6, 8)
x = torch.randn((4, 6))
y = model(x)
print(x)
print(y)
print(model.weight)
print(model.bias)
I get:
tensor([[ 0.4134, -1.0348, -0.7237, 0.6970, 0.0487, 0.1217],
[ 1.2217, 0.2465, -0.0832, 0.5530, -0.5716, -0.4916],
[-0.3497, -0.1285, -1.3026, -0.0845, -0.5295, -0.7776],
[ 1.2792, 0.1692, -1.3023, -1.9925, -0.6869, 0.8478]])
tensor([[ 3.0334e-01, 1.9168e+05, -4.3615e+31, 3.7255e+04, -1.5095e+23,
-6.4485e+34, 1.0371e+34, nan],
[ 3.0324e-01, -7.7407e+05, 1.0390e+31, 4.2809e+03, 6.0958e+23,
-7.4157e+33, -4.1881e+34, nan],
[ 3.0343e-01, -1.2243e+06, -5.4147e+30, 6.7047e+04, 9.6410e+23,
-1.1607e+35, -6.6239e+34, nan],
[ 3.0323e-01, 1.3349e+06, 7.1316e+30, 6.7047e+04, -1.0513e+24,
-1.1605e+35, 7.2227e+34, nan]], grad_fn=<AddmmBackward>)
tensor([[-0.3934, 0.2317, -0.3121, -0.1111, -0.2597, -0.2182],
[-0.2294, 0.3308, -0.1488, -0.1515, 0.1625, 0.0586],
[-0.3984, 0.1251, 0.0899, 0.0807, 0.0110, 0.1535],
[-0.2065, -0.3367, 0.1329, 0.4042, -0.0387, -0.3794],
[ 0.2129, 0.1050, -0.2263, -0.3991, 0.3312, 0.3797],
[ 0.3508, 0.1268, -0.2395, 0.0672, 0.3916, 0.0131],
[-0.1437, -0.4070, -0.0663, -0.1292, -0.0127, -0.0040],
[ 0.0037, -0.0610, -0.0200, 0.2865, -0.0804, 0.1235]],
requires_grad=True)
tensor([ 0.3034, -0.0240, 0.1232, 0.3308, 0.1517, -0.3978, -0.2769, 0.0284],
requires_grad=True)
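As a sanity check, here is a minimal sketch that recomputes y = x @ W.T + b in plain Python, using the x, weight and bias values copied from the printout above. The values are rounded to four decimals, so the result is only approximate, and the helper name linear_reference is made up for illustration.

```python
# Values copied from the printed tensors (rounded to 4 decimals).
x = [
    [0.4134, -1.0348, -0.7237, 0.6970, 0.0487, 0.1217],
    [1.2217, 0.2465, -0.0832, 0.5530, -0.5716, -0.4916],
    [-0.3497, -0.1285, -1.3026, -0.0845, -0.5295, -0.7776],
    [1.2792, 0.1692, -1.3023, -1.9925, -0.6869, 0.8478],
]
weight = [
    [-0.3934, 0.2317, -0.3121, -0.1111, -0.2597, -0.2182],
    [-0.2294, 0.3308, -0.1488, -0.1515, 0.1625, 0.0586],
    [-0.3984, 0.1251, 0.0899, 0.0807, 0.0110, 0.1535],
    [-0.2065, -0.3367, 0.1329, 0.4042, -0.0387, -0.3794],
    [0.2129, 0.1050, -0.2263, -0.3991, 0.3312, 0.3797],
    [0.3508, 0.1268, -0.2395, 0.0672, 0.3916, 0.0131],
    [-0.1437, -0.4070, -0.0663, -0.1292, -0.0127, -0.0040],
    [0.0037, -0.0610, -0.0200, 0.2865, -0.0804, 0.1235],
]
bias = [0.3034, -0.0240, 0.1232, 0.3308, 0.1517, -0.3978, -0.2769, 0.0284]


def linear_reference(x, weight, bias):
    """Hypothetical helper: compute x @ weight.T + bias with plain floats."""
    return [
        [sum(xi * wi for xi, wi in zip(row, w_row)) + b
         for w_row, b in zip(weight, bias)]
        for row in x
    ]


y_ref = linear_reference(x, weight, bias)
for row in y_ref:
    print(["%.4f" % v for v in row])
```

With inputs and weights of this size, every entry of the reference result is finite and small in magnitude, so the values around 1e+31 and the nan column above cannot be the correct output for this input.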
When I test on Colab or Win10, y contains only ordinary finite values, like:
tensor([[0.8947, 0.7973, 0.0000, 0.4691, 0.8554, 0.4177, 1.5419, 0.0241, 0.7379,
0.0000, 1.6729, 0.0000, 0.0000, 1.1157, 0.0000, 0.7253],
[0.4217, 0.1511, 0.0000, 0.9463, 0.0000, 0.0000, 0.0000, 1.0647, 0.0000,
0.9710, 0.0000, 1.5806, 0.0000, 0.0000, 1.3374, 0.0059],
[0.0000, 0.7281, 1.6656, 0.0000, 0.0000, 0.6633, 0.0000, 0.0000, 0.0000,
0.0000, 0.0000, 0.0893, 1.3113, 0.0000, 0.5653, 0.0000],
[0.3799, 0.0000, 0.0000, 0.2625, 0.9877, 0.6430, 0.1361, 0.5216, 0.9120,
1.0127, 0.0000, 0.0000, 0.6246, 0.7964, 0.0000, 0.9006]],
grad_fn=<ReluBackward0>)
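To narrow this down, a check I could run on each machine is sketched below. It compares the output of nn.Linear with the same result computed by elementwise multiply-and-sum, which should not go through the matrix-multiply kernel. This is only a diagnostic sketch: the fixed seed and the names y_layer and y_manual are my own choices, and a mismatch would only hint that the matmul backend of that install is at fault, not prove it.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so runs on different machines are comparable
model = nn.Linear(6, 8)
x = torch.randn(4, 6)

with torch.no_grad():
    # Normal path: nn.Linear computes x @ W.T + b through a matrix multiply.
    y_layer = model(x)
    # Reference path: broadcast to shape (4, 8, 6), multiply elementwise,
    # then sum over the input dimension and add the bias.
    y_manual = (x.unsqueeze(1) * model.weight.unsqueeze(0)).sum(dim=2) + model.bias

print("torch version:", torch.__version__)
print("layer output all finite:", torch.isfinite(y_layer).all().item())
print("max abs difference:", (y_layer - y_manual).abs().max().item())
```

On a healthy install I would expect the layer output to be finite and the difference to be at the level of float32 rounding error. If the manual result is fine on Ubuntu while the layer output is not, that points at the environment rather than at the model code.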
Could someone help?