I have a simple PyTorch model with a single dense layer followed by a ReLU.
I set the seeds so that both the starting weights and the input to the model are fixed,
as shown below.
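from collections import OrderedDict

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
net = nn.Sequential(OrderedDict({"fc1": nn.Linear(20, 2, bias=False),
                                 "relu": nn.ReLU()}))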
# obtain the initial weights set
dense_weights = np.array(net.fc1.weight.data.numpy())
print("initial_weight")
print(dense_weights)
# this is the input to be supplied to the model
# of size (20,)
np.random.seed(0)
arr = np.random.rand(20)
print("supplied input")
print(arr)
# these are the weights we would like the model to converge to;
# fixing them gives training a known target to aim for
np.random.seed(1)
rndw2 = np.random.rand(*net.fc1.weight.shape)
potential_final_weights = rndw2
# this is the label, so to speak
sample_output = np.dot(np.array(potential_final_weights), arr)
# SGD with momentum
optimizer = optim.SGD(net.parameters(), lr=1, momentum=0.9)
optimizer.zero_grad()
output = net(torch.Tensor(np.expand_dims(arr, axis=0)))
target = torch.Tensor(np.expand_dims(sample_output, axis=0))
# use MSE loss
criterion = nn.MSELoss()
loss = criterion(output, target.float())
print("loss obtained")
print(loss)
loss.backward()
optimizer.step()
# updated weights after one training step
updated_weights = np.array(net.fc1.weight.data.numpy())
print("updated_weights")
print(updated_weights)
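For reference, here is a minimal sketch of what a single update should do when worked out by hand for Linear (no bias) -> ReLU -> MSE. It rests on two assumptions: that `MSELoss` averages over the outputs, and that the first step of SGD with momentum is a plain `w -= lr * grad` because the momentum buffer starts out as the gradient. The helper `manual_step` and the toy numbers are hypothetical and are not the values from the script above; plain Python is used so the sketch runs without NumPy or PyTorch.

```python
# Hand-computed single gradient step for Linear (no bias) -> ReLU -> MSE.
# The toy numbers are made up for illustration only.

def manual_step(weights, x, target, lr=1.0):
    """Return (loss, updated_weights) after one plain gradient step."""
    # forward pass: pre-activation z_i = w_i . x, output_i = relu(z_i)
    pre = [sum(w * xj for w, xj in zip(row, x)) for row in weights]
    out = [max(z, 0.0) for z in pre]
    n = len(out)
    loss = sum((o - t) ** 2 for o, t in zip(out, target)) / n
    updated = []
    for row, z, o, t in zip(weights, pre, out, target):
        # d(loss)/d(out_i) = 2 * (out_i - t_i) / n; ReLU passes it on only if z_i > 0
        grad_out = 2.0 * (o - t) / n if z > 0 else 0.0
        updated.append([w - lr * grad_out * xj for w, xj in zip(row, x)])
    return loss, updated


toy_weights = [[-0.5, 0.1], [0.3, 0.2]]  # row 0 yields a negative pre-activation
toy_x = [1.0, 2.0]
toy_target = [1.0, 2.0]

loss, new_weights = manual_step(toy_weights, toy_x, toy_target)
print("loss:", loss)
print("updated weights:", new_weights)
```

In this toy case the unit whose pre-activation is negative is clamped to zero by the ReLU, receives no gradient, and keeps its weights, while the active unit's row moves in proportion to the input.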
When I run this script repeatedly I see two different results, and I am confused about why.
Sometimes I get this output:
initial_weight
[[-0.0016741 0.11995244 -0.18403849 -0.16456097 -0.08612314 0.0599618
-0.00443037 0.17729548 -0.01984377 0.05916917 -0.0675769 -0.04395336
-0.21362242 -0.14809078 -0.09217589 0.0082832 0.08839965 0.13416916
-0.15159222 -0.09737244]
[ 0.08121783 0.18568039 -0.04601832 0.16732758 -0.03604169 0.02366075
0.20247063 -0.2074334 -0.14076896 -0.05660947 -0.08716191 0.19319645
-0.14493737 -0.10293356 -0.15622076 -0.2094214 -0.13052833 0.19221196
0.09977746 0.10837609]]
supplied input
[0.5488135 0.71518937 0.60276338 0.54488318 0.4236548 0.64589411
0.43758721 0.891773 0.96366276 0.38344152 0.79172504 0.52889492
0.56804456 0.92559664 0.07103606 0.0871293 0.0202184 0.83261985
0.77815675 0.87001215]
loss obtained
tensor(28.8879, grad_fn=<MseLossBackward>)
updated_weights
[[ 2.6501863 3.575739 2.728507 2.4683082 1.960972 3.1809146
2.109986 4.4863324 4.636564 1.911954 3.7580292 2.5116606
2.5311623 4.3243814 0.2510695 0.42929098 0.1860947 4.1573787
3.608452 4.106516 ]
[ 3.3013914 4.382067 3.490707 3.36444 2.4497607 3.8134565
2.7700217 5.0250616 5.5135403 2.193241 4.5582995 3.2964973
3.1880748 5.3280225 0.26058462 0.30181146 -0.01189651 5.077625
4.6656265 5.2131886 ]]
But sometimes I see very different updated weights (the first row is not updated at all), even though, as you can see, the initial weights and the input are exactly the same. The loss is also different. Could someone please help me understand the reason for this difference?
initial_weight
[[-0.0016741 0.11995244 -0.18403849 -0.16456097 -0.08612314 0.0599618
-0.00443037 0.17729548 -0.01984377 0.05916917 -0.0675769 -0.04395336
-0.21362242 -0.14809078 -0.09217589 0.0082832 0.08839965 0.13416916
-0.15159222 -0.09737244]
[ 0.08121783 0.18568039 -0.04601832 0.16732758 -0.03604169 0.02366075
0.20247063 -0.2074334 -0.14076896 -0.05660947 -0.08716191 0.19319645
-0.14493737 -0.10293356 -0.15622076 -0.2094214 -0.13052833 0.19221196
0.09977746 0.10837609]]
supplied input
[0.5488135 0.71518937 0.60276338 0.54488318 0.4236548 0.64589411
0.43758721 0.891773 0.96366276 0.38344152 0.79172504 0.52889492
0.56804456 0.92559664 0.07103606 0.0871293 0.0202184 0.83261985
0.77815675 0.87001215]
loss obtained
tensor(27.1065, grad_fn=<MseLossBackward>)
updated_weights
[[-1.6741008e-03 1.1995244e-01 -1.8403849e-01 -1.6456097e-01
-8.6123139e-02 5.9961796e-02 -4.4303685e-03 1.7729548e-01
-1.9843772e-02 5.9169173e-02 -6.7576900e-02 -4.3953359e-02
-2.1362242e-01 -1.4809078e-01 -9.2175886e-02 8.2831979e-03
8.8399649e-02 1.3416916e-01 -1.5159222e-01 -9.7372442e-02]
[ 3.3013914e+00 4.3820672e+00 3.4907069e+00 3.3644400e+00
2.4497607e+00 3.8134565e+00 2.7700217e+00 5.0250616e+00
5.5135403e+00 2.1932409e+00 4.5582995e+00 3.2964973e+00
3.1880748e+00 5.3280225e+00 2.6058462e-01 3.0181146e-01
-1.1896506e-02 5.0776248e+00 4.6656265e+00 5.2131886e+00]]
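A rough consistency check on the two outputs may help narrow this down. It back-solves the per-unit error from the printed numbers. The assumptions are not verified against PyTorch internals: `MSELoss` is taken to average over the two outputs, and the first momentum step is taken to be a plain `w -= lr * grad`. All variable names are only for illustration.

```python
import math

# Numbers copied from the printed runs above (first column of each weight row).
x0 = 0.5488135                       # first input value
w_init = [-0.0016741, 0.08121783]    # initial weights, column 0
w_run1 = [2.6501863, 3.3013914]      # updated weights, column 0, first output
loss_run1_printed = 28.8879
loss_run2_printed = 27.1065
lr = 1.0

# Assumption: delta_w[i][0] = lr * (target_i - output_i) * x0 for an active unit.
err_run1 = [(after - before) / (lr * x0) for before, after in zip(w_init, w_run1)]

# Assumption: loss = mean of the squared errors over the two outputs.
loss_run1_implied = sum(e * e for e in err_run1) / 2

# In the second run row 0 did not move, so its error is recovered from the loss;
# row 1 moved exactly as in the first run, so its error is reused.
err0_run2 = math.sqrt(2 * loss_run2_printed - err_run1[1] ** 2)

# If unit 0 output exactly 0 in the second run, err0_run2 is its target value,
# and the first run's output for unit 0 follows from the same target.
out0_run1 = err0_run2 - err_run1[0]

print("implied loss, run 1:", round(loss_run1_implied, 4))
print("implied target for unit 0:", round(err0_run2, 4))
print("implied output of unit 0, run 1:", round(out0_run1, 4))
```

If the assumptions hold, the implied loss for the first run agrees with the printed one, and the implied output of unit 0 in that run comes out negative, which is not a value a ReLU output can take. The second run is consistent with that unit being clamped to zero and receiving no gradient.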