I’m trying to understand what the intermediate vectors in network.parameters() represent.
For background, see this post: Explicitly obtain gradients?
Here’s a simple experiment I tried, with one training example and a single update step:
ReLU activations, learning rate: 1e-4
features
Variable containing:
278 640 400
[torch.FloatTensor of size 1x3]
label
Variable containing:
1318
[torch.FloatTensor of size 1]
output before update
Variable containing:
117.6748
[torch.FloatTensor of size 1x1]
FF (forward pass): list(self.network.parameters()) before the update
[Parameter containing:
0.2804 0.1939 0.2529
-0.4079 -0.4753 -0.1954
[torch.FloatTensor of size 2x3]
, Parameter containing:
0.3318
0.237
[torch.FloatTensor of size 2]
, Parameter containing:
-0.3810 -0.5609
0.5520 -0.3808
[torch.FloatTensor of size 2x2]
, Parameter containing:
0.1848
-0.4349
[torch.FloatTensor of size 2]
, Parameter containing:
-0.2611 0.7017
[torch.FloatTensor of size 1x2]
, Parameter containing:
0.3922
[torch.FloatTensor of size 1]
]
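One way to read this list is as weight/bias pairs, one pair per layer. The sketch below assumes the network is three fully connected layers (3 -> 2 -> 2 -> 1) with ReLU after the first two. That layout is inferred from the printed shapes and is not stated in the post. The helper names (linear, relu, W1, b1, and so on) are made up for illustration. The numbers are copied from the printout, which is rounded to 4 digits, so the result should only agree with the posted output approximately.

```python
# Parameters copied from the printout above (rounded to 4 digits there).
# Assumed layout: Linear(3, 2) -> ReLU -> Linear(2, 2) -> ReLU -> Linear(2, 1).
W1 = [[0.2804, 0.1939, 0.2529], [-0.4079, -0.4753, -0.1954]]  # 2x3 weight
b1 = [0.3318, 0.237]                                          # size-2 bias
W2 = [[-0.3810, -0.5609], [0.5520, -0.3808]]                  # 2x2 weight
b2 = [0.1848, -0.4349]                                        # size-2 bias
W3 = [[-0.2611, 0.7017]]                                      # 1x2 weight
b3 = [0.3922]                                                 # size-1 bias


def linear(W, b, v):
    """Compute W @ v + b for a list-of-rows matrix W."""
    return [sum(w * vi for w, vi in zip(row, v)) + bi for row, bi in zip(W, b)]


def relu(v):
    """Elementwise max(0, t)."""
    return [max(0.0, t) for t in v]


x = [278.0, 640.0, 400.0]      # the posted features
h1 = relu(linear(W1, b1, x))   # first hidden layer
h2 = relu(linear(W2, b2, h1))  # second hidden layer
out = linear(W3, b3, h2)[0]    # scalar network output

print("h1 =", h1)
print("h2 =", h2)
print("out =", out)
```

Under these assumptions the computed output lands within rounding distance of the posted 117.6748, and one unit in each hidden layer is switched off by the ReLU.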
BP (backward pass): [x.grad for x in list(self.network.parameters())] after cost.backward()
[Variable containing:
-2.5851e+05 -5.9512e+05 -3.7195e+05
0.0000e+00 0.0000e+00 0.0000e+00
[torch.FloatTensor of size 2x3]
, Variable containing:
-929.8806
0.0000
[torch.FloatTensor of size 2]
, Variable containing:
0.0000e+00 0.0000e+00
-5.1139e+05 -0.0000e+00
[torch.FloatTensor of size 2x2]
, Variable containing:
0.0000
-1684.5527
[torch.FloatTensor of size 2]
, Variable containing:
0.0000e+00 -4.0124e+05
[torch.FloatTensor of size 1x2]
, Variable containing:
-2400.6504
[torch.FloatTensor of size 1]
]
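These gradients can be reproduced by hand, which also suggests why whole rows are zero. The sketch below assumes a 3 -> 2 -> 2 -> 1 fully connected network with ReLU and a squared-error loss, (out - label)^2. Neither is stated in the post, but the posted numbers are consistent with both (for example, 2 * (117.6748 - 1318) = -2400.6504, the last bias gradient). All helper names are hypothetical, and the inputs are the rounded printed values, so agreement is approximate.

```python
# Numbers copied from the printouts above. The layer layout and the
# squared-error loss are assumptions inferred from the shapes and values.
W1 = [[0.2804, 0.1939, 0.2529], [-0.4079, -0.4753, -0.1954]]
b1 = [0.3318, 0.237]
W2 = [[-0.3810, -0.5609], [0.5520, -0.3808]]
b2 = [0.1848, -0.4349]
W3 = [[-0.2611, 0.7017]]
b3 = [0.3922]
x, y = [278.0, 640.0, 400.0], 1318.0


def linear(W, b, v):
    return [sum(w * vi for w, vi in zip(row, v)) + bi for row, bi in zip(W, b)]


def relu(v):
    return [max(0.0, t) for t in v]


def relu_gate(upstream, activation):
    # Pass the gradient through only where the ReLU output was positive.
    return [u if a > 0 else 0.0 for u, a in zip(upstream, activation)]


def outer(d, v):
    # Weight gradient of a linear layer: d (output grad) times v (layer input).
    return [[di * vj for vj in v] for di in d]


def transpose_matvec(W, d):
    # Gradient with respect to the layer input: W^T d.
    return [sum(W[i][j] * d[i] for i in range(len(d))) for j in range(len(W[0]))]


# Forward pass
h1 = relu(linear(W1, b1, x))
h2 = relu(linear(W2, b2, h1))
out = linear(W3, b3, h2)[0]

# Backward pass for loss = (out - y) ** 2
d3 = [2.0 * (out - y)]
grad_W3, grad_b3 = outer(d3, h2), d3
d2 = relu_gate(transpose_matvec(W3, d3), h2)
grad_W2, grad_b2 = outer(d2, h1), d2
d1 = relu_gate(transpose_matvec(W2, d2), h1)
grad_W1, grad_b1 = outer(d1, x), d1

for name, g in [("W1", grad_W1), ("b1", grad_b1), ("W2", grad_W2),
                ("b2", grad_b2), ("W3", grad_W3), ("b3", grad_b3)]:
    print(name, g)
```

In this reading, a unit whose ReLU output is zero receives no gradient, so its bias gradient and its whole row of weight gradients are zero, and any weight that multiplies a zero activation from the previous layer also gets a zero gradient.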
GD (gradient step): list(self.network.parameters()) after optimizer.step()
[Parameter containing:
0.2805 0.1940 0.2530
-0.4079 -0.4753 -0.1954
[torch.FloatTensor of size 2x3]
, Parameter containing:
0.3319
0.2374
[torch.FloatTensor of size 2]
, Parameter containing:
-0.3810 -0.5609
0.5521 -0.3808
[torch.FloatTensor of size 2x2]
, Parameter containing:
0.1848
-0.4348
[torch.FloatTensor of size 2]
, Parameter containing:
-0.2611 0.7018
[torch.FloatTensor of size 1x2]
, Parameter containing:
0.3923
[torch.FloatTensor of size 1]
]
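Every parameter with a nonzero gradient moved by about 1e-4, regardless of how large its gradient was, and parameters with zero gradient did not move. Plain SGD with lr = 1e-4 would instead have shifted the first weight by roughly 25.85. The post does not say which optimizer was used. The sketch below only illustrates that the first step of an Adam-style update (with commonly used default hyperparameters, an assumption here) would produce changes of this size. The function names are hypothetical.

```python
import math


def sgd_step(w, g, lr=1e-4):
    """Plain gradient descent: the step scales with the gradient magnitude."""
    return w - lr * g


def adam_first_step(w, g, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """First Adam update (t = 1), starting from zero moment estimates."""
    m = (1 - beta1) * g          # first moment
    v = (1 - beta2) * g * g      # second moment
    m_hat = m / (1 - beta1)      # bias correction at t = 1 gives back g
    v_hat = v / (1 - beta2)      # bias correction at t = 1 gives back g**2
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)


w, g = 0.2804, -2.5851e5         # first weight and its posted gradient
print("SGD :", sgd_step(w, g))
print("Adam:", adam_first_step(w, g))
print("Adam, zero grad:", adam_first_step(-0.4079, 0.0))
```

On the first step the Adam-style update reduces to roughly lr * sign(g), which is consistent with 0.2804 becoming 0.2805 while the zero-gradient entries stay put.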
output after 1 update
Variable containing:
117.7641
[torch.FloatTensor of size 1x1]
What do the intermediate parameters represent?
For example, this one:
Parameter containing:
0.3318
0.237
[torch.FloatTensor of size 2]
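A quick way to label each entry is named_parameters(), which yields a name alongside every tensor. The sketch below is written against a current PyTorch release rather than the older Variable-based version in the printouts. The nn.Sequential model is a stand-in whose 3 -> 2 -> 2 -> 1 layout is assumed from the printed shapes, since the actual network definition is not shown.

```python
import torch.nn as nn

# Hypothetical stand-in for self.network; the layout is inferred from the
# printed shapes (2x3, 2, 2x2, 2, 1x2, 1), not taken from the original code.
net = nn.Sequential(
    nn.Linear(3, 2), nn.ReLU(),
    nn.Linear(2, 2), nn.ReLU(),
    nn.Linear(2, 1),
)

# parameters() yields tensors in the same order as named_parameters(),
# so the names line up one-to-one with the entries of the printed list.
for name, p in net.named_parameters():
    print(name, tuple(p.shape))
```

If the real network matches this layout, the size-2 vector in question sits in the second position, which is the bias of the first linear layer. nn.Linear stores its weight as out_features x in_features, which is why the first entry is 2x3 for a layer mapping 3 inputs to 2 outputs.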