Are those bias vectors?

chenjus · July 2, 2017, 4:48pm

I’m trying to understand the intermediate vectors returned by network.parameters().
This follows on from the post: Explicitly obtain gradients?
Here’s a simple experiment I tried:

ReLU activations, learning rate: 1e-4

features
Variable containing:
 278  640  400
[torch.FloatTensor of size 1x3]


label
Variable containing:
 1318
[torch.FloatTensor of size 1]


output before update
Variable containing:
 117.6748
[torch.FloatTensor of size 1x1]


FF: list(self.network.parameters())
[Parameter containing:
 0.2804  0.1939  0.2529
-0.4079 -0.4753 -0.1954
[torch.FloatTensor of size 2x3]
, Parameter containing:
 0.3318
 0.237
[torch.FloatTensor of size 2]
, Parameter containing:
-0.3810 -0.5609
 0.5520 -0.3808
[torch.FloatTensor of size 2x2]
, Parameter containing:
 0.1848
-0.4349
[torch.FloatTensor of size 2]
, Parameter containing:
-0.2611  0.7017
[torch.FloatTensor of size 1x2]
, Parameter containing:
 0.3922
[torch.FloatTensor of size 1]
]

BP: [x.grad for x in list(self.network.parameters())] after cost.backward()
[Variable containing:
-2.5851e+05 -5.9512e+05 -3.7195e+05
 0.0000e+00  0.0000e+00  0.0000e+00
[torch.FloatTensor of size 2x3]
, Variable containing:
-929.8806
   0.0000
[torch.FloatTensor of size 2]
, Variable containing:
 0.0000e+00  0.0000e+00
-5.1139e+05 -0.0000e+00
[torch.FloatTensor of size 2x2]
, Variable containing:
    0.0000
-1684.5527
[torch.FloatTensor of size 2]
, Variable containing:
 0.0000e+00 -4.0124e+05
[torch.FloatTensor of size 1x2]
, Variable containing:
-2400.6504
[torch.FloatTensor of size 1]
]

GD: list(self.network.parameters()) after optimizer.step()
[Parameter containing:
 0.2805  0.1940  0.2530
-0.4079 -0.4753 -0.1954
[torch.FloatTensor of size 2x3]
, Parameter containing:
 0.3319
 0.2374
[torch.FloatTensor of size 2]
, Parameter containing:
-0.3810 -0.5609
 0.5521 -0.3808
[torch.FloatTensor of size 2x2]
, Parameter containing:
 0.1848
-0.4348
[torch.FloatTensor of size 2]
, Parameter containing:
-0.2611  0.7018
[torch.FloatTensor of size 1x2]
, Parameter containing:
 0.3923
[torch.FloatTensor of size 1]
]

output after 1 update
Variable containing:
 117.7641
[torch.FloatTensor of size 1x1]

What do the intermediate parameters (the size-2 vectors that sit between the weight matrices) represent?
e.g.
Parameter containing:
 0.3318
 0.237
[torch.FloatTensor of size 2]

smth · July 3, 2017, 2:29am

I think the question rests on a misunderstanding: net.parameters() returns all the weight and bias parameters of your network, not intermediate activations. I hope this helps; if not, let me know.
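
A quick way to see which entry is which is to list the parameters by name. The sketch below is only an illustration: `network` is a hypothetical stand-in for `self.network`, and its 3 -> 2 -> 2 -> 1 layout is an assumption inferred from the printed shapes (2x3, 2, 2x2, 2, 1x2, 1), not something confirmed in the thread.

```python
import torch.nn as nn

# Hypothetical stand-in for self.network; the layer sizes are inferred
# from the shapes printed in the question, not taken from the real model.
network = nn.Sequential(
    nn.Linear(3, 2), nn.ReLU(),
    nn.Linear(2, 2), nn.ReLU(),
    nn.Linear(2, 1),
)

# named_parameters() yields the same tensors as parameters(), in the same
# order, but pairs each one with a name that says which layer it belongs to.
for name, param in network.named_parameters():
    print(name, tuple(param.shape))
```

With this layout each `nn.Linear` contributes a `weight` of shape (out_features, in_features) followed by a `bias` of shape (out_features,), so the list alternates matrix, vector, matrix, vector, as in the printout above. ReLU layers have no parameters and do not appear.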

chenjus · July 3, 2017, 3:17am

OK, so does [x.grad for x in list(self.network.parameters())] not give me all the gradients in the network?
Oh… are those the bias vectors that follow the weight matrices?
e.g.
list(self.network.parameters())

[Parameter containing:
 0.2804  0.1939  0.2529
-0.4079 -0.4753 -0.1954
[torch.FloatTensor of size 2x3]
, Parameter containing:
 0.3318
 0.237
[torch.FloatTensor of size 2]

smth · July 3, 2017, 3:40am

yep, looks like bias vectors.
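
One way to check this is to plug the printed numbers back into a small network and see whether the forward pass and gradients come out the same. The sketch below is a reconstruction under assumptions, not the original code: the 3 -> 2 -> 2 -> 1 layout is inferred from the printed shapes, and the cost is assumed to be plain squared error, which is consistent with the printed last-layer bias gradient, since 2 * (117.6748 - 1318) = -2400.6504.

```python
import torch
import torch.nn as nn

# Hypothetical reconstruction; layer sizes are inferred from the printed shapes.
network = nn.Sequential(
    nn.Linear(3, 2), nn.ReLU(),
    nn.Linear(2, 2), nn.ReLU(),
    nn.Linear(2, 1),
)

# Values copied from the "FF" printout (already rounded to 4 decimals there).
printed = [
    [[0.2804, 0.1939, 0.2529], [-0.4079, -0.4753, -0.1954]],  # layer 1 weight
    [0.3318, 0.237],                                          # layer 1 bias
    [[-0.3810, -0.5609], [0.5520, -0.3808]],                  # layer 2 weight
    [0.1848, -0.4349],                                        # layer 2 bias
    [[-0.2611, 0.7017]],                                      # layer 3 weight
    [0.3922],                                                 # layer 3 bias
]
with torch.no_grad():
    for param, values in zip(network.parameters(), printed):
        param.copy_(torch.tensor(values))

features = torch.tensor([[278.0, 640.0, 400.0]])
label = torch.tensor([[1318.0]])

output = network(features)
# Assumption: squared-error cost; the thread does not show the actual cost.
cost = ((output - label) ** 2).sum()
cost.backward()

print(output.item())
for name, param in network.named_parameters():
    print(name, param.grad)
```

Treating the size-2 and size-1 vectors as biases, the forward pass lands at roughly 117.66, close to the printed 117.6748 (the small gap comes from the rounded inputs). The gradients also follow the printed pattern: every parameter gets a `.grad` of its own shape, and the entries that feed a ReLU unit whose output is zero are exactly zero.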

chenjus · July 3, 2017, 12:15pm

Haha sorry for the dumb thread. It’s been a long week… Thanks again. XD