Hi,
I used torch.nn.Module.apply() to initialize the weights and biases of my nn.Sequential() model. The code is shown below:
import numpy as np
import torch
import torch.nn as nn

# device and dtype are defined elsewhere in my script.

def random_weight(shape):
    # He initialization: scale a standard normal by sqrt(2 / fan_in).
    if len(shape) == 2:  # FC weight: nn.Linear stores [out_features, in_features]
        fan_in = shape[1]
    else:
        fan_in = np.prod(shape[1:])  # conv weight: [out_channel, in_channel, kH, kW]
    w = torch.randn(shape, device=device, dtype=dtype) * np.sqrt(2. / fan_in)
    w.requires_grad = True
    return w
def init_weights(m):
    if isinstance(m, (nn.Linear, nn.Conv2d)):
        m.weight.data = random_weight(m.weight.data.size())

model.apply(init_weights)
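(For context, random_weight is meant to be He initialization; I believe the built-in, in-place equivalent is torch.nn.init.kaiming_normal_, though I haven't switched to it yet:)

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 2)
# Should match the sqrt(2 / fan_in) scaling above, if I read the docs right:
# with nonlinearity='relu' the gain is sqrt(2), and mode='fan_in' uses in_features.
nn.init.kaiming_normal_(layer.weight, mode='fan_in', nonlinearity='relu')
```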
The problem is that the params are not updated if I assign the new values like this: m.weight.data = random_weight(m.weight.data.size()). My guess is that the weights get reassigned on every iteration, but I'm not sure.
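For reference, this is the kind of check I'm using to tell whether a parameter actually moves after one optimizer step (a toy nn.Linear here, not my real model):

```python
import torch
from torch import nn

# Toy setup just to illustrate the check; my real model is an nn.Sequential.
model = nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

snapshot = model.weight.detach().clone()  # copy of the weights before the step

loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
opt.step()

# With a nonzero gradient and lr > 0, the weight should differ from the snapshot.
changed = not torch.equal(model.weight.detach(), snapshot)
print("weight updated:", changed)
```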
So I found a workaround:
def init_weights(m):
    if isinstance(m, (nn.Linear, nn.Conv2d)):
        m.weight.data.copy_(random_weight(m.weight.data.size()))

model.apply(init_weights)
Simply replacing the assignment with an in-place copy_() call, and the params start to get updated.
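To convince myself what copy_ does differently, I ran this small experiment on a bare Parameter (toy tensors, not my actual model): copy_ writes into the existing storage, while plain assignment rebinds .data to a brand-new tensor.

```python
import torch
from torch import nn

p = nn.Parameter(torch.zeros(3))

# In-place copy: same tensor, same underlying storage, values overwritten.
ptr_before = p.data.data_ptr()
p.data.copy_(torch.randn(3))
assert p.data.data_ptr() == ptr_before

# Plain assignment: .data now points at a brand-new tensor.
old = p.data  # keep a reference so the old storage isn't recycled
p.data = torch.randn(3)
assert p.data.data_ptr() != old.data_ptr()
```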
Could anyone help explain this? Thank you.