# Initialize the weights of nn.ConvTranspose2d

How should I initialize the weights of `nn.ConvTranspose2d`? The same way as `nn.Conv2d`? Is there anything special about this in PyTorch?

Another question: does PyTorch require manual weight initialization, or do its layers initialize themselves automatically? That is, if I don't initialize the weights or biases, are they all zero or random values?

```python
for m in self.modules():
    if isinstance(m, nn.Conv2d):
        n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
        m.weight.data.normal_(0, math.sqrt(2. / n))
    elif isinstance(m, nn.BatchNorm2d):
        m.weight.data.fill_(1)
        m.bias.data.zero_()
```

From this I infer that the bias is left at its automatic (random) initialization. Is that right?
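One quick way to answer this is to construct a layer and inspect its parameters directly. A minimal sketch (in current PyTorch versions the layer's `reset_parameters()` runs in `__init__`, giving a Kaiming-uniform weight and a uniform bias):

```python
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3)
# Both weight and bias already hold values -- neither is left
# uninitialized or at zero:
print(conv.weight.abs().sum())
print(conv.bias)
```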

What I want to do is build an FCN based on a Caffe FCN model, and now I want to initialize the network as follows.
My initialization:

```python
import math
import numpy as np
import torch
import torch.nn as nn

def weights_initG(m):
    for p in m.modules():
        if isinstance(p, nn.Conv2d):
            n = p.kernel_size[0] * p.kernel_size[1] * p.out_channels
            p.weight.data.normal_(0, math.sqrt(2. / n))
        elif isinstance(p, nn.BatchNorm2d):
            p.weight.data.normal_(1.0, 0.02)
            p.bias.data.fill_(0)
        elif isinstance(p, nn.ConvTranspose2d):
            # build a 2D bilinear upsampling kernel
            n = p.kernel_size[1]
            factor = (n + 1) // 2
            if n % 2 == 1:
                center = factor - 1
            else:
                center = factor - 0.5
            og = np.ogrid[:n, :n]
            weights_np = (1 - abs(og[0] - center) / factor) * \
                         (1 - abs(og[1] - center) / factor)
            p.weight.data.copy_(torch.from_numpy(weights_np))
```
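As a sanity check, the bilinear kernel built in the `nn.ConvTranspose2d` branch can be computed standalone with NumPy (same construction as above, pulled out into a hypothetical helper for illustration):

```python
import numpy as np

def bilinear_kernel(n):
    # Separable bilinear interpolation filter of size n x n,
    # identical to the construction inside weights_initG.
    factor = (n + 1) // 2
    center = factor - 1 if n % 2 == 1 else factor - 0.5
    og = np.ogrid[:n, :n]
    return (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)

k = bilinear_kernel(4)
print(k)
# [[0.0625 0.1875 0.1875 0.0625]
#  [0.1875 0.5625 0.5625 0.1875]
#  [0.1875 0.5625 0.5625 0.1875]
#  [0.0625 0.1875 0.1875 0.0625]]
```

For a 4x4 kernel with stride 2 the entries sum to 4 (the stride squared), which is what makes the deconvolution act as intensity-preserving bilinear upsampling.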

Question one: how should I initialize the biases of the conv and deconv layers?
Question two: since PyTorch image data is in [0, 1] while Caffe image data is in [0, 255], does the weight initialization method differ from Caffe's?

Have a look at the dcgan example:

```python
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        m.weight.data.normal_(0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        m.weight.data.normal_(1.0, 0.02)
        m.bias.data.fill_(0)

netG.apply(weights_init)
```

It should work.


@chenyuntc Does PyTorch require manual weight initialization, or do its layers initialize automatically? I noticed there is a `.reset_parameters()` method in the base `_ConvNd` class, but I didn't see where this function is called.


The parameters are initialized automatically. If you want to use a specific initialization strategy, take a look at `torch.nn.init`. I'll need to add that to the docs.
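For example, the DCGAN-style initialization above can be written with the `torch.nn.init` helpers like this (a sketch; the small `Sequential` is just a placeholder network):

```python
import torch.nn as nn

def init_weights(m):
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.normal_(m.weight, 0.0, 0.02)
        if m.bias is not None:
            nn.init.constant_(m.bias, 0.0)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.normal_(m.weight, 1.0, 0.02)
        nn.init.constant_(m.bias, 0.0)

net = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16))
net.apply(init_weights)  # recursively applies init_weights to every submodule
```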


`reset_parameters()` should be called in `__init__`.

Hello, thank you very much for your help. May I ask another question: the code above uses a normal distribution to initialize the weights. If I want normally distributed values restricted to a certain range, how can that be done? Do I need to write a custom distribution function, i.e. a truncated normal distribution?

Maybe just use `clamp_`:

```python
m.weight.data.normal_(1.0, 0.02).clamp_(min=0, max=2)
```
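If the pile-up of probability mass at the bounds from clipping matters, newer PyTorch versions also ship a proper truncated normal, `torch.nn.init.trunc_normal_` (worth checking it exists in your version):

```python
import torch
import torch.nn as nn

w = torch.empty(64, 64)
# Unlike clamp_, out-of-range draws are resampled rather than
# clipped, so no values accumulate exactly at the bounds:
nn.init.trunc_normal_(w, mean=1.0, std=0.02, a=0.0, b=2.0)
```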

OK, thank you! By the way, if I want to clamp values into [-1, -0.1] ∪ [0.1, 1], how do I do that?

```python
a.clamp_(min=-1, max=1)
a[a.abs() < 0.1] = torch.sign(a[a.abs() < 0.1]) * 0.1
```
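A quick check of that recipe on a toy tensor (one caveat: exact zeros stay at 0, since `sign(0) == 0`):

```python
import torch

a = torch.tensor([-1.5, -0.05, 0.03, 0.5, 2.0])
a.clamp_(min=-1, max=1)                # -> [-1.0, -0.05, 0.03, 0.5, 1.0]
small = a.abs() < 0.1
a[small] = torch.sign(a[small]) * 0.1  # push small magnitudes out to +/-0.1
print(a)  # tensor([-1.0000, -0.1000,  0.1000,  0.5000,  1.0000])
```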

Can anyone please explain how exactly this snippet works?
I'm not able to understand how the `m.weight.data.normal_` line works.
Also, if I set `bias=False` when defining my network, is it still necessary to call `m.bias.data.fill_(0)` explicitly?

`.normal_` fills the tensor in-place with values drawn from a normal distribution with the specified mean and standard deviation (docs).

If you set `bias=False`, you don’t have to and in fact cannot call `m.bias.data`, since `bias` will be `None`.
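To see it concretely:

```python
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3, bias=False)
print(conv.bias)  # None -- there is no bias parameter to initialize
```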

Note that I would recommend wrapping the parameter manipulations in a `torch.no_grad()` block and avoiding direct use of the `.data` attribute:

```python
lin = nn.Linear(10, 10, bias=False)

with torch.no_grad():
    lin.weight.normal_(0.0, 0.02)
```

`torch.no_grad` will disable gradient calculation, so that all operations in this block won't be tracked by Autograd.
Using the underlying `.data` attribute will most likely work in this case.