How can I initialize variables, say with Xavier initialization?
Hi @Hamid,
I think you can extract the network’s parameters with `params = list(net.parameters())` and then apply the initialisation you may like. If you need to apply the initialisation to a specific module, say `conv1`, you can extract the specific parameters with `conv1Params = list(net.conv1.parameters())`. You will have the kernels in `conv1Params[0]` and the bias terms in `conv1Params[1]`.
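For example, with a toy network (the `Net` class and layer sizes below are made up just for illustration):

```python
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3)  # hypothetical conv layer

net = Net()

params = list(net.parameters())              # every parameter of the network
conv1Params = list(net.conv1.parameters())   # parameters of the conv1 module only
kernels = conv1Params[0]                     # the convolution kernels
biases = conv1Params[1]                      # the bias terms
```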
Another possibility, which is present in the examples, can be found here. This function specifies how the weights should be handled, and the weights are modified in this line.
Another initialization example, from the PyTorch Vision resnet implementation.
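In rough outline, the pattern used there looks something like the following (a paraphrased sketch, not a verbatim copy of the torchvision code; `init_resnet_style` is just a name chosen here):

```python
import math
import torch.nn as nn

def init_resnet_style(model):
    # He-style init for convolutions, constant init for batch norm,
    # in the spirit of the torchvision resnet code referenced above
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
            m.weight.data.normal_(0, math.sqrt(2.0 / n))
        elif isinstance(m, nn.BatchNorm2d):
            m.weight.data.fill_(1)
            m.bias.data.zero_()
```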
@Atcold Can you give an example of what you mean? Thanks!
@Kalamaya, I believe @fmassa’s suggestion is a cleaner solution.
You traverse all `Module`s and, upon `__class__.__name__` matching, you initialise the parameters with what you prefer. My method presupposes that you know the order of the `Module`s in the `_modules` `OrderedDict()`.
Does what I am saying make sense? If it does not, I can try again with an example.
@Atcold Yes, an example will help here since I am still traversing unfamiliar territory… thank you for your help in advance! Much obliged. If it helps, I am basically trying to initialize the conv and fully connected layers that I have. (I’d like to do Xavier, or fan_in/fan_out, etc.) Thanks!
You first define your name-check function, which applies the initialisation selectively.
```python
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        xavier(m.weight.data)
        xavier(m.bias.data)
```
Then you traverse the whole set of `Module`s.
```python
net = Net()              # generate an instance network from the Net class
net.apply(weights_init)  # apply weight init
```
And this is it. You just need to define the `xavier()` function.
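For reference, a minimal sketch of such a function, assuming the `torch.nn.init` module is available (note that Xavier initialisation needs a tensor with at least two dimensions, so in practice you would usually zero the 1-D bias rather than pass it to `xavier()`):

```python
import torch.nn as nn

def xavier(tensor):
    # in-place Glorot/Xavier uniform initialisation
    nn.init.xavier_uniform_(tensor)
```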
A less Lua way of doing that would be to check if some module is an instance of a class. This is the recommended way:
```python
def weights_init(m):
    if isinstance(m, nn.Conv2d):
        xavier(m.weight.data)
        xavier(m.bias.data)
```
@Atcold another thing: accessing members prefixed with an underscore is not recommended. They’re internal and subject to change without notice. If you want an iterator over modules, use `.modules()` (searches recursively) or `.children()` (only one level).
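A small illustration of the difference (assuming `net` is the network from above):

```python
# .modules() yields the network itself and every sub-module, recursively
for m in net.modules():
    print('modules:', type(m).__name__)

# .children() yields only the immediate sub-modules, one level down
for m in net.children():
    print('children:', type(m).__name__)
```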
Thanks guys.
I am trying to apply weight initialization to a fully connected network (`nn.Linear`). I need, however, the fan_out and fan_in of this layer. By fan_out and fan_in I mean the number of output neurons and input neurons, respectively. How can I access them?
@Hamid, you can check the size of the `weight` matrix.
```python
size = m.weight.size()  # returns a tuple
fan_out = size[0]       # number of rows
fan_in = size[1]        # number of columns
```
@apaszke, thanks for the heads-up! I’m still new to the Python world…
Edit: applied @apaszke’s fix.
A small note - `.size()` is also defined on `Variable`s, so no need to unpack the data. `m.weight.size()` will work too.
@Hamid, are you trying to ask something? I am not sure I understand.
Could you also please format your code with three backticks and the word `python`, so that I can read what you posted?
@Atcold, by checking `if isinstance(m, nn.Linear)`, it would apply to the linear modules, correct?
If I call the weight initialization, will it be applied to all layers?
I have a residual module with 2 linear layers, and then several of these modules.
My code:

```python
import numpy as np
import torch
import torch.nn as nn


def weight_init(m):
    if isinstance(m, nn.Linear):
        size = m.weight.size()
        fan_out = size[0]  # number of rows
        fan_in = size[1]   # number of columns
        variance = np.sqrt(2.0 / (fan_in + fan_out))
        m.weight.data.normal_(0.0, variance)


class Residual(nn.Module):
    def __init__(self, dropout, shape, negative_slope, BNflag=False):
        super(Residual, self).__init__()
        self.dropout = dropout
        self.linear1 = nn.Linear(shape[0], shape[1])
        self.linear2 = nn.Linear(shape[1], shape[0])
        self.dropout = nn.Dropout(self.dropout)
        self.BNflag = BNflag
        self.batch_normlization = nn.BatchNorm1d(shape[0])
        self.leakyRelu = nn.LeakyReLU(negative_slope=negative_slope, inplace=False)

    def forward(self, X):
        x = X
        if self.BNflag:
            x = self.batch_normlization(x)
        x = self.leakyRelu(x)
        x = self.dropout(x)
        x = self.linear1(x)
        if self.BNflag:
            x = self.batch_normlization(x)
        x = self.leakyRelu(x)
        x = self.dropout(x)
        x = self.linear2(x)
        x = torch.add(x, X)
        return x


class FullyCN(nn.Module):
    def __init__(self, args):
        super(FullyCN, self).__init__()
        self.numlayers = args.sm_num_hidden_layers
        self.learning_rate = args.sm_learning_rate
        self.dropout = args.sm_dropout_prob
        self.BNflag = args.sm_bn
        self.shape = [args.sm_input_size, args.sm_num_hidden_units]
        self.res = Residual(self.dropout, self.shape, args.sm_act_param, self.BNflag)
        self.res.apply(weight_init)  # initialise the Linear layers inside the residual block
        self.res_outputs = []

    def forward(self, X):
        self.res_outputs.append(self.res(X))
        for i in range(self.numlayers):
            self.res_outputs.append(self.res(self.res_outputs[-1]))
        return self.res_outputs[-1]
```
Sorry about the confusion.
Correct.
Yup. All linear layers.
Sure, it will apply the initialisation to each `Module` that belongs to the class `nn.Linear`.
But I have called weight_init only once for the class, while I call the linear layers in a for loop (i.e., there are multiple sets of variables).
```python
net = Residual(dropout, shape, negative_slope, BNflag)  # generate an instance network from the Residual class
net.apply(weight_init)                                  # apply weight init to it and all its sub-modules
```
I’m not too sure what you’re doing with `FullyCN`…
The `apply` function will search recursively for all the modules inside your network, and will call the function on each of them. So all `Linear` layers you have in your model will be initialized using this one call.
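A quick way to convince yourself, with a toy nested model made up just for this check:

```python
import torch.nn as nn

# two Linear layers, one of them inside an inner container
model = nn.Sequential(nn.Linear(4, 8), nn.Sequential(nn.Linear(8, 2)))

visited = []
model.apply(lambda m: visited.append(type(m).__name__))
print(visited)  # both Linear layers appear, along with the two Sequential containers
```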
@Atcold, in `FullyCN` I use several residual modules. In another piece of code, I pass data to `FullyCN`, which returns the corresponding output via its forward function.