How can I initialize variables, say with Xavier initialization?
I think you can extract the network's parameters with
`params = list(net.parameters())` and then apply whatever initialisation you like.
If you need to apply the initialisation to a specific module, say
`conv1`, you can extract the specific parameters with
`conv1Params = list(net.conv1.parameters())`. You will have the kernels in
`conv1Params[0]` and the bias terms in `conv1Params[1]`.
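As a sketch (the `Net` below is a made-up example, not anyone's actual model), the extraction looks like this:

```python
import torch.nn as nn

# A made-up network with a single conv layer, just for illustration
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)  # 1 input channel, 6 output channels, 5x5 kernels

net = Net()
conv1Params = list(net.conv1.parameters())
kernels = conv1Params[0]  # shape: (6, 1, 5, 5)
biases = conv1Params[1]   # shape: (6,)
```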
Another initialization example, from the PyTorch Vision ResNet implementation.
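For reference, the pattern used there is roughly the following (a paraphrased sketch in a toy model, not the exact torchvision source): in `__init__` you walk `self.modules()` and He-initialise the convolutions.

```python
import math
import torch.nn as nn

class MiniResNetLike(nn.Module):
    """A made-up two-layer model, only to demonstrate the init loop."""
    def __init__(self):
        super(MiniResNetLike, self).__init__()
        self.conv = nn.Conv2d(3, 16, 3)
        self.bn = nn.BatchNorm2d(16)
        # He initialisation, in the style of torchvision's ResNet
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

net = MiniResNetLike()
```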
@Atcold Can you give an example of what you mean? Thanks!
@Kalamaya, I believe @fmassa's is a cleaner solution.
You traverse all
`Module`s and, upon
`__class__.__name__` matching, you initialise the parameters however you prefer.
My method presupposes that you know the order of the
`Module`s in the network definition.
Does what I am saying make any sense? If it does not, I can try to explain better with an example.
@Atcold Yes, an example will help here since I am still traversing unfamiliar territory… thank you for your help in advance! Much obliged. If it helps, I am basically trying to initialize the conv and fully connected layers that I have. (I’d like to do Xavier, or fan_in/fan_out, etc.) Thanks!
You first define your name-check function, which applies the initialisation selectively.
```python
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        xavier(m.weight.data)
        xavier(m.bias.data)
```
Then you traverse the whole set of `Module`s:
```python
net = Net()              # generate an instance network from the Net class
net.apply(weights_init)  # apply weight init
```
And this is it. You just need to define the `xavier()` function itself.
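Putting it together, here is a minimal runnable sketch. Since `xavier()` was not defined above, I assume here that it wraps `nn.init.xavier_uniform_`; I also zero the biases instead of Xavier-initialising them, because Xavier init needs a tensor with at least 2 dimensions.

```python
import torch.nn as nn

def xavier(param):
    nn.init.xavier_uniform_(param)  # one possible definition of xavier()

def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        xavier(m.weight.data)
        m.bias.data.zero_()  # biases are 1-D, so just zero them

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)

net = Net()              # generate an instance network from the Net class
net.apply(weights_init)  # apply weight init to every matching module
```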
A less Lua-ish way of doing that is to check whether a module is an instance of a given class. This is the recommended way:
```python
def weights_init(m):
    if isinstance(m, nn.Conv2d):
        xavier(m.weight.data)
        xavier(m.bias.data)
```
@Atcold, another thing: accessing members prefixed with an underscore is not recommended. They’re internal and subject to change without notice. If you want an iterator over modules, use
`.modules()` (searches recursively) or
`.children()` (only one level).
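The difference is easy to see on a nested model (a toy `Sequential`, assumed here only for illustration):

```python
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(4, 8),
    nn.Sequential(nn.Linear(8, 8), nn.ReLU()),  # a nested container
)

children = list(net.children())  # one level down: Linear, Sequential
modules = list(net.modules())    # recursive: net itself plus every submodule
print(len(children))  # 2
print(len(modules))   # 5
```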
I am trying to apply weight initialization to a fully connected network (nn.Linear). I need, however, the fan_out and fan_in of this layer. By fan_out and fan_in I mean the number of output neurons and input neurons, respectively. How can I access them?
@Hamid, you can check the size of the weight matrix:

```python
size = m.weight.size()  # returns a tuple
fan_out = size[0]       # number of rows
fan_in = size[1]        # number of columns
```
@apaszke, thanks for the heads-up! I’m still new to the Python world…
Edit: applied @apaszke’s fix.
A small note:
`.size()` is also defined on `Variable`s, so there is no need to unpack the data;
`m.weight.size()` will work too.
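Concretely, for an `nn.Linear` the weight matrix is stored as `(out_features, in_features)`, so for a layer with 3 inputs and 5 outputs:

```python
import torch.nn as nn

m = nn.Linear(3, 5)      # 3 input neurons, 5 output neurons
size = m.weight.size()   # torch.Size([5, 3])
fan_out = size[0]        # 5: number of rows = output neurons
fan_in = size[1]         # 3: number of columns = input neurons
```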
@Hamid, are you trying to ask something? I am not sure I understand.
Could you also please format your code with three backticks and the word `python`, so that I can read what you posted?
@Atcold, by checking `isinstance(m, nn.Linear)`, it would apply to linear modules, correct?
If I call the weight initialization once, will it be applied to all layers?
I have residual modules, each with 2 linear layers, and then several of these modules.
```python
def weight_init(m):
    if isinstance(m, nn.Linear):
        size = m.weight.size()
        fan_out = size[0]  # number of rows
        fan_in = size[1]   # number of columns
        variance = np.sqrt(2.0 / (fan_in + fan_out))
        m.weight.data.normal_(0.0, variance)

class Residual(nn.Module):
    def __init__(self, dropout, shape, negative_slope, BNflag=False):
        super(Residual, self).__init__()
        self.dropout = dropout
        self.linear1 = nn.Linear(shape, shape)
        self.linear2 = nn.Linear(shape, shape)
        self.dropout = nn.Dropout(self.dropout)
        self.BNflag = BNflag
        self.batch_normlization = nn.BatchNorm1d(shape)
        self.leakyRelu = nn.LeakyReLU(negative_slope=negative_slope, inplace=False)

    def forward(self, X):
        x = X
        if self.BNflag:
            x = self.batch_normlization(x)
        x = self.leakyRelu(x)
        x = self.dropout(x)
        x = self.linear1(x)
        if self.BNflag:
            x = self.batch_normlization(x)
        x = self.leakyRelu(x)
        x = self.dropout(x)
        x = self.linear2(x)
        x = torch.add(x, X)
        return x

class FullyCN(nn.Module):
    def __init__(self, args):
        super(FullyCN, self).__init__()
        self.numlayers = args.sm_num_hidden_layers
        self.learning_rate = args.sm_learning_rate
        self.dropout = args.sm_dropout_prob
        self.BNflag = args.sm_bn
        self.shape = [args.sm_input_size, args.sm_num_hidden_units]
        self.res = Residual(self.dropout, self.shape, args.sm_act_param, self.BNflag)
        self.res(weight_init)
        self.res_outputs = []

    def forward(self, X):
        self.res_outputs.append(self.res(X))
        for i in range(self.numlayers):
            self.res_outputs.append(self.res(self.res_outputs[-1]))
        return self.res_outputs[-1]
```
Sorry about the confusion.
Yup. All linear layers.
Sure, it will apply the initialisation to each
`Module` that is an instance of `nn.Linear`.
But I have called `weight_init` once for the class, while I call the linear layers in a for loop (i.e., there are multiple sets of variables).
```python
net = Residual()         # generate an instance network from the Residual class
net.apply(weights_init)  # apply weight init
```
I’m not too sure what you’re doing with `self.res(weight_init)`. The
`apply` function will search recursively for all the modules inside your network, and will call the function on each of them. So all
`Linear` layers you have in your model will be initialized using this one call.
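A quick way to convince yourself (toy classes, with a dummy constant init so the effect is visible):

```python
import torch.nn as nn

def weight_init(m):
    if isinstance(m, nn.Linear):
        m.weight.data.fill_(0.5)  # dummy constant init, just to make the effect visible

class Block(nn.Module):
    def __init__(self):
        super(Block, self).__init__()
        self.linear1 = nn.Linear(4, 4)
        self.linear2 = nn.Linear(4, 4)

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.block1 = Block()
        self.block2 = Block()

net = Model()
net.apply(weight_init)  # recurses into both blocks: all four Linear layers get initialised
```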
@Atcold, in FullyCN I use several residual modules. In another piece of code, I pass data to FullyCN, which returns the corresponding output via its forward function.