How to fix/define the initialization weights/seed

How can I fix the weight initialization or define my own initialization for the weights?
I read Weight initialization, but I'm still not sure how to do it.
Let's say I implemented this simple network: https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

How can I define the weights to be whatever I like at the start?

Thanks


You have to create the init function and apply it to the model:

def weights_init(m):
    if isinstance(m, nn.Conv2d):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)  # bias is 1-D, so use a constant init instead of Xavier
    

model = MyModel()
model.apply(weights_init)

@ptrblck Thank you!!! Makes sense.

Just a follow up question.

Let's say your network is like this:


class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

and you define the weight init like this:

def weights_init(m):
    if isinstance(m, nn.Conv2d):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)

Am I putting it in the right place in the following?


class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def weights_init(m):
        if isinstance(m, nn.Conv2d):
            nn.init.xavier_uniform_(m.weight)
            nn.init.zeros_(m.bias)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

Or do you suggest putting it somewhere else?
I know it is kind of a stupid question, but the reason I'm asking is that I was wondering whether there is a way to put it inside __init__(self). I'm not sure if that is possible, and also not sure if it is even a good idea.
So I was just wondering.

Thanks

You can define it as a class function, but usually it’s defined outside of your model so that it can be reused by other models.
Make sure to add another condition and init for the linear layers :wink:
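
For illustration, a minimal sketch of what that could look like, assuming the MyModel class from above (the combined linear-layer branch and the self.apply variant are illustrative additions, not taken from the replies above):

import torch.nn as nn

def weights_init(m):
    # handle both layer types; kept outside the model so other models can reuse it
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)

model = MyModel()
model.apply(weights_init)

If you really want to keep everything inside the class, you could instead call self.apply(weights_init) as the last line of __init__, so the model initializes itself on construction.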


I notice that the model uses ReLU activations and therefore plain Xavier initialisation will be suboptimal for the main weights.

A better solution would be to supply the correct gain parameter for the activation.
https://pytorch.org/docs/stable/nn.html#torch.nn.init.calculate_gain

nn.init.xavier_uniform_(m.weight, gain=nn.init.calculate_gain('relu'))

With a ReLU activation this almost gives you the Kaiming initialisation scheme: Kaiming uses either fan_in or fan_out, while Xavier uses the average of fan_in and fan_out.

For the biases though, always initialising them to zero seems like a solid default choice.
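
To make the fan_in/fan_out difference concrete, here is a small sketch comparing the two uniform bounds; the layer shape is only an example (borrowed from fc1 above), not part of the reply:

import math

fan_in, fan_out = 400, 120           # e.g. fc1 above: 16 * 5 * 5 -> 120
gain = math.sqrt(2.0)                # nn.init.calculate_gain('relu')

# xavier_uniform_: bound = gain * sqrt(6 / (fan_in + fan_out))
xavier_bound = gain * math.sqrt(6.0 / (fan_in + fan_out))

# kaiming_uniform_ (fan_in mode): bound = gain * sqrt(3 / fan_in)
kaiming_bound = gain * math.sqrt(3.0 / fan_in)

print(xavier_bound, kaiming_bound)   # close but not identical, since (fan_in + fan_out) / 2 != fan_in here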


After doing this, do I also have to call torch.cuda.manual_seed to fix the weights?

Setting the seed before initializing the parameters will make sure to use the same pseudo-random values the next time you are executing the script.
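
For example (a sketch; weights_init is the function defined earlier in the thread):

torch.manual_seed(0)        # seed the CPU RNG
torch.cuda.manual_seed(0)   # also seed the GPU RNG, if parameters are created on the GPU

model = MyModel()           # the default layer init draws from the seeded RNG
model.apply(weights_init)   # and so does the custom init, so re-running gives the same weights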

I'm not sure what you mean by "fix the weights". Could you explain it a bit?

Hi Ptr,
The network I am using has Dropout and Pooling layers, so when I use weight initialization as follows:

def weights_initnot(m):
    # this call happens before any type check, so it fails for modules
    # that have no .weight attribute (Dropout, MaxPool, ...)
    xavier = torch.nn.init.xavier_uniform_(m.weight.data, init.calculate_gain('relu'))
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        xavier(m.weight.data)
        # print('come xavier')
        # xavier(m.bias.data)
    elif classname.find('BatchNorm') != -1:
        m.weight.data.normal_(1.0, 0.02)
        m.bias.data.fill_(0)
    elif classname.find('Linear') != -1:
        m.weight.data.normal_(0.0, 0.02)
        m.bias.data.fill_(0)
    # elif classname.find('Dropout') != -1:

I get errors saying there is no weight attribute in the Dropout or Pool class, which makes sense.

Then I used the following weight init based on isinstance. It more or less solved the problem, because with the same seed I get the same weights at a given index of a given layer.

seed = 1
torch.cuda.manual_seed(seed)


def weights_init(m):
    if isinstance(m, nn.Conv3d):
        torch.nn.init.xavier_uniform_(m.weight.data, torch.nn.init.calculate_gain('relu'))
        # torch.nn.init.xavier_uniform_(m.bias.data)
    elif isinstance(m, nn.BatchNorm3d):
        m.weight.data.normal_(mean=1.0, std=0.02)
        m.bias.data.fill_(0)
    elif isinstance(m, nn.Linear):
        m.weight.data.normal_(0.0, 0.02)
        m.bias.data.fill_(0)


Net = ResNet3D().to(device)
Net.apply(weights_init)

On another note:
Do seed_torch and manual_seed work the same way? I have seen some code use seed_torch.

Seeking an explanation: I was talking to one of my friends about the purpose of a fixed weight initialization. He said it has no big impact, because the data we feed in is shuffled randomly by the DataLoader anyway, and since we save the best-performing weights, fixing the initial weights is not a big deal.

What do you think about it?

And why are you using the uniform distribution, please?

xavier_uniform_ was just used as an example for no particular reason.
You can of course use whatever nn.init method fits your use case best. :wink:
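
For instance, any of these would be a valid drop-in inside the weights_init condition above (purely illustrative alternatives):

nn.init.xavier_normal_(m.weight)                          # Xavier, normal distribution
nn.init.kaiming_uniform_(m.weight, nonlinearity='relu')   # Kaiming, uniform distribution
nn.init.kaiming_normal_(m.weight, nonlinearity='relu')    # Kaiming, normal distribution
nn.init.normal_(m.weight, mean=0.0, std=0.02)             # plain Gaussian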


I want to initialize the weights for every layer (irrespective of the initialization method) using a constant seed value. How exactly is this done in PyTorch?

If you want to set the same seed before each initialization, you could add torch.manual_seed(SEED) to the weight_init method (before each torch.nn.init call).

I want each linear layer's weights/biases to be initialized with the same constant values. The following is the weight_init() method written the way you suggested:

def weight_init(m):
    if isinstance(m, torch.nn.Linear):
        torch.manual_seed(786)
        torch.nn.init.xavier_uniform_(m.weight.data)
        torch.nn.init.xavier_uniform_(m.bias.data)  # this line raises the ValueError below

Applying the above to the linear layers in the classifier part of VGG19:

model = VGG19()
model.apply(weight_init)

The following error is generated each time I call model.apply(weight_init):

ValueError: Fan in and fan out can not be computed for tensor with fewer than 2 dimensions

Where am I making a mistake?

You cannot apply xavier_uniform_ to the bias, as the tensor needs more than a single dimension.

Initializing the bias values differently will ruin my experiment. Is there a workaround other than initializing them with 0’s/1’s?

torch.nn.init.zeros_ or torch.nn.init.ones_ should work.
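
Putting the pieces together, the weight_init from above could then look like this (a sketch; only the bias line changes, and the seed value 786 is carried over from the snippet above):

def weight_init(m):
    if isinstance(m, torch.nn.Linear):
        torch.manual_seed(786)
        torch.nn.init.xavier_uniform_(m.weight)
        torch.nn.init.zeros_(m.bias)  # bias is 1-D, so use zeros_/ones_/constant_ instead of Xavier

model = VGG19()
model.apply(weight_init)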