How to initialize the parameters of a customized layer?

Hi,

The following code is my customized layer.
In one of my networks I use both the customized layer MSConv2d and torch.nn.Conv2d.

At the moment I would like to use torch.nn.Conv2d.weight to initialize self.weight of my MSConv2d. How can I do this?

As I asked in another topic (Nn.parameter re-define question):
can a parameter of a customized layer be re-assigned?

Is there any documentation about the differences among Parameter, Tensor, and Variable in PyTorch?

import os
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.modules.utils import _pair
from torch.nn.common_types import _size_2_t

class MSConv2d(nn.modules.conv._ConvNd):
    def __init__(self, in_channels: int, out_channels: int, kernel_size: _size_2_t, stride: _size_2_t = 1) -> None:
        kernel_size = _pair(kernel_size)
        stride = _pair(stride)
        # _ConvNd expects the full argument list, so padding, dilation, transposed,
        # output_padding, groups, bias and padding_mode are passed explicitly
        super(MSConv2d, self).__init__(in_channels, out_channels, kernel_size, stride,
                                       _pair(0), _pair(1), False, _pair(0), 1, False, 'zeros')

        # replace the weight created by _ConvNd with custom parameters;
        # gamma broadcasts against weight (one scale per output filter and kernel position)
        self.weight = nn.Parameter(torch.empty(out_channels, in_channels, *self.kernel_size))
        self.gamma = nn.Parameter(torch.empty(out_channels, 1, *self.kernel_size))
        # nn.init.xavier_normal_(self.weight)
        # nn.init.constant_(self.gamma, 1)

    def forward(self, x):
        # self.weight = self.weight*self.gamma  # re-assigning the parameter is discussed below
        ret = F.conv2d(x, self.weight * self.gamma, stride=self.stride)
        return ret


You could assign the values via:

with torch.no_grad():
    self.weight.copy_(conv_layer.weight)
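
For example (a minimal sketch using the MSConv2d class above; the channel sizes and the name ref_conv are just illustrative):

ref_conv = nn.Conv2d(3, 16, kernel_size=3, bias=False)  # layer providing the weights
ms_conv = MSConv2d(3, 16, kernel_size=3)                # custom layer with a matching weight shape

with torch.no_grad():
    # copy the values without tracking the operation in autograd
    ms_conv.weight.copy_(ref_conv.weight)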

I think this tutorial could be helpful.

@ptrblck ,
Thank you!

This code solves my issue. Thank you!

I read the link you posted above, but it doesn't answer my question:
what is the difference between Parameter, Tensor, and Variable? Is there any documentation about that?

@ptrblck , could you please kindly answer the question I asked in the linked topic: Nn.parameter re-define question

The short answer is: Variables are deprecated. nn.Parameters require gradients and are automatically registered inside nn.Modules (so that they are returned in their state_dict and pushed to the device via model.to()). Tensors are just array-like objects which can require gradients.
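
For example (a tiny sketch; the Demo module is just for illustration):

import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        self.p = nn.Parameter(torch.ones(2))  # registered automatically by nn.Module
        self.t = torch.ones(2)                # plain tensor attribute, not registered

m = Demo()
print(list(m.state_dict().keys()))           # ['p'] -> only the parameter is returned
print(m.p.requires_grad, m.t.requires_grad)  # True False
# m.to(device) would move the registered parameter m.p; the plain attribute m.t is not touched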

@ptrblck
Many thanks!
Your answer confirmed my understanding.
A parameter:

  1. will be added to state_dict() and named_parameters()
  2. will be pushed to the device via model.to(). Will a tensor also be pushed to the device via model.to(), or do you mean that a tensor has to be moved manually with tensor.to()?

I also have a few questions about parameter assignment, could you please kindly answer them?

  1. With self.mask = self.mask_weight*temp, do self.mask_weight and self.mask share the same memory? Will they be updated in sync?
  2. Why am I not able to re-assign the parameter with self.mask_weight = self.mask_weight * temp?
  3. Does that mean an nn.Parameter can only be defined once and cannot be re-defined?

class SMConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, padding=1, stride=1):
        super(SMConv2d, self).__init__()

        self.padding = padding
        self.stride = stride
        self.mask_weight = nn.Parameter(torch.Tensor(out_channels, in_channels, kernel_size, kernel_size))
        nn.init.constant_(self.mask_weight, 1)

    def compute_mask(self, temp):
        self.mask = self.mask_weight * temp
        # self.mask_weight = self.mask_weight * temp

        return self.mask

    def forward(self, x, temp=1):
        masked_weight = self.compute_mask(temp)
        out = F.conv2d(x, masked_weight, stride=self.stride, padding=self.padding)
        return out

@ptrblck ,

self.mask_weight = self.mask_weight * temp

Could a parameter of a customized layer be re-assigned like in the code above?

You can reassign parameters, but these will be new parameters and thus not updated anymore.
If you want to manipulate the parameter, use:

with torch.no_grad():
    self.mask_weight.copy_(...)

instead.
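
For example, for the temp scaling in your compute_mask (a sketch, assuming temp is a plain Python scalar):

with torch.no_grad():
    # manipulate the existing parameter in place; the parameter object (and any
    # reference the optimizer holds to it) stays the same
    self.mask_weight.copy_(self.mask_weight * temp)
    # or equivalently: self.mask_weight.mul_(temp)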

@ptrblck ,

Thank you!

So in a customized layer, a parameter should not be re-assigned with =.
During model loading and initialization, if I use model.named_parameters() to get each parameter, could I use "=" to re-initialize it?

The same reasoning applies here as well:
you can re-assign new parameters, but the internal parameter (assigned to self.param) is then a new object, and if you have already passed model.parameters() to the optimizer etc., this new parameter will not be trained.
It depends on your use case if that’s a concern or not.
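
For example, with the SMConv2d layer from above (a small sketch; the sizes and learning rate are arbitrary):

model = SMConv2d(3, 8, kernel_size=3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

old_param = model.mask_weight
model.mask_weight = nn.Parameter(old_param * 2.0)  # re-assignment creates a new object

# the optimizer still references the old object, so the new parameter won't be updated
print(any(p is model.mask_weight for group in optimizer.param_groups
          for p in group['params']))  # False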

@ptrblck ,
Thank you!

To summarize:
if a parameter is re-assigned inside or outside the layer, it will no longer be a parameter but a tensor, and it will not be optimized.

No, this is not correct. The new parameter will still be a parameter, but since it's a new object you have to be careful wherever the old object was already used (e.g. via a reference in the optimizer) and would thus need to update it.
The critical point is that the re-assignment creates a new parameter (object), so you should check whether any other changes are needed (e.g. adding it to the optimizer).
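
For example (a sketch reusing the SMConv2d layer from above; the numbers are arbitrary):

layer = SMConv2d(3, 8, kernel_size=3)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)

layer.mask_weight = nn.Parameter(layer.mask_weight * 3.0)  # new parameter object
print(isinstance(layer.mask_weight, nn.Parameter))         # True: still a parameter
print('mask_weight' in dict(layer.named_parameters()))     # True: still registered in the module

# the optimizer, however, still holds the old object, so add the new one if it should be trained
optimizer.add_param_group({'params': [layer.mask_weight]})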

@ptrblck , Thank you

I tried the following line in the SMConv2d class posted above (uncommenting it in compute_mask):

self.mask_weight = self.mask_weight * temp  # here, temp is a value, e.g. a float temp=3.0

For this code I got the error below:

cannot assign 'torch.cuda.FloatTensor' as parameter 'mask_weight' (torch.nn.Parameter or None expected)

As described in the error message, you would need to assign a new nn.Parameter to self.mask_weight, if you really want to replace it:

self.mask_weight = nn.Parameter(self.mask_weight * temp)

@ptrblck ,

Thank you!

The code above will generate a new parameter self.mask_weight, so I need to consider whether the new self.mask_weight is added to the optimizer. Is my understanding correct?

Once I compute self.mask_weight * temp, the result is a tensor, not a parameter. If I want to assign it back as a parameter, I should do self.mask_weight = nn.Parameter(self.mask_weight * temp). Right?