How to initialize the parameters of a customized layer?

Hi,

The following code is my customized layer.
In one of my networks I use both the customized layer MSConv2d and torch.nn.Conv2d.

At the moment I would like to use torch.nn.Conv2d.weight to initialize self.weight of my MSConv2d. How can I do this?

As I asked in another topic (Nn.parameter re-define question):
can a parameter of a customized layer be re-assigned?

Is there any documentation about the differences among Parameter, Tensor, and Variable in PyTorch?

import os
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.modules.utils import _pair
from torch.nn.common_types import _size_2_t

class MSConv2d(nn.modules.conv._ConvNd):
    def __init__(self, in_channels: int, out_channels: int, kernel_size: _size_2_t, stride: _size_2_t = 1) -> None:
        kernel_size = _pair(kernel_size)
        stride = _pair(stride)
        # _ConvNd expects the full argument list, so padding, dilation, transposed,
        # output_padding, groups, bias and padding_mode are passed explicitly
        super(MSConv2d, self).__init__(in_channels, out_channels, kernel_size, stride,
                                       _pair(0), _pair(1), False, _pair(0), 1, False, 'zeros')

        # replace the weight created by _ConvNd with custom parameters;
        # gamma broadcasts against weight (one scale per output filter and kernel position)
        self.weight = nn.Parameter(torch.empty(out_channels, in_channels, *self.kernel_size))
        self.gamma = nn.Parameter(torch.empty(out_channels, 1, *self.kernel_size))
        # nn.init.xavier_normal_(self.weight)
        # nn.init.constant_(self.gamma, 1)

    def forward(self, x):
        # self.weight = self.weight*self.gamma  # re-assigning the parameter is discussed below
        ret = F.conv2d(x, self.weight * self.gamma, stride=self.stride)
        return ret


You could assign the values via:

with torch.no_grad():
    self.weight.copy_(conv_layer.weight)
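
For example (a minimal sketch using the MSConv2d class above; the channel sizes and the name ref_conv are just illustrative):

ref_conv = nn.Conv2d(3, 16, kernel_size=3, bias=False)  # layer providing the weights
ms_conv = MSConv2d(3, 16, kernel_size=3)                # custom layer with a matching weight shape

with torch.no_grad():
    # copy the values without tracking the operation in autograd
    ms_conv.weight.copy_(ref_conv.weight)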

I think this tutorial could be helpful.

@ptrblck ,
Thank you!

This code solves my issue. Thank you!

I read the link you posted above, but it doesn't answer my question:
what is the difference between Parameter, Tensor, and Variable? Is there any documentation about that?

@ptrblck , could you please kindly answer the question I asked in the linked topic: Nn.parameter re-define question

The short answer is: Variables are deprecated. nn.Parameters require gradients and are automatically registered inside nn.Modules (so that they are returned in their state_dict and pushed to the device via model.to()). Tensors are just array-like objects which can require gradients.
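
For example (a tiny sketch; the Demo module is just for illustration):

import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        self.p = nn.Parameter(torch.ones(2))  # registered automatically by nn.Module
        self.t = torch.ones(2)                # plain tensor attribute, not registered

m = Demo()
print(list(m.state_dict().keys()))           # ['p'] -> only the parameter is returned
print(m.p.requires_grad, m.t.requires_grad)  # True False
# m.to(device) would move the registered parameter m.p; the plain attribute m.t is not touched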

@ptrblck
Many thanks!
Your answer confirmed my understanding.
A parameter:

  1. will be added to state_dict() and named_parameters()
  2. will be pushed to the device via model.to(). Will a tensor also be pushed to the device via model.to(), or do you mean that a tensor has to be moved manually with tensor.to()?

I also have a few questions about parameter assignment, could you please kindly answer them?

  1. With self.mask = self.mask_weight*temp, do self.mask_weight and self.mask share the same memory? Will they be updated in sync?
  2. Why am I not able to re-assign the parameter with self.mask_weight = self.mask_weight * temp?
  3. Does that mean an nn.Parameter can only be defined once and cannot be re-defined?

class SMConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, padding=1, stride=1):
        super(SMConv2d, self).__init__()

        self.padding = padding
        self.stride = stride
        self.mask_weight = nn.Parameter(torch.Tensor(out_channels, in_channels, kernel_size, kernel_size))
        nn.init.constant_(self.mask_weight, 1)

    def compute_mask(self, temp):
        self.mask = self.mask_weight * temp
        # self.mask_weight = self.mask_weight * temp

        return self.mask

    def forward(self, x, temp=1):
        masked_weight = self.compute_mask(temp)
        out = F.conv2d(x, masked_weight, stride=self.stride, padding=self.padding)
        return out

@ptrblck ,

self.mask_weight = self.mask_weight * temp

Could a parameter of a customized layer be re-assigned like in the code above?

You can reassign parameters, but these will be new parameters and thus not updated anymore.
If you want to manipulate the parameter, use:

with torch.no_grad():
    self.mask_weight.copy_(...)

instead.
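
For example, for the temp scaling in your compute_mask (a sketch, assuming temp is a plain Python scalar):

with torch.no_grad():
    # manipulate the existing parameter in place; the parameter object (and any
    # reference the optimizer holds to it) stays the same
    self.mask_weight.copy_(self.mask_weight * temp)
    # or equivalently: self.mask_weight.mul_(temp)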

@ptrblck ,

Thank you!

So in a customized layer, a parameter should not be re-assigned with =.
During model loading and initialization, if I use model.named_parameters() to get each parameter, could I use "=" to re-initialize it?

The same reasoning applies here as well:
you can re-assign new parameters, but the internal parameter (assigned to self.param) is then a new object, and if you have already passed model.parameters() to the optimizer etc., this new parameter will not be trained.
It depends on your use case if that’s a concern or not.
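
For example, with the SMConv2d layer from above (a small sketch; the sizes and learning rate are arbitrary):

model = SMConv2d(3, 8, kernel_size=3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

old_param = model.mask_weight
model.mask_weight = nn.Parameter(old_param * 2.0)  # re-assignment creates a new object

# the optimizer still references the old object, so the new parameter won't be updated
print(any(p is model.mask_weight for group in optimizer.param_groups
          for p in group['params']))  # False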

@ptrblck ,
Thank you!

To summarize:
if a parameter is re-assigned inside or outside the layer, it will no longer be a parameter but a tensor, and it will not be optimized.

No, this is not correct. The new parameter will still be a parameter, but since it's a new object you have to be careful wherever the old object was already used (e.g. via a reference in the optimizer) and would thus need to update it.
The critical point is that the re-assignment creates a new parameter (object), so you should check whether any other changes are needed (e.g. adding it to the optimizer).
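
For example (a sketch reusing the SMConv2d layer from above; the numbers are arbitrary):

layer = SMConv2d(3, 8, kernel_size=3)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)

layer.mask_weight = nn.Parameter(layer.mask_weight * 3.0)  # new parameter object
print(isinstance(layer.mask_weight, nn.Parameter))         # True: still a parameter
print('mask_weight' in dict(layer.named_parameters()))     # True: still registered in the module

# the optimizer, however, still holds the old object, so add the new one if it should be trained
optimizer.add_param_group({'params': [layer.mask_weight]})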

@ptrblck , Thank you

I tried the following line in the SMConv2d class posted above (uncommenting it in compute_mask):

self.mask_weight = self.mask_weight * temp  # here, temp is a value, e.g. a float temp=3.0

For this code I got the error below:

cannot assign 'torch.cuda.FloatTensor' as parameter 'mask_weight' (torch.nn.Parameter or None expected)

As described in the error message, you would need to assign a new nn.Parameter to self.mask_weight, if you really want to replace it:

self.mask_weight = nn.Parameter(self.mask_weight * temp)

@ptrblck ,

Thank you!

The code above will generate a new parameter self.mask_weight, so I need to consider whether the new self.mask_weight is added to the optimizer. Is my understanding correct?

Once I compute self.mask_weight * temp, the result is a tensor, not a parameter. If I want to assign it back as a parameter, I should do self.mask_weight = nn.Parameter(self.mask_weight * temp). Right?