I read the link you pasted above, but it doesn’t answer my question:
What is the difference between a parameter, a tensor, and a variable? Is there any doc about that?
The short answer is: Variables are deprecated; nn.Parameters require gradients by default and are automatically registered inside nn.Modules (so that they are returned in the module’s state_dict and pushed to the device via model.to()); tensors are just array-like objects which can optionally require gradients.
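A minimal sketch of those differences (the module and attribute names are made up for illustration):

```python
import torch
import torch.nn as nn

t = torch.randn(2, 2)  # plain tensor: requires_grad is False by default

class MyModule(nn.Module):  # hypothetical module, just for illustration
    def __init__(self):
        super().__init__()
        # nn.Parameter: requires_grad=True by default and automatically
        # registered when assigned as a module attribute
        self.weight = nn.Parameter(torch.randn(2, 2))
        # plain tensor attribute: NOT registered as a parameter
        self.plain = torch.randn(2, 2)

m = MyModule()
print(t.requires_grad)              # False
print(m.weight.requires_grad)       # True
print(list(m.state_dict().keys()))  # ['weight'] - the plain tensor is missing
```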
In a customized layer, a parameter should not be re-assigned via =.
During model loading and initialization, if I use model.named_parameters() to get each parameter, could I use “=” to re-initialize it?
The same reasoning applies here as well:
You can re-assign new parameters, but the internal parameter (assigned to self.param) will be a new object. If you have already passed model.parameters() to an optimizer etc., this new parameter will not be trained.
It depends on your use case whether that’s a concern or not.
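A small sketch of that concern, assuming a plain nn.Linear and SGD just as stand-ins:

```python
import torch
import torch.nn as nn

model = nn.Linear(2, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Re-assigning the attribute creates a brand-new Parameter object ...
model.weight = nn.Parameter(torch.zeros(2, 2))

# ... while the optimizer still holds references to the old objects
old_ids = {id(p) for group in optimizer.param_groups for p in group['params']}
print(id(model.weight) in old_ids)  # False -> the new weight won't be updated

# one way to fix it: recreate the optimizer after the re-assignment
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
```

Recreating the optimizer (or adding the new parameter via add_param_group) makes sure the re-assigned parameter is actually updated during training.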
In summary:
If a parameter is re-assigned inside or outside the layer, it will no longer be a parameter but a tensor, and will not be optimized.
No, this is not correct. The new parameter will still be a parameter, but since it’s a new object you have to be careful wherever the old object is still referenced (e.g. inside the optimizer) and would thus need to update it there as well.
The critical point is that the reassignment creates a new parameter (object), so you should check whether any other changes are needed (e.g. adding it to the optimizer).
The code above will generate a new parameter self.mask_weight, so I need to consider whether the new self.mask_weight is added to the optimizer. Is my understanding correct?
Once I do self.mask_weight * temp, the code returns a tensor, not a parameter. If I want it to be a parameter again, I should do self.mask_weight = nn.Parameter(self.mask_weight * temp). Right?
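A minimal check of that behavior, using mask_weight and temp as stand-in names outside a module:

```python
import torch
import torch.nn as nn

mask_weight = nn.Parameter(torch.ones(3))  # stand-in for self.mask_weight
temp = 0.5                                 # stand-in scaling factor

scaled = mask_weight * temp
print(isinstance(scaled, nn.Parameter))  # False: the product is a plain tensor

# Wrapping the result restores a Parameter. Note it is again a *new*
# object (so the optimizer caveat above applies), and nn.Parameter
# detaches the input from the autograd graph, making it a new leaf.
mask_weight = nn.Parameter(mask_weight * temp)
print(isinstance(mask_weight, nn.Parameter))  # True
print(mask_weight.is_leaf)                    # True
```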