You could save the state_dict and load it for resetting the model. Have a look at the Serialization Semantics to see how to do it.
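Something along these lines should work; a minimal sketch, where the model and the filename are just placeholders:

import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # placeholder model

# save the initial state_dict once, right after construction
torch.save(model.state_dict(), 'init_state.pt')

# ... training / weight updates happen here ...

# later: restore the saved weights to reset the model
model.load_state_dict(torch.load('init_state.pt'))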
Would this work for you or do you want to re-initialize it to random weights?
Could I double check whether the snippet below is a robust solution? It is working, but it seems useful to corroborate. I am looking for a way to clear all weights between iterations of a hyper-parameter search, where I run the individual models as sub-processes.
import os

chk_dir = '/root/.cache/torch/hub/checkpoints/'
if os.path.isdir(chk_dir):
    # delete every cached checkpoint so the next run starts from fresh weights
    for chkpnt in os.scandir(chk_dir):
        print(f"rm'ing {chkpnt.path}")
        os.system(f'rm {chkpnt.path}')
@ptrblck @Brando_Miranda I was trying to reset the weights and then assign a new tensor as the weights of a particular layer. I just had one doubt regarding the above discussion: does reset_parameters() also clear all the memory that the layer occupies?
.reset_parameters() will reset the parameters inplace, such that the actual parameters are the same objects and only their values are re-initialized (so no memory is freed or newly allocated).
This would allow you to use the same optimizer etc. in case you’ve already passed the parameters to it.
If you are creating a new module, you would of course also reset the parameters, but these parameters are new objects which you might need to pass to an optimizer again (depending on your actual use case).
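A small sketch to illustrate this; the layer is just an example:

import torch
import torch.nn as nn

lin = nn.Linear(4, 2)
optimizer = torch.optim.SGD(lin.parameters(), lr=0.1)

weight_ref = lin.weight      # keep a reference to the parameter object
lin.reset_parameters()       # re-initializes weight and bias inplace

print(weight_ref is lin.weight)  # True: same object, new values
# the optimizer therefore still points to the valid parameters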
def reset_model_weights(layer):
    if hasattr(layer, 'reset_parameters'):
        # the layer defines its own re-initialization, so call it directly
        layer.reset_parameters()
    else:
        # otherwise recurse into the child modules, if any
        if hasattr(layer, 'children'):
            for child in layer.children():
                reset_model_weights(child)
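For example, calling it on the top-level module recurses through all submodules (the model below is only a placeholder):

import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
reset_model_weights(model)  # nn.Sequential has no reset_parameters, so it recurses into the children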
Is it possible to apply Xavier (or a similar initialization) to only a subset of weights in a layer? For example, I want to randomly re-initialize 20% of the weights while keeping the rest unchanged. Is there a standard way to do this in PyTorch, or do you have any suggestions?
You can apply a random boolean mask to any tensor in PyTorch, so that the update only takes effect at the positions where the mask is True.
import torch
import torch.nn as nn
import torch.nn.init as init

def partial_xavier_init(model: nn.Module, percent: float):
    """
    Applies Xavier uniform initialization to a random subset (percent) of weights in each layer.

    Args:
    - model: The PyTorch model.
    - percent: Fraction of weights to reinitialize (0.0 to 1.0).
    """
    for name, param in model.named_parameters():
        # Apply only to weight parameters; Xavier needs at least 2D tensors
        if 'weight' in name and param.dim() >= 2:
            # Create a tensor with Xavier initialization
            new_param = torch.empty_like(param)
            init.xavier_uniform_(new_param)
            # Create a random mask selecting roughly `percent` of the entries
            mask = torch.rand_like(param) < percent
            # Apply the masked update without tracking gradients
            with torch.no_grad():
                param[mask] = new_param[mask]
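Usage would then look something like this, reusing the imports from the snippet above (the model is just a placeholder):

model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 4))
partial_xavier_init(model, percent=0.2)  # re-initializes roughly 20% of each weight matrix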