Hello, I’d like to know how to properly implement a custom initialization scheme. For the moment, I’m doing it as below, but from PRs I gather that using the .data
property isn’t recommended.
```python
import torch
import torch.nn as nn

class MyModule(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.my_param = nn.Parameter(torch.zeros(hidden_dim))
        self.reset_parameters()

    def reset_parameters(self) -> None:
        u = torch.rand(self.hidden_dim) * (1 - 2 / self.hidden_dim) + 1 / self.hidden_dim
        self.my_param.data = -(1 / u - 1).log()
```
As you can see, the scheme is a bit involved: it draws uniform samples and maps them through the inverse of the sigmoid (the logit). So there is no built-in initializer that can directly modify the tensor in place, the way one would for a plain uniform init:
```python
with torch.no_grad():
    my_tensor.uniform_(a, b)
```
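For what it's worth, I suppose the same `no_grad` pattern could be chained with in-place ops to express my scheme without touching `.data` — a sketch below (the `hidden_dim` value is just for illustration), though I'm not sure whether this is the recommended approach:

```python
import torch
import torch.nn as nn

hidden_dim = 8
my_param = nn.Parameter(torch.zeros(hidden_dim))

with torch.no_grad():
    # Sample u ~ Uniform(1/h, 1 - 1/h), then apply the logit
    # -(1/u - 1).log() via chained in-place ops.
    my_param.uniform_(1 / hidden_dim, 1 - 1 / hidden_dim)
    my_param.reciprocal_().sub_(1).log_().neg_()
```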
Could you tell me the recommended way to do this?