Greetings!
Suppose I want to rescale the initialized weight matrix in a certain way like:
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, R):
        super(Net, self).__init__()
        self.lin1 = nn.Linear(2, 100)
        self.lin2 = nn.Linear(100, 3)
        # raises: RuntimeError: a leaf Variable that requires grad
        # is being used in an in-place operation
        self.lin2.weight.mul_(
            (R / torch.sqrt(torch.sum(self.lin2.weight.detach() ** 2, axis=1)))[:, None]
        )
So I want to change the weight of lin2 based on the current values of the weights. Do I need the .detach()?
Also, the code gives the error "a leaf Variable that requires grad is being used in an in-place operation", so mul_ does not seem to work. Can I just reassign the right-hand expression to my weight matrix, i.e. self.lin2.weight = ...?
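(For reference, a minimal standalone sketch of the two options being asked about, the in-place mul_ and the plain reassignment, assuming a bare nn.Linear:)

import torch
import torch.nn as nn

lin = nn.Linear(100, 3)

# Option 1: in-place op on the leaf parameter
# lin.weight.mul_(2.0)
# -> RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.

# Option 2: plain reassignment of a Tensor
# lin.weight = lin.weight.detach() * 2.0
# -> TypeError, because nn.Module only accepts nn.Parameter (or None) for 'weight'

# Wrapping the result in nn.Parameter does work, but it creates a new
# parameter object, so an optimizer built earlier would still hold the old one
lin.weight = nn.Parameter(lin.weight.detach() * 2.0)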
crcrpar (Masaki Kozuki) March 19, 2021, 2:47pm
Hi,
I’d write it in a (maybe verbose) way like
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, R):
        super().__init__()
        self.lin1 = nn.Linear(2, 100)
        self.lin2 = nn.Linear(100, 3)
        with torch.no_grad():
            weight = self.lin2.weight.data.clone().detach()
            weight.mul_((R / torch.sqrt(torch.sum(weight ** 2, axis=1)))[:, None])
            self.lin2.weight.data.copy_(weight)
            del weight

    def forward(self, x):
        return self.lin2(F.relu(self.lin1(x)))
A toy colab is here: https://colab.research.google.com/drive/12yqnYoRF1C_of6s8fokTEqK2h9wl-wvj?usp=sharing
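(A quick sanity check of the rescaling, assuming the Net above and an arbitrary R of 2.0:)

net = Net(R=2.0)
# every row of lin2's weight should now have L2 norm R
print(net.lin2.weight.norm(dim=1))  # approximately tensor([2., 2., 2.])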
Thank you, that would do!
Do I even need .data and .detach()? Usually .detach() alone is enough, isn't it?
crcrpar (Masaki Kozuki) March 19, 2021, 3:19pm
I'm not sure, but I did it that way because I wanted to work on a plain torch.Tensor and avoid sharing storage with the parameter.
ptrblck
Yes, don't use the deprecated .data attribute, as it's dangerous and can yield unwanted side effects!
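(For illustration, a minimal sketch of the kind of silent side effect meant here: .data can modify a value that autograd saved for the backward pass, bypassing the version check that a tracked in-place op would trigger.)

import torch

x = torch.tensor([1.0], requires_grad=True)
y = x.exp()          # autograd saves y itself, since dy/dx = exp(x) = y

y.data.fill_(0.0)    # bypasses autograd: no error, but corrupts the saved value
y.backward()
print(x.grad)        # tensor([0.]) instead of the correct tensor([2.7183])

# Doing the same through a tracked in-place op instead, y.fill_(0.0),
# would make backward() raise an error about an in-place modification.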
Thanks ptrblck.
How would you rewrite crcrpar’s solution, or do you have a better suggestion?
You should be able to remove the .data usage from your code without any additional changes.
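(Putting both replies together, a minimal sketch of the initialization without .data; the in-place mul_ on the leaf parameter is fine as long as it runs under torch.no_grad(). The norm(dim=1, keepdim=True) call is just an equivalent rewrite of the sqrt-of-sum expression above.)

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, R):
        super().__init__()
        self.lin1 = nn.Linear(2, 100)
        self.lin2 = nn.Linear(100, 3)
        with torch.no_grad():
            # rescale each row of lin2's weight to L2 norm R
            self.lin2.weight.mul_(R / self.lin2.weight.norm(dim=1, keepdim=True))

    def forward(self, x):
        return self.lin2(F.relu(self.lin1(x)))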