@tom Hi Thomas,
one strange question: is there in PyTorch a function that calculates the square root of the sum of the squares of the elements of a vector / matrix?
Thank you very much in advance.
Cheers.
Ergnoor
Either you spell it out yourself or you could use torch.norm.
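For example, both routes give the same result for a simple vector (a quick illustration, not from the original post):

```python
import torch

x = torch.tensor([3.0, 4.0])

# spelled out: square root of the sum of squares
manual = torch.sqrt(torch.sum(x ** 2))

# or with the built-in (the default 2-norm / Frobenius norm)
built_in = torch.norm(x)

print(manual.item(), built_in.item())  # 5.0 5.0
```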
Best regards
Thomas
@tom Hi Thomas, two more questions:
I know that optimizers have a weight_decay parameter, which by default is 0, and that an L1 penalty can be added to the loss manually, like this:
l1_crit = nn.L1Loss(size_average=False)
reg_loss = 0
for param in model.parameters():
    reg_loss += l1_crit(param, torch.zeros_like(param))
factor = 0.0005
loss += factor * reg_loss
I also know that if you have to optimize two different types of parameters, let us say for example spectral and basis parameters, with the same optimizer (for example SGD) but with different learning rates, you can use the following:
optim.SGD([
    {'params': basis},
    {'params': spectral, 'lr': 1e-3}
], lr=1e-2, momentum=0.9)
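As a runnable sketch of the same pattern (basis and spectral here are just illustrative tensors, not from any real model):

```python
import torch
import torch.optim as optim

basis = torch.randn(5, requires_grad=True)
spectral = torch.randn(5, requires_grad=True)

opt = optim.SGD([
    {'params': [basis]},                 # uses the default lr given below
    {'params': [spectral], 'lr': 1e-3},  # overrides the default for this group
], lr=1e-2, momentum=0.9)

print([g['lr'] for g in opt.param_groups])  # [0.01, 0.001]
```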
Now my question is: what happens if you have a specific optimizer for some parameters, let us say for orthogonal parameters?
def orthogonality(x):
    '''
    Penalty for deviation from orthogonality:
    ||dot(x.T, x) - I||**2
    '''
    xTx = torch.mm(x.t(), x)
    I = torch.eye(xTx.size(0), dtype=x.dtype, device=x.device)
    return torch.sum((xTx - I) ** 2)
should I use the code given above for L1 to calculate the penalty and then add it to the loss function?
lorth_crit = orthogonality(x)
reg_loss = 0
...
If you have a better, more elegant solution, I would be very pleased to hear it, provided that does not create any problem for you.
Thank you in advance for your time and support.
Cheers.
Ergnoor
Hello Ergnoor,
re 1. There is nothing wrong with matching, even if it isn’t the fastest op in the world; the time spent there is likely dwarfed by the time spent running your network. You could keep a list of them around (global, or in each module) if you preferred that.
re 2. There is nothing special about using l1 or any other loss with only part of your parameters. Just write what seems intuitive to you.
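A minimal sketch of that pattern (the model, loss, and parameter selection here are placeholders, and the penalty follows the orthogonality function quoted above):

```python
import torch
import torch.nn as nn

def orthogonality(x):
    # ||x^T x - I||^2, a penalty for deviation from orthogonality
    xTx = torch.mm(x.t(), x)
    I = torch.eye(xTx.size(0), dtype=x.dtype)
    return torch.sum((xTx - I) ** 2)

model = nn.Linear(4, 4, bias=False)   # stand-in model
pred = model(torch.randn(8, 4))
loss = pred.pow(2).mean()             # stand-in task loss

# add the penalty only for the parameters you care about
factor = 0.0005
for name, param in model.named_parameters():
    if 'weight' in name:              # select the "orthogonal" parameters by name
        loss = loss + factor * orthogonality(param)

loss.backward()
```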
Best regards
Thomas
@tom Hi Thomas and thank you very much for your suggestions.
I would like to have your opinion in regard to the general structure of my code (approx. 100 lines), including some details only for the ‘Penalty’ part of it.
If you are willing to have a look at it, how can I present it to you?
Thank you in advance for your time and understanding.
Cheers.
Ergnoor
Hi Thomas, and hope you are doing alright.
I would like to have your opinion also on the PyTorch version of the geoSGD optimizer I just coded.
As for the general structure of my code mentioned earlier, I forgot to let you know that only roughly 20 lines are the ones I am most interested in having your opinion on; the rest of the code is there to give you a clearer idea of what is going on.
Thank you again for your time and understanding.
Cheers.
Ergnoor
Hello Ergnoor,
As a rule, I don’t do individual reviews of non-published code (on the forums).
If you have a github link, I might take a look. Also, there might be others who have better hints for you, so I’d recommend just posting a link here…
Best regards
Thomas
Hi Thomas,
and thank you very much for your advice and suggestion. I appreciate it very much and I will make use of them.
Cheers.
Ergnoor
One question in regard to the activation function in RNNs.
In the source code for torch.nn.modules.rnn there is this part:
_VF = torch._C._VariableFunctions
_rnn_impls = {
    'LSTM': _VF.lstm,
    'GRU': _VF.gru,
    'RNN_TANH': _VF.rnn_tanh,
    'RNN_RELU': _VF.rnn_relu,
}
Is there any way to add another, new activation function, like:
_VF = torch._C._VariableFunctions
_rnn_impls = {
    'LSTM': _VF.lstm,
    'GRU': _VF.gru,
    'RNN_TANH': _VF.rnn_tanh,
    'RNN_RELU': _VF.rnn_relu,
    'RNN_OPLU': _VF.rnn_oplu,
}
Another question is: where can I find the source code for torch._C._VariableFunctions?
Cheers.
Ergnoor
Hi Ergnoor,
if you want to capture more people’s attention, I’d recommend starting a new forum topic when the focus of your questions moves to a new subject.
It won’t work quite as easily. The function you found dispatches to ATen’s C++ code, which is bound to torch._C._VariableFunctions with a fair amount of “magic” (a while ago I wrote a short guide on how to find the C++ source for a given function).
That said, we’re trying to make sure that using the PyTorch JIT enables you to code RNNs yourself and have them run fast. To this end, you could use the RNN implementations used for benchmarking as inspiration for how to code up your own.
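For instance, a plain-Python Elman-style cell with a pluggable activation might look like this (a sketch, not the actual ATen implementation; the class name and shapes are illustrative):

```python
import torch
import torch.nn as nn

class CustomActRNNCell(nn.Module):
    # a simple Elman-style RNN cell where the activation is pluggable
    def __init__(self, input_size, hidden_size, activation=torch.tanh):
        super().__init__()
        self.weight_ih = nn.Parameter(torch.randn(hidden_size, input_size) * 0.1)
        self.weight_hh = nn.Parameter(torch.randn(hidden_size, hidden_size) * 0.1)
        self.bias = nn.Parameter(torch.zeros(hidden_size))
        self.activation = activation

    def forward(self, x, h):
        # h_new = act(W_ih x + W_hh h + b)
        return self.activation(x @ self.weight_ih.t() + h @ self.weight_hh.t() + self.bias)

cell = CustomActRNNCell(3, 5, activation=torch.relu)  # swap in any activation here
h = torch.zeros(1, 5)
for t in range(4):        # unroll over a length-4 sequence
    h = cell(torch.randn(1, 3), h)
print(h.shape)  # torch.Size([1, 5])
```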
Best regards
Thomas
Hi Thomas,
and thank you very much for your suggestion and help.
I will have a look at the links you provided and, if I have uncertainties, I will ask for help again.
Cheers.
Ergnoor