Hello,
In my CNN model there is a MLP which has 4 fully connected layers with last one having just one node. This output I call alpha
and has 2 dimensions as in this sample output of alpha
[[0.1023]]
. I make sure that minimum values never goes below 0.1 using alpha=torch.clamp(alpha,min=0.1,max=2)
I need now need to manipulate some other Pytorch tensor x
with alpha
. Both are in torch.float
type. The tensor x
has the usual 4 dimensions of batch x channel x height x width.
So I have tried the following:-
-
x**alpha[0][0]
: This however causes Nan just after the first iteration for bothalpha
andx
. I tried several other variantsx**alpha
,x.pow(alpha[0][0])
andx.pow(alpha)
. But all had the same problem. How to solve this? -
x*alpha[0][0]
: This however runs nicely I did not encounter any problem even after 100,000 iterations. So why raising to power is problematic while simple multiplication is alright? - Interestingly for
x.pow(alpha.item())
Nan problem goes away but is this the correct way to do? Does this cause problem in backpropagation?
I use a very small learning rate of 1e-4
with Adam Optimiser. I am using Pytorch 1.3.1.
Thankyou verymuch