In my CNN model there is an MLP with 4 fully connected layers, the last of which has just one node. I call this output `alpha`; it has 2 dimensions, as in this sample output: `[[0.1023]]`. I make sure that its minimum value never goes below 0.1.
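A minimal sketch of the head described above, assuming hidden-layer widths and input size (those are not given in the question), with the output clamped so it never drops below 0.1:

```python
import torch
import torch.nn as nn

class AlphaHead(nn.Module):
    """Hypothetical 4-layer MLP head whose last layer has one node.
    Widths (128 -> 64 -> 32 -> 16 -> 1) are illustrative assumptions."""
    def __init__(self, in_features=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_features, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 16), nn.ReLU(),
            nn.Linear(16, 1),
        )

    def forward(self, feats):
        alpha = self.mlp(feats)
        # One way to enforce the floor mentioned above: clamp at 0.1.
        return alpha.clamp(min=0.1)

head = AlphaHead()
alpha = head(torch.randn(1, 128))
print(alpha.shape)  # torch.Size([1, 1]), e.g. [[0.1023]]
```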
I now need to raise some other PyTorch tensor `x` to the power `alpha`. Both are of `torch.float` type. The tensor `x` has the usual 4 dimensions of batch x channel x height x width.
So I have tried the following:

- `x**alpha`: This causes NaN just after the first iteration. I tried several other variants such as `x.pow(alpha)`, but all had the same problem. How do I solve this?
- `x*alpha`: This runs nicely; I did not encounter any problem even after 100,000 iterations. So why is raising to a power problematic while simple multiplication is fine?
- Interestingly, with `x.pow(alpha.item())` the NaN problem goes away, but is this the correct way to do it? Does it cause a problem in backpropagation?
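For context, a likely source of the NaN (an assumption, since `x` is not shown in the question): raising a negative number to a fractional power is undefined over the reals, so `x**alpha` returns NaN wherever `x` is negative, while `x*alpha` stays finite everywhere. A tiny reproduction:

```python
import torch

# Illustrative values: alpha as in the question, x with one
# negative and one positive entry.
alpha = torch.tensor([[0.1023]])
x = torch.tensor([[-0.5, 0.5]])

powered = x ** alpha      # NaN for the negative entry
scaled = x * alpha        # finite for both entries

print(powered)
print(scaled)
print(torch.isnan(powered))
```

If `x` can be negative or zero, the power has to be applied to something guaranteed positive for the gradient to stay finite.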
I use a very small learning rate of `1e-4` with the Adam optimiser. I am using PyTorch 1.3.1.
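On the `alpha.item()` variant: `.item()` returns a plain Python float, which detaches the exponent from the autograd graph. That plausibly explains why the NaN disappears (the gradient of `x**a` with respect to `a` involves `log(x)`, which is NaN or infinite for non-positive `x`), but it also means `alpha` never receives a gradient from this operation, so the MLP producing it would not train through it. A sketch (toy values, not the model above):

```python
import torch

# Both tensors are leaves; only x stays in the graph once
# alpha.item() is used as the exponent.
alpha = torch.tensor([[0.5]], requires_grad=True)
x = torch.tensor([[2.0, 3.0]], requires_grad=True)

y = x.pow(alpha.item()).sum()  # exponent is a detached Python float
y.backward()

print(x.grad)      # populated: gradient flows to x
print(alpha.grad)  # None: no gradient ever reaches alpha
```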