Hey, I am trying to maximize a Gaussian likelihood in PyTorch, but unfortunately I am getting very unreasonable results.

Here is the code:
import numpy as np
import torch
from torch.autograd import Variable

# x is assumed to be a 1-D tensor of observations, defined earlier
mu = Variable(torch.rand(1), requires_grad=True)
sigma = Variable(torch.rand(1), requires_grad=True)
learning_rate = 0.00002

for t in range(100000):
    # intended negative log-likelihood of the data under N(mu, sigma)
    NLL = -torch.sum(torch.log((1/(np.sqrt(2*3.14)*sigma))*torch.exp((x-mu)**2/(2*sigma**2))))
    NLL.backward()
    if t % 1000 == 0:
        print("loglik =", NLL.data.numpy(), "sigma", sigma.data.numpy(), "mu", mu.data.numpy())
    # manual gradient-descent update on the parameters
    mu.data -= learning_rate * mu.grad.data
    sigma.data -= learning_rate * sigma.grad.data
    # reset gradients so they do not accumulate across iterations
    mu.grad.data.zero_()
    sigma.grad.data.zero_()
I am not sure where this goes wrong. Do you think this can be implemented and solved using techniques like the Adam optimizer?
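For reference, here is roughly what I had in mind for the Adam version. This is only a sketch, not something I have verified: it assumes x is a 1-D tensor of observations (I generate a synthetic x just to make it self-contained), uses torch.distributions.Normal for the log-density instead of my hand-written formula, and optimizes log(sigma) rather than sigma directly.

import torch

# synthetic stand-in for my real data, just so the sketch runs on its own
x = torch.randn(1000) * 2.0 + 3.0  # samples roughly from N(3, 2)

mu = torch.rand(1, requires_grad=True)
log_sigma = torch.rand(1, requires_grad=True)  # parametrize log(sigma) to keep sigma > 0

optimizer = torch.optim.Adam([mu, log_sigma], lr=0.01)

for t in range(5000):
    optimizer.zero_grad()                      # clear gradients from the previous step
    dist = torch.distributions.Normal(mu, log_sigma.exp())
    NLL = -dist.log_prob(x).sum()              # negative log-likelihood of the data
    NLL.backward()                             # backpropagate to mu and log_sigma
    optimizer.step()                           # Adam update
    if t % 500 == 0:
        print("NLL =", NLL.item(), "mu =", mu.item(), "sigma =", log_sigma.exp().item())

The log_sigma parametrization is only there because I worried that plain gradient steps could push sigma negative; I do not know if it is actually necessary here.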