I will post my simplified code first:
import torch
import torch.nn.functional as F
import numpy as np

def argmax(x, axis=-1):
    # Hard one-hot of the row-wise argmax (non-differentiable, so no
    # gradient flows through the policy choice itself).
    return F.one_hot(torch.argmax(x, dim=axis), x.shape[axis]).float()

loss_function = torch.nn.MSELoss()

def optimization(V, U, E, beta, learning_rate):
    V = V.cuda()
    U = U.cuda()
    E = E.cuda()
    V_optim = V.detach().clone()
    V_optim.requires_grad = True
    optimizer = torch.optim.Adam([V_optim], lr=learning_rate)
    num_epochs = 200
    for epoch in range(num_epochs):
        optimizer.zero_grad()
        # Greedy policy implied by the current value guess.
        Q_R = argmax(U + beta * torch.transpose(V_optim, 0, 1))
        # Mean squared Bellman residual: V should equal the value of acting greedily.
        loss = loss_function(V_optim, torch.matmul(Q_R * U, E) + beta * torch.matmul(Q_R, V_optim))
        loss.backward()  # retain_graph=True is unnecessary; the graph is rebuilt every epoch
        optimizer.step()
        print(f"Epoch {epoch}, Loss: {loss.item()}")
    return V_optim

if __name__ == "__main__":
    n = 100
    beta = 0.98
    alpha = 0.03
    delta = 1
    # Steady-state capital stock of the deterministic growth model.
    kss = ((1 / beta - (1 - delta)) / alpha) ** (1 / (alpha - 1))
    k = np.linspace(0.5 * kss, 1.4 * kss, n)
    k_reshaped = k.reshape(-1, 1)
    # Log consumption for every (k, k') pair; clip negative consumption
    # to a tiny positive number before taking logs.
    c = k_reshaped ** alpha + (1 - delta) * k_reshaped - k
    c[c < 0] = 1e-11
    c = np.log(c)
    # Initial guess: the steady-state value, replicated across the grid.
    V = (np.log(kss ** alpha - delta * kss) / (1 - beta)) * torch.ones(n, 1)
    U = torch.tensor(c, dtype=torch.float32)
    E = torch.ones(n, 1)
    learning_rate = 0.002
    optimal_V = optimization(V, U, E, beta, learning_rate)
    print(optimal_V)
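For context on what is being optimized: the loop is a gradient-based search for the fixed point of the discrete Bellman operator of the deterministic growth model (log utility; delta = 1 means full depreciation), i.e.

$$V(k_i) = \max_{j}\Big[\ln\big(k_i^{\alpha} + (1-\delta)k_i - k_j\big) + \beta V(k_j)\Big],$$

and the reported loss is the mean squared Bellman residual $\frac{1}{n}\sum_{i}\big(V_i - (TV)_i\big)^2$ over the $n = 100$ grid points.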
With the code above, the (mean) loss after 200 iterations is

Epoch 199, Loss: 3.335637765999877e-09

but the largest single element-wise error in the final iteration is

tensor(0.0006, device='cuda:0', grad_fn=<MaxBackward1>)

In other words, the error seems to be concentrated in a few grid points: 0.0006^2 ≈ 3.6e-7, which is on the order of n × MSE ≈ 3.3e-7 for n = 100.
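The maximum above comes from a check along these lines (a sketch of what I run after optimization; the names match the code above):

# Element-wise Bellman residual at the returned value function.
U_gpu, E_gpu = U.cuda(), E.cuda()
Q_R = argmax(U_gpu + beta * optimal_V.t())
T_V = torch.matmul(Q_R * U_gpu, E_gpu) + beta * torch.matmul(Q_R, optimal_V)
print((optimal_V - T_V).abs().max())  # worst single grid point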
Based on this simple exercise, I wonder:

(1) Is there any way to reduce the loss further, in particular the worst single-element error?

(2) Is this a problem with the choice of loss function, or with the format/precision of the inputs? (A concrete, untested example of what I mean follows below.)
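For example, one direction I can imagine (an untested sketch, reusing V, U, E, beta, learning_rate and argmax() from the code above) is to switch to double precision and add an explicit penalty on the worst element:

# Untested sketch for (2): float64 inputs plus an L-infinity-style penalty.
V_optim = V.double().cuda().detach().clone().requires_grad_(True)
U64, E64 = U.double().cuda(), E.double().cuda()
optimizer = torch.optim.Adam([V_optim], lr=learning_rate)
for epoch in range(200):
    optimizer.zero_grad()
    Q_R = argmax(U64 + beta * V_optim.t()).double()
    residual = V_optim - (torch.matmul(Q_R * U64, E64) + beta * torch.matmul(Q_R, V_optim))
    loss = (residual ** 2).mean() + (residual ** 2).max()  # MSE plus worst-case term
    loss.backward()
    optimizer.step()

Would something like this be the right direction?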
Many thanks.